Data Mining, Data Science, and Analytics
Publications – Oct 2013
LIONbook Chapter 12: Top-down clustering: K-means - Oct 31, 2013.
The LIONbook on machine learning and optimization, written by co-founders of LionSolver software, is provided free for personal and non-profit usage. Chapter 12 looks at Top-down clustering: K-means.
Top KDnuggets tweets, Oct 28-29: The Mathematical Shape of Big Science Data; Great Guide to NoSQL - Oct 30, 2013.
The Mathematical Shape of Big Science Data - new calculus of network analysis; Great read: HP Guide to NoSQL explains CAP theorem, MapReduce, new RDBMS systems; 10 rules for reproducible computational research (and data science); Strata #BigData Conference + Hadoop World 2013 in NYC - watch keynotes live
KDnuggets 13:n26, Big Data effect on Data Science is minor; Online Analytics education; Future of Statistics - Oct 30, 2013.
Latest analytics/data mining news, including Features (8) | Software (2) | Webcasts (2) | Courses, Events (3) | Meetings (3) | Jobs (6) | Academic (2) | Competitions (1) | Publications (2) | Tweets (4) | NewsBriefs (7) | CFP (4).
Top KDnuggets tweets, Oct 23-27: Free Book: Advanced Text Mining; SAS CEO Jim Goodnight says Big Data is hype - Oct 28, 2013.
Free Book: Theory and Applications for Advanced Text Mining; SAS CEO Jim Goodnight says #BigData hype manufactured by analysts and media; Big Data is not enough for better decisions - you need to connect diverse data; 0xdata releases H2O, open-source fast machine learning engine for #BigData
Great Debate: Design vs. Math - Oct 27, 2013.
Math informs; design compels. Which matters more? A well-designed collection of flawed information, or an opaque, hard-to-parse, but unerringly accurate model?
Free Book: Theory and Applications for Advanced Text Mining - Oct 27, 2013.
This book has 9 chapters introducing text mining techniques, including Relation Extraction, ontology learning using WordNet, and automatic compilation of travel information from texts.
Top KDnuggets tweets, Oct 18-22: Automating the Black Art of Deep Learning; Top 10 Ways You Know You're a Data Scientist - Oct 23, 2013.
Automating the Black Art and "Oral traditions" of Deep Learning; Top 10 Ways You Know You're a Data Scientist - very funny; LIONbook Chapter 11: Democracy in machine learning - how to combine different models
Top KDnuggets tweets, Oct 16-17: Data Science Toolkit on AWS Marketplace; How to Interview a Data Scientist - Oct 18, 2013.
Data Science Toolkit on AWS Marketplace; LinkedIn Top Scientist @dtunkelang on How to Interview a Data Scientist; Intel: Applied Data Scientist, Graph Analytics, Big Data Analytics; BBVA Innova Data Mining Challenge, 1st time bank releases anonymized card transaction
LIONbook Chapter 11: Democracy in machine learning - combining models - Oct 18, 2013.
The LIONbook on machine learning and optimization, written by co-founders of LionSolver software, is provided free for personal and non-profit usage. Chapter 11 looks at Democracy in machine learning - how to combine different models in flexible, creative and effective ways.
Top KDnuggets tweets, Oct 14-15: Tutorial: The Naive Bayes Text Classifier; Quantum Computers and Machine Learning - Oct 16, 2013.
Tutorial: The Naive Bayes Text Classifier; How Quantum Computers and Machine Learning Will Revolutionize #BigData; See how easy it is to find patterns in random data; Applied Data Science - free, self-guided online course
KDnuggets 13:n25, 7 Steps for Learning Data Mining; New Poll: Big Data Science?; Cognitive Mining - Oct 16, 2013.
7 Steps for Learning Data Mining, my exclusive interview on Cognitive Mining, and latest analytics/data mining news, including Features (8) | Software (1) | Webcasts (2) | Courses, Events (2) | Meetings (1) | Jobs (2) | Competitions (2) | Publications (4) | Tweets (3) | NewsBriefs (2) | CFP (7).
LIONbook Chapter 10: Statistical Learning Theory and Support Vector Machines (SVM) - Oct 15, 2013.
The LIONbook on machine learning and optimization, written by co-founders of LionSolver software, is provided free for personal and non-profit usage. Chapter 10 looks at Statistical Learning Theory and Support Vector Machines (SVM).
LinkedIn InMaps - Visualize your network - Oct 15, 2013.
InMaps provides a visual representation of your professional LinkedIn universe, and allows you to better understand your professional ties and relationship patterns.
Exclusive: Cognitive Mining, Data Mining, and StatSoft - part 2 - Oct 14, 2013.
Cognitive Mining and Data Mining, the data scientist's mission, Big Data, privacy, and advice for beginning data scientists - Part 2 of the KDnuggets exclusive interview with StatSoft VP Dr. Thomas Hill.
Exclusive: Cognitive Mining and Data Mining - Interview with Dr. Thomas Hill - Oct 14, 2013.
What is the relationship between Cognitive Mining and Data Mining? I discuss this, what makes StatSoft different, achieving user satisfaction, Big Data and Privacy with StatSoft VP Dr. Thomas Hill.
Top KDnuggets tweets, Oct 11-13: Python for machine learning, Data Science; Google new R package for Big Data - Oct 14, 2013.
Python extensions for machine learning and Data Science; Google releases new R package HistogramTools for #BigData; Top news, Oct 6-12: 3 Free Big Data books on Amazon; 7 Steps for Learning Data Mining; Twitter Analytics: A Beginner's Guide
Advent of Predictive Analytics in China - Oct 14, 2013.
The CEO of AsiaAnalytics examines how China's current economic development turns local corporations into the world's largest data guzzlers.
KnowledgeMiner Self-learning Model for Global Warming - Oct 12, 2013.
What drives Global Warming? The KnowledgeMiner Self-learning Model from Apr 2011 is reported to match observations at 73% accuracy, versus 10% for the IPCC forecast.
Top KDnuggets tweets, Oct 9-10: 7 Steps for learning Data Mining and Data Science; 5 Data Science Deadly Sins - Oct 11, 2013.
Gregory Piatetsky outlines 7 Steps for learning Data Mining and Data Science; 5 Data Science Deadly Sins: Cherry Picking, Confirmation Bias, Data Selection Bias ...; Great job for data scientist who loves to travel; New algorithm mines your Twitter stream, finds most significant events
Iterative-Incremental Approach for BI Implementation - Oct 9, 2013.
Despite big investments, BI projects often fail to deliver, and traditional waterfall methods have proven ineffective. The iterative approach proposed here outlines how to break large projects into more manageable pieces, and uses the idea of a "parking lot" of value-adding features.
3 Free Big Data books from O'Reilly on Amazon - Oct 9, 2013.
Free ebooks from O'Reilly Media, available on Amazon, look at the disruptive possibilities of Big Data, emerging architecture, tools, applications, and trends, with a special section on health care.
Top KDnuggets tweets, Oct 7-8: Data Scientists need to be Polyglots; NaSent uses recursive deep learning - Oct 9, 2013.
Data Scientists need to be Polyglots; NaSent, new algorithm from Stanford, uses recursive deep learning; Less is more: How to Simplify & Sexify your Graphs; RAW: A Data Visualization tool
KDnuggets 13:n24, Big Data != ALL Data; Leo Breiman on Two Cultures; KDD-2013 Videolectures - Oct 8, 2013.
Latest analytics/data mining news, including Features (10) | Software (3) | Webcasts (4) | Courses, Events (4) | Meetings (2) | Jobs (7) | Academic (4) | Competitions (1) | Publications (4) | Tweets (5) | NewsBriefs (8) | CFP (6).
Big Data does not mean we have ALL the data - Oct 7, 2013.
Does Big Data imply "You have collected all there is - all the data there is about a phenomenon"? I strongly disagree with this quote from Viktor Mayer-Schonberger and Kenneth Cukier's book on Big Data - here is my letter to the editor.
Top KDnuggets tweets, Oct 4-6: Data science source code snippets; To Hadoop or Not to Hadoop? - Oct 7, 2013.
Sample source code for various data science tasks and projects; To Hadoop or Not to Hadoop? Questions to determine if you need Hadoop; Big Data experts get big salaries - $115K on average; Data Mining reveals the emotional differences in emails written by Men and Women
Circle of Trust and Google Plus - Oct 6, 2013.
Circle of Trust measures how asymmetric your Google+ relationship network is, and we give you ways to visualize it with D3.
Rexer Analytics 2013 Data Miner Survey Highlights - Oct 5, 2013.
The top 5 most used tools were R (used by 70% of data miners), IBM SPSS Statistics, RapidMiner, SAS, and Weka, while STATISTICA, KNIME, SAS JMP, IBM SPSS Modeler, and RapidMiner had the highest satisfaction. Big Data is actually used only in a small fraction of projects.
CIO Review 20 Most Promising Big Data Companies - Oct 5, 2013.
CIO Review selects 20 most promising Big Data companies, from Actian to Zementis, that have achieved significant momentum and will rise above the rest.
To Hadoop or Not to Hadoop? - Oct 4, 2013.
Hadoop is very popular, but is not a solution for all Big Data cases. Here are the questions to ask to determine if Hadoop is right for your problem.
Top KDnuggets tweets, Oct 1-3: Data Science with R: Rattle; KDD 2013 videolectures - Oct 4, 2013.
Data Science with R: Getting Started with Rattle - a survival guide; KDD 2013 videolectures: the top researchers in Data Mining, Data Science; Statistical Modeling: The Two Cultures, by Leo Breiman; Social Media Analytics, free e-book, an overview of theory, applications, and economics
Social Media Analytics, free e-book - Oct 3, 2013.
This book gives an overview of Social Media Analytics techniques, theory and applications, and examines the economic impact of social networks and appropriate analytics methods.
Statistical Modeling: The Two Cultures, by Leo Breiman - Oct 1, 2013.
There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown - read the full paper.
Top KDnuggets tweets, Sep 27-30: Data Mining and Analysis, free book (draft) download; Random Forests Algorithm - why it works - Oct 1, 2013.
New Book: Data Mining and Analysis: Fundamental Concepts and Algorithms, free PDF download; Random Forests Algorithm - what is it, why does it work so well; Penn researchers use Facebook data to predict users' age, gender, personality; Google Hummingbird is a completely new search algorithm and incredibly no one noticed