- Data Mining and Analysis: Fundamental Concepts and Algorithms, free PDF download (draft) - Sep 27, 2013.
A new book by Mohammed Zaki and Wagner Meira Jr. is a great option for teaching a course in data mining or data science. It covers both fundamental and advanced data mining topics, emphasizes the mathematical foundations and the algorithms, includes exercises for each chapter, and provides data, slides, and other supplementary material on the companion website.
- Top KDnuggets tweets, Sep 25-26: ganalytics: Google Analytics with R; Getting a Free Data Science Education - Sep 27, 2013.
ganalytics: Google Analytics with R and tips to newcomers to help succeed in using R; Getting a Free Data Science Education, from Coursera, Caltech and MIT, plus free textbooks; Who says #BigData is not useful? Stanford researchers map user experience, beer preference; Nate Silver on Finding a Mentor, Not Settling in Your Career
- Birthdate of Predictive Analytics - Sep 26, 2013.
When is the Birthdate of Predictive Analytics? Is it Jan 15, 2003, in an SPSS presentation, or earlier?
- Top KDnuggets tweets, Sep 23-24: Machine Learning to create video summaries; Google replaces MySQL with MariaDB - Sep 25, 2013.
Wow! New research uses Machine Learning to provide brief video digest; Google moves to MariaDB to replace MySQL (still open-source, but Oracle-controlled); Yottamine sets #BigData Machine Learning speed record with 640 cores; Facebook chases Google Deep Learning with new research group
- SDSC: Supercomputer Data Mining in San Diego - Sep 24, 2013.
I talk with Natasha Balac, Director of Predictive Analytics at San Diego Supercomputer Center, about supercomputer data mining, Gordon, Hadoop, Data Mining Boot Camps, the distinction between Data Science and Data Mining, Big Data hype, and more.
- KDnuggets 13:n23, Split on Big Data Hype Peak; Cartoon: Next Trend after Big Data - Sep 24, 2013.
Latest analytics/data mining news, including Features (8) | Software (3) | Webcasts (2) | Courses, Events (4) | Meetings (3) | Jobs (10) | Academic (3) | Competitions (1) | Publications (3) | Tweets (6) | NewsBriefs (3) | CFP (7)
- Web Crawling Social Media for Topic Shift Detection and Expert Spotting - Sep 23, 2013.
The best asset of open source software is not its price, but its community. This paper is a case study in analyzing community forums for Topic Shift Detection and Expert Spotting.
- Top KDnuggets tweets, Sep 20-22: Statistics vs Data Mining; Data Science for Business: What You Need to Know - Sep 23, 2013.
Statistics vs Data Mining: Statistics begins after data cleaning is done; Data Science for Business: What You Need to Know; The bursting of the #BigData bubble is imminent, says @mathbabedotorg; A book by Dorian Pyle, Data Preparation for Data Mining, free PDF
- Top KDnuggets tweets, Sep 18-19: Realtime Predictive Analytics w. RabbitMQ, scikit-learn; Peregrine, a fast competitor to Hadoop - Sep 20, 2013.
Advanced Data Science: Realtime Predictive Analytics Using RabbitMQ & scikit-learn; Peregrine, a competitor to Hadoop, is a FAST Map Reduce framework; IEEE ICDM 2013 Data Mining competition (on Kaggle): Expedia Personalized Sort; Import.io gets relevant data from Web pages to Spreadsheets, makes data extraction easy
- Gregory Piatetsky-Shapiro on Big Data Education in Big Data Innovation Magazine - Sep 19, 2013.
I talk to George Hill of Innovation Enterprise about the Big Data skills gap and what companies and universities are doing about it.
- Purple People: conundrum of finding business expertise among Data Scientists - Sep 19, 2013.
Purple is the blend of Red (Business acumen) and Blue (Analytical abilities), and it is a highly desired combination for data scientists.
- LIONbook Chapter 9: Neural networks, shallow and deep - Sep 19, 2013.
The LIONbook on machine learning and optimization, written by co-founders of LionSolver software, is provided free for personal and non-profit usage. Chapter 9 looks at Neural networks, shallow and deep.
- Top KDnuggets tweets, Sep 16-17: Using Twitter Data with R, updated; Skills needed for machine learning jobs - Sep 18, 2013.
Using Twitter Data with R #rstats, updated for API changes; Skills needed for machine learning jobs; Improving #BigData: Hadoop 2.0 will be faster, better support real-time data; WekaMOOC: Data Mining with Weka, online course
- Top KDnuggets tweets, Sep 13-15: Twitter Data Analytics, free download; Microsoft looks for Big Data Science projects to host in Azure - Sep 16, 2013.
Top news, Sep 8-14: Twitter Data Analytics - free book download; Microsoft looks for Big Data Science projects to host in Azure; Markov Models & Predictive Analytics with CATS; Vinod Khosla: Data Science will do more for Medicine than All Biological Science
- Top KDnuggets tweets, Sep 11-12: Big Data University free courses online; "Unstructured" data is NOT really unstructured - Sep 13, 2013.
IBM Big Data University offers many free courses online on Hadoop, Analytics; "Unstructured" data like text is NOT really unstructured - it has a very complex structure; Data Scientists: Key "Programmers" in the Convergence of #BigData, Cloud, Streaming and In; Predictive Modeling Misconceptions: Modeler has to be a PhD, Black box model is OK
- Top KDnuggets tweets, Sep 9-10: Build your own Twitter Sentiment Analysis; How Google Applies #BigData To Know You - Sep 11, 2013.
Very useful: How to build your own Twitter Sentiment Analysis Tool with Datumbox API; How Google Applies #BigData To Know You - very good infographic; Python leads in Open-Source Code Quality, with 5 defects/1 Million lines of code; Yummy! Star Data Scientist @hmason on how to find NYC optimal cheeseburger
- KDnuggets 13:n22, Poll: Has Big Data Reached the Hype Peak? Twitter Analytics: free ebook - Sep 10, 2013.
Latest analytics/data mining news, including Twitter Analytics: free ebook, Data Mining competition lessons, Features (12) | Software (2) | Webcasts (4) | Courses, Events (3) | Meetings (2) | Jobs (4) | Academic (4) | Competitions (3) | Publications (6) | Tweets (6) | News Briefs (1) | CFP (6)
- Top KDnuggets tweets, Sep 6-8: Reasons for the Data Science Mania; Learning Text Mining with R by mining Shakespeare - Sep 9, 2013.
Reasons for the Data Science Mania: Faster hardware, Open Source, Hadoop, R, Cloud, M; Learning Text Mining with R & tm by mining the complete works of William Shakespeare #rsta; Leading Data Scientist Claudia Perlich: Data scientist is to 2013 as "geek" was to 2003 an; My answer to Where can I find large datasets open to the public? http://qr.ae/NTd1O
- LIONbook Chapter 8: Specific nonlinear models - Sep 6, 2013.
The LIONbook on machine learning and optimization, written by co-founders of LionSolver software, is provided free on a chapter-by-chapter basis for personal and non-profit usage. Chapter 8 looks at nonlinear models, from logistic regression to LASSO.
- Top KDnuggets tweets, Sep 4-5: OpenML searches 576K machine learning experiments; R top language for data science, data mining - Sep 6, 2013.
OpenML lets you search 576K machine learning experiments on 130 datasets; Poll: R top language for data science, data mining three years running; Statsoft online Statistics Textbook (free) - great resource for Data Mining, Data Science; Great collection of Data & Visualization Mistakes to avoid: bad pies, misleading bar charts
- Big Data Journal: Call For Papers - Sep 5, 2013.
Big Data, a highly innovative, peer-reviewed journal, is seeking high-quality, innovative submissions: original articles, reviews, commentaries and perspectives, brief reports, point/counterpoint articles, and letters to the editor.
- Evolution from Data Mining to Big Data, Data Science Competitions Lessons, and Public Data Platforms - Sep 4, 2013.
My presentation looks at the Data Mining/Data Science/Big Data evolution, reviews lessons from KDD Cup 1997, Netflix Prize, and Kaggle, presents a big list of Public Data Marketplaces and Platforms, and examines Big Data Hype.
- Top KDnuggets tweets, Sep 2-3: How Big Data/Data Mining improve education; Sentiment Analysis Presentations - Sep 4, 2013.
How Big Data, Educational Data Mining, and Learning Analytics can improve education; Sentiment Analysis Symposium 2013 Presentations publicly available; Ralf Mikut updates his big table of Data Mining Tools; More on R and Python (and SQL) and a small affinity between R and Python users
- Privacy and Big Data: Stanford Online Symposium - Sep 3, 2013.
Experts look at Privacy and Big Data, Fairness of Classification Algorithms, 3 Paradoxes of Big Data, Buying and Selling Privacy, Prediction, Preemption, Presumption, and other topics in the Big Data vs Privacy debate.
- Sentiment Analysis Symposium 2013 Presentations available - Sep 2, 2013.
Watch presentation videos and slide decks from the 2013 Sentiment Analysis Symposium (freely available) and help decide on future conferences in the field by taking a short survey.
- Top KDnuggets tweets, Aug 30 - Sep 1: Twitter Data Analytics, free book download; ZMap scans internet in 45mins - Sep 2, 2013.
Twitter Data Analytics - free book download; ZMap, open source software, scans the entire internet in 45 mins; 10 tips for a successful predictive analytics project; To Go from Big Data to Big Insight, start with a Good Visualization