Data Mining / Analytic Publications News, Jul 2013
- Top KDnuggets tweets, July 29-30: John Tukey coined "Software" and "bit"; Words matter: Adding "terrorism" - Jul 31, 2013.
John Tukey, a great statistician, coined the words "Software" and "bit"; Words matter: Adding "terrorism" to a poll question makes 10% more people approve NSA; MIT prof. is working on Quantum Machine Learning; 4 Common Stats Errors to avoid: 1. Base Rate Fallacy 2. Extrapolation 3. Correlation != Causation
- Text mining for science and technology: citation and discovery - Jul 31, 2013.
This paper addresses 3 complementary components of text mining: Citation scientometrics, seminal literature reviews, and literature-related discovery and innovation.
- KDnuggets 13:n18, Analytics Education Poll; Public data sites; Data Mining "Nobel" - Jul 31, 2013.
Latest analytics/data mining news, including Features (8) | Software (4) | Webcasts (2) | Courses, Events (5) | Meetings (1) | Jobs (14) | Academic (2) | Competitions (2) | Publications (6) | Tweets (6) | NewsBriefs (4) | CFP (11)
- Book: Data Clustering: Algorithms and Applications - Jul 29, 2013.
The chapters are carefully constructed to cover the area of clustering comprehensively with up-to-date surveys, making this book accessible to beginning data scientists and analysts.
- McKinsey eBook (free): Big Data, Analytics, and the Future of Marketing and Sales - Jul 29, 2013.
This ebook from McKinsey explores the business opportunities, company examples, and organizational implications of Big Data and advanced analytics.
- Top KDnuggets tweets, July 26-28: Statistical Data Analysis in Python and Pandas; Julia - a high-level scientific language - Jul 29, 2013.
Statistical Data Analysis in Python and Pandas, SciPy2013 Tutorial; A Beginner look at Julia - a high-level scientific language; Statisticians are envious, asking "Aren't We Data Science?" No; Intuitive Classification and Clustering using kNN and Python, well explained
- 5 Roles You Need on Your Big Data Team - Jul 27, 2013.
Getting value from Big Data is not just about hiring the best talent; it also requires paying enough attention to people. Equally important is identifying the roles a company really needs.
- Top KDnuggets tweets, July 24-25: Big collection of data sites, services; July 31 webinar: Collaborative Filtering - how to with R - Jul 26, 2013.
Must fave! Excellent collection of sites, services, marketplaces, and APIs for data; July 31 webinar: Collaborative Filtering, turn your visitors into customers; Want many #BigData infographics in one place? Here is a big board on Pinterest; Kaggle and NLP Logix Chief Scientist resolve dispute from flight prediction contest
- LionBook Chapter 5: Mastering generalized linear least-squares - Jul 24, 2013.
After reading this chapter you are expected to improve from a casual modeler to a professional least-squares guru. Losing accuracy is not a weakness but a strength, an opportunity to create more powerful models by simplifying the analysis.
- Top KDnuggets tweets, July 22-23: The Rise of DIY Data Scientist: most of Kaggle competition w; An excellent introduction to MapReduce and Hadoop in Courser - Jul 24, 2013.
The Rise of DIY Data Scientist: most of Kaggle competition winners took Machine Learning on Coursera; An excellent intro to MapReduce and Hadoop; How Text Analysis found that J. K. Rowling was the author of "The Cuckoo's Calling"; Overview: Data Science with Hadoop, by Hortonworks
- Top KDnuggets tweets, July 19-21: All R packages and manuals, searchable; Data Science + Gamification + Online training = Datamind - Jul 22, 2013.
R documentation - all R packages and manuals, searchable; Data Science + Gamification + Online training = DataMind; How to meaningfully use Twitter Analytics, Facebook Insights; Tableau Online, now offers Visual Analytics in the Cloud, connects live to 150 sources
- CIO 10 Top Big Data Startups - Jul 21, 2013.
The final ranking is based on reader votes, but also big-name end users, VC funding, the management team and market positioning.
- Top KDnuggets tweets, July 17-18: Good tutorial: Machine Learning on Big Data; The Amazing 3D Topography of Tweets - Jul 19, 2013.
Good tutorial (93 slides): Machine Learning on #BigData; Very cool: The Topography of Tweets: Amazing 3D interactive viz; What do data scientists and data miners listen to?; Data Science 101 - Five data preparation mistakes to avoid
- Top KDnuggets tweets, July 15-16: Unix commands for data mining; SW engineering principles for data science - Jul 18, 2013.
Useful Unix commands for data mining; SW engineering principles every data scientist should know; The Big Data Job Boom; The more confident an expert is in prediction ...
- Getting Started with Amazon Redshift - Jul 17, 2013.
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service. This step-by-step, practical guide to the world of Redshift teaches you how to load, manage, and query data on Redshift.
- KDnuggets 13:n17, Analytics Education? KDnuggets Summer reading list; NSA and cat videos cartoon - Jul 17, 2013.
Latest analytics/data mining news, including Features (10) | Software (2) | Webcasts (3) | Courses, Events (3) | Meetings (4) | Jobs (4) | Academic (4) | Competitions (1) | Publications (5) | Tweets (6) | NewsBriefs (1) | CFP (11)
- The Hymn of Acxiom - Sacred music for the age of Big Data? - Jul 16, 2013.
A singer who spent a summer as an intern at Acxiom, working on databases of consumer profiles, wrote The Hymn of Acxiom - "let our formulas find your soul".
- Top KDnuggets tweets, July 12-14: DataMind: FREE Online Interactive Learning Platform for R; From Whale Calls to Dark Matter: Competitive Data Science - Jul 15, 2013.
DataMind: FREE Online Interactive Learning Platform for R; From Whale Calls to Dark Matter: Competitive Data Science with R and Python; Why Anthropology and data science need each other - thought-provoking; Great infographic: Becoming a Data Scientist - Curriculum via Metromap
- Top KDnuggets tweets, July 10-11: A subway map of 10 important skills; Applied Data Science Columbia textbook - Jul 12, 2013.
Great chart -> A subway map of 10 important skills on the Road to Data Scientist; Applied Data Science Columbia textbook, for techies; Free ebook: Intro to Data Science, by J. Stanton, a gentle intro; 10 Coolest #BigData Startups of 2013
- LionBook Chapter 4: Linear Models - Jul 12, 2013.
The deep reason why linear models work well is the smoothness underlying many physical phenomena. So when confronted with a difficult problem, try linear equations first and you are likely to either solve it or at least come up with a workable approximation.
- Top KDnuggets tweets, July 8-9: Data Scientists vs. Data Engineers; Understanding Hadoop MapReduce - Jul 10, 2013.
Data Scientists vs. Data Engineers - what are their roles in the organization; For aspiring Big Data Scientists: Understanding Hadoop MapReduce; Data Science and Big Data Analytics - Free Course Module; Great Google tools for analysis & discovery
- LionBook Chapter 3: Learning requires a method - Jul 10, 2013.
Real learning is associated with extracting the deep and basic relationships in a phenomenon, with summarizing with short models a wide range of events, with unifying different cases by discovering the underlying explanatory laws. This chapter explains the bias-variance dilemma.
- Analyzing the Analyzers - A Survey of Data Scientists - free ebook - Jul 9, 2013.
Interesting patterns emerge from clustering their skills, activities, education, and self-identification. The report combines analytics results with insights and argues for clearer communication around roles, teams, and careers.
- MassTLC Big Data Summit Report: How Data Analytics Will Transform Our World - Jul 9, 2013.
The MassTLC summit discussed connected cities, connected health, personalized medicine, and data mashups - how to derive customer value.
- Top KDnuggets tweets, July 5-7: Good #BigData summer reading list; Data Mining and Jay-Z - Jul 8, 2013.
Good #BigData summer reading list; Data Mining and Rap? Jay-Z's new album is a massive data-mining operation; 7 #BigData Definitions, also: Data is Big when its size becomes part of the problem; Search theory and #BigData: Applying Bayesian math that sank U-boats to intelligence
- Top KDnuggets tweets, July 3-4: ebook: Instant Weka How-to; Here is a tasty #BigData app - Food Genius - Jul 5, 2013.
ebook: Instant Weka How-to; Here is a tasty #BigData app! Food Genius analyzes data, trends from 22M restaurant items; How economist Susan Athey mastered Big Data, auctions to help shape the web; It feels like Friday! Data Scientist cartoons, including LOLCATS, #BigData, and God
- ebook: Instant Weka How-to - Jul 4, 2013.
A practical guide with examples and applications of programming Weka in Java. Start with the basics and learn how to include Weka machinery in your Java application.
- Top KDnuggets tweets, July 1-2: What makes people click; Analyzing the Analyzers - a map of data scientists - Jul 3, 2013.
Machine learning on 2 million news stories creates a model of what makes people click; Analyzing the Analyzers - a survey maps the different types and skills; This is important! Social Networks send buyers into Brick and Mortar Stores; 30 upcoming Meetings, Conferences on Analytics, Big Data, Data Mining, and Knowledge Discovery
- KDnuggets 13:n16, New SIGKDD Chair; Stanford Learning Analytics Online; Data Science Jobs - Jul 3, 2013.
Latest analytics/data mining news, including Features (7) | Software (1) | Webcasts (1) | Courses, Events (2) | Meetings (2) | Jobs (13) | Academic (1) | Publications (4) | Tweets (3) | NewsBriefs (9) | CFP (6)
- Caltech Symposium: Emerging Science of Big Data Visualization - Jul 2, 2013.
Computer scientists, artists, and designers gathered to discuss the emerging science of big-data visualization. Watch talks on Interactive Data Analysis, Reduction and Revelation, When Art and Analytics Overlap, Communicating Science to the Public, Visualizing Natural and Cultural Phenomena, and more.
- Top KDnuggets tweets, June 24-30: Getting started with different tools; Datameer 3.0: machine learning for Hadoop - Jul 1, 2013.
Practical Analytics - getting started with different tools: RapidMiner, R, Predixion; Datameer 3.0: first ever point-click machine learning functions for Hadoop; You can create Word documents with R2DOCX; What engineers learn at Facebook #BigData bootcamp
- IDC Forecasts Strong Growth for Business Analytics Software Market - Jul 1, 2013.
The worldwide business analytics software market is expected to grow at a 9.7% CAGR through 2017, according to a new forecast from IDC. In 2012 the market grew only 8.7%, much slower than the 15% growth in 2011. The data warehousing sector grew faster in 2012 than the BI/Analytics sector.