KDnuggets™ News 15:n08, Mar 11: 7 common Machine Learning mistakes; Statistical Reasoning
7 common mistakes when doing Machine Learning; 10 Predictive Analytics Influencers; Kaiser Fung on Why Statistical Reasoning is more important than Number Crunching; The Elements of Data Analytic Style - checklist; KDD-2017.
Features
- 7 common mistakes when doing Machine Learning - Mar 7, 2015.
In statistical modeling, there are various algorithms to build a classifier, and each algorithm makes a different set of assumptions about the data. For Big Data, it pays off to analyze the data upfront and then design the modeling pipeline accordingly.
- 10 Predictive Analytics Influencers You Need to Know - Mar 5, 2015.
A list of Predictive Analytics Influencers based on Twitter activity around "#PredictiveAnalytics" and "Predictive Analytics": Gregory Piatetsky, Vineet Vashishta, Aki Kakko and more.
- Interview: Kaiser Fung, NYU on Why Statistical Reasoning is more important than Number Crunching - Mar 5, 2015.
We discuss why every individual should care about statistics, inspiration behind the book Numbersense, teaching statistics as liberal arts, Junk Charts blog, advice and more.
- The Elements of Data Analytic Style - checklist - Mar 4, 2015.
Jeff Leek's book "Elements of Data Analytic Style" had a rocket launch, thanks to the author's course on Coursera. The book includes a useful checklist that can guide beginning data analysts or serve for evaluating data analyses.
- Top stories in February: 10 things statistics taught about big data; Gartner Analytics MQ: gainers and losers - Mar 5, 2015.
10 things statistics taught us about big data analysis; Data Science's Most Used, Confused, and Abused Jargon; Gartner Analytics MQ: gainers and losers; My Brief Guide to Big Data and Predictive Analytics for non-experts.
- PAW San Francisco: The Pentagon of Data Analytics Events, Mar 29 - Apr 2 - Mar 10, 2015.
In San Francisco this month, you'll find five data analytics events. Join your peers for case studies and profound keynotes, and build strong connections with data analytics professionals. Special KDnuggets discount.
Software (see also All Software)
- Wrangling Public Bike Share Data with The Free Trial of Trifacta - Mar 6, 2015.
A free trial of Trifacta is a good opportunity for data analysts to start wrangling data sets of different shapes and sizes. We give an example of wrangling Bay Area Bike Share data to better understand biking around San Francisco.
Opinions (see also All Opinions for this month)
- Juergen Schmidhuber AMA: The Principles of Intelligence and Machine Learning - Mar 9, 2015.
Jurgen Schmidhuber, pioneer in innovating Deep Neural Networks, answers questions on open code, general problem solvers, quantum computing, PhD students, online courses, and the neural network research community in this Reddit AMA.
- Failing Optimally - Data Science's Measurement Problem - Mar 4, 2015.
Data science has a measurement problem. Simple metrics may not address complex situations. But complex metrics present myriad problems.
Interviews (see also All Interviews for this month)
- Interview: Slava Akmaev, Berg on Challenges in Transitioning Analytics to Clinical Utility - Mar 10, 2015.
We discuss Analytics use cases, challenges in relating molecular/clinical data to real-life outcomes, Healthcare Analytics trends and more.
- Interview: Slava Akmaev, Berg on Healthcare Transparency & Effectiveness using Big Data - Mar 9, 2015.
We discuss Big Data Analytics at Berg, making Healthcare effective through Big Data, impact of falling cost of DNA sequencing, Berg AI-Analytics Suite and more.
- Interview: Lei Shi, ChinaHR.com on Unraveling Insights from Unstructured Data - Mar 7, 2015.
We discuss challenges in leveraging Big Data, important attributes while profiling employers and job seekers, competitive landscape, desired skills in data scientists and more.
- Interview: Lei Shi, ChinaHR.com on Analytics behind the Perfect Match - Mar 6, 2015.
We discuss analytics at ChinaHR, matching job seekers and employers, traditional job fairs vs online recruitment, key metrics and analytical insights.
- Interview: Kaiser Fung, NYU on Why Ignoring Data Integrity is a Recipe for Disaster - Mar 4, 2015.
We discuss different levels of Data Integrity, logical fallacies in Analytics, measures to boost accountability, role for human intelligence in Analytics and relevance of OCCAM framework.
Reports (see also All Reports for this month)
- Strata + Hadoop World 2015 San Jose - Day 2 Highlights - Mar 10, 2015.
Strata + Hadoop World 2015 was a great conference, and here are key insights from some of the best sessions on day 2.
News (see also All News)
- KDD-2017, top conference on Data Mining, Data Science Research coming to Halifax, Canada - Mar 10, 2015.
The 2017 edition of KDD, the leading conference on Knowledge Discovery, Data Mining, and Data Science Research will be held in Halifax, Nova Scotia, Canada.
- Top stories for Mar 1-7: All Machine Learning Models Have Flaws; Analytics, Data Mining, Data Science professionals salary - Mar 9, 2015.
All Machine Learning Models Have Flaws; Analytics, Data Mining, Data Science professionals well compensated; 7 common mistakes when doing Machine Learning; Interviews with Ted Dunning (MapR) and Kaiser Fung (@junkcharts).
- Top /r/MachineLearning Posts, Mar 1-7: Stanford Deep Learning for NLP, Machine Learning with Scikit-learn - Mar 9, 2015.
This week on /r/MachineLearning, we have a new NLP-focused deep learning course from Stanford, an introduction to scikit-learn, visualization of music collections, an implementation of DeepMind, and NLP using deep learning and Torch.
- GISCUP: GIS-focused algorithm competition at ACM SIGSPATIAL GIS 2015 - Mar 6, 2015.
Real-time route planning is one of the most popular web GIS services in use today, and the 2015 contest is to find the shortest path under polygonal obstacles.
- Top /r/MachineLearning Posts, February: Automating Tinder, Jurgen Schmidhuber, and Shazam - Mar 5, 2015.
Automating Tinder with Eigenfaces, the elephant in the room of Machine Learning, the Jurgen Schmidhuber AMA, and Shazam's music recognition algorithm make up the top posts in the last month on /r/MachineLearning.
- Top /r/MachineLearning Posts, Feb 22-28: Jurgen Schmidhuber AMA and Machine Learning Done Wrong - Mar 4, 2015.
This week on /r/MachineLearning: the Jurgen Schmidhuber AMA begins taking questions, machine learning done wrong, GPUs for deep learning, Google opens its native MapReduce capabilities, and Google publishes its DeepMind paper.
Webcasts and Webinars (see also All Webcasts and Webinars)
- Upcoming Webcasts on Analytics, Big Data, Data Science - Mar 10 and beyond - Mar 9, 2015.
Data Wrangling and the Art of Big Data Discovery, Data Mining: Failure to Launch, The State of Hadoop Adoption, Addressing the Challenges of Data Variety, and more.
- Webinar: Data Mining: Failure to Launch [Mar 11] - Mar 9, 2015.
Learn how to get started with predictive modeling and overcome strategic and tactical limitations that cause data mining projects to fall short of their potential. Next webinar is Mar 11.
Courses (see also All Courses)
- Northwestern Online MS in Predictive Analytics - Mar 10, 2015.
Prepare for leadership-level career opportunities, learn from distinguished Northwestern faculty and industry experts, build statistical and analytic expertise. Application deadline Apr 15.
- Online Graduate Certificate in Business Analytics from Penn State - Mar 5, 2015.
This new online Graduate Certificate covers the entire life cycle of data and analytics-supported decision-making and is built around the framework from the Institute for Operations Research and the Management Sciences (INFORMS).
- Careers in Data Science and Business Analytics - NYU Seminar, Mar 28 - Mar 4, 2015.
In an upcoming one-day seminar by NYU School of Professional Studies, Kaiser Fung will explain how to shape a career in data science and business analytics. Date: March 28, 2015. Registration: Open.
Meetings (see also All Meetings)
- PAW San Francisco: The Pentagon of Data Analytics Events, Mar 29 - Apr 2 - Mar 10, 2015.
In San Francisco this month, you'll find five data analytics events. Join your peers for case studies and profound keynotes, and build strong connections with data analytics professionals. Special KDnuggets discount.
- OpenDataSciCon - Data Science for All: What Do You Need? - Mar 10, 2015.
"All you need is knowledge." Open source is what paves the road to that knowledge, and so does the Open Data Science Conference, Boston, May 30-31, 2015.
- NoSQL matters Paris, a conference for developers, architects and geeks, Mar 26-27 - Mar 10, 2015.
An opportunity to network with leading NoSQL experts from all over the world, enjoy mind-blowing talks, or simply have loads of fun hacking away with other participants in Paris, 26-27 Mar 2015.
- Upcoming March - August 2015 Meetings in Analytics, Big Data, Mining, Data Science - Mar 4, 2015.
Coming soon: Big Data Paris, TDWI Solution Summit, GigaOM Structure Data, Chief Data Strategy Forum, PAW San Francisco, Text Analytics World SF, Gartner BI and Analytics Summit, and 80+ more meetings/events.
Jobs (see also All Jobs)
- Alliant: SAS Programmer/Data Analyst, Direct Marketing/Credit - Mar 10, 2015.
Support our Analytics team with responsibility for data manipulation, model scoring and report generation.
- Civis Analytics: VP / Director of Applied Data Science - Mar 10, 2015.
An experienced analytics leader to guide a team of client-facing data scientists, who work with intelligent organizations in healthcare, media, education, and other domains, and build cloud-based tools to do Data Science better.
- VEIC: Energy Data Analysts - Mar 10, 2015.
Work closely with clients to develop and maintain analytical tools that derive useful information from data sources like Smart Meter, sub-meter data, etc. in support of efficiency projects. Apply by Mar 23.
- PeerIQ: Credit Quant - Mar 6, 2015.
Developing statistical and quantitative models to support prepayment, default and loss forecasting; Basel and CCAR; pricing and valuation; and other economic calculations.
- Criteo: Senior Data Scientist, Machine Learning - Mar 5, 2015.
Outstanding machine learning research scientists to contribute to Criteo in driving the future of ad targeting, personalization, content extraction, content matching and other prediction problems.
Academic and Research positions (see also All Academic positions)
- ICS (Prague): AVAST Fellowship in machine learning and data science - Mar 9, 2015.
Contribute to a range of projects including machine learning applications in computer security and antivirus software using big data analysis of behavioral data.
Top Tweets
- Top KDnuggets tweets, Mar 2-8 - Mar 9, 2015.
How #PayPal uses #DeepLearning and detective work to fight #fraud; Beginning #deeplearning with 500 lines of Julia; Processing frameworks for Hadoop and 6 categories in the #Hadoop Ecosystem; KDnuggets Poll results: #Analytics, #DataMining, #DataScience salary income by region.
Publications
- Chapter Download from "Data Mining Techniques" - Mar 9, 2015.
Download this chapter from "Data Mining Techniques" (3rd Edition), by Gordon Linoff and Michael Berry, and learn how to create derived variables, which allow the statistical modeling process to incorporate human insights.
- Hurwitz: Cognitive Computing and Big Data Analytics - Mar 5, 2015.
New book from Judith Hurwitz and associates on Cognitive Computing is a comprehensive guide to the subject, providing both the theoretical and practical guidance technologists need.
CFP - Calls for Papers (see also All Calls for Papers)
- Due Mar 13, CIKM 2015 Tutorials proposals: The 24th ACM Int. Conf. on Information and Knowledge Management, Melbourne, Australia. 19-23 Oct 2015
- Due Mar 20, PAW: Predictive Analytics World for Healthcare, Boston, MA. Sep 27 - Oct 1, 2015 (Speaker days Sep 28-29)
- Due Mar 20, PAW: Predictive Analytics World for Business, Boston, MA. Sep 27 - Oct 1, 2015 (Speaker days Sep 28-29)
- Due Mar 21, IEEE VIS 2015, Chicago, IL, USA. 25-30 Oct 2015
- Due Mar 26, ECMLPKDD 2015 Industry, Government, NGO Track: The European Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Porto, Portugal. Sep 7-11, 2015
- Due Mar 26, ECMLPKDD 2015 Research Papers: The European Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Porto, Portugal. Sep 7-11, 2015
- Due Mar 29, PAPIs, an international conference on predictive APIs and Apps (just before KDD-2015), Sydney, Australia. 6-7 August, 2015
- Due Mar 31, The 2015 Int. Conf. on Data Mining (DMIN'15), Las Vegas, NV, USA. July 27-30, 2015
- Due Apr 25, IEEE Big Data 2015 Call for Workshop Proposals, Santa Clara, CA, USA. Oct 29 - Nov 1, 2015
- Due Apr 30, ECMLPKDD 2015: Call for Demos, Porto, Portugal. Sep 7-11, 2015
- Due May 1, CIKM 2015: Science For Data theme, Melbourne, Australia. Oct 19-23, 2015
- Due Jul 13, IEEE ICDM 2015 Call For Demos, Atlantic City, NJ, USA. Nov 14-17, 2015