- Adversarial Validation Overview - Feb 13, 2020.
Learn how to implement adversarial validation that builds a classifier to determine if your data is from the training or testing sets. If you can do this, then your data has issues, and your adversarial validation model can help you diagnose the problem.
- Why are Machine Learning Projects so Hard to Manage? - Feb 3, 2020.
What makes deploying a machine learning project so difficult? Is it the expectations? The people? The tech? There are common threads to these challenges, and best practices exist to deal with them.
- I wanna be a data scientist, but… how? - Jan 20, 2020.
It’s easy to say "I wanna be a data scientist," but... where do you start? How long does it take to become attractive to employers? Do you need a Master’s degree? Do you need to know every mathematical concept ever derived? The journey might be long, but follow this plan to help you keep moving forward toward your career goal.
- Choosing a Machine Learning Model - Oct 14, 2019.
Selecting the perfect machine learning model is part art and part science. Learn how to review multiple models and pick the best in both competitive and real-world applications.
- Is Kaggle Learn a “Faster Data Science Education?” - Aug 20, 2019.
Kaggle Learn is "Faster Data Science Education," featuring micro-courses covering an array of data skills for immediate application. Courses may be made with newcomers in mind, but the platform and its content are proving useful as a review for more seasoned practitioners as well.
- What 70% of Data Science Learners Do Wrong - Aug 2, 2019.
Lessons learned from repeatedly smashing my head with a 2-meter long metal pole for a college engineering course.
- KDnuggets™ News 19:n27, Jul 24: Bayesian deep learning and near-term quantum computers; DeepMind’s CASP13 Protein Folding Upset Summary - Jul 24, 2019.
This week on KDnuggets: Learn how DeepMind dominated the last CASP competition for advancing protein folding models; Bayesian deep learning and near-term quantum computers: A cautionary tale in quantum machine learning; The Evolution of a ggplot; Adapters: A Compact and Extensible Transfer Learning Method for NLP; 12 Things I Learned During My First Year as a Machine Learning Engineer; Things I Learned From the SciPy 2019 Lightning Talks; and much more!
- Kaggle Kernels Guide for Beginners: A Step by Step Tutorial - Jul 23, 2019.
This is an attempt to hold a complete beginner's hand and walk them through the world of Kaggle Kernels so they can get started.
- Show off your Data Science skills with Kaggle Kernels - Jun 14, 2019.
Kaggle is not just about data science competitions. It also has a platform called Kaggle Kernels, which you can use to build a stellar data science portfolio.
- The Hitchhiker’s Guide to Feature Extraction - Jun 3, 2019.
Check out this collection of tricks and code for Kaggle and everyday work.
- What my first Silver Medal taught me about Text Classification and Kaggle in general? - May 13, 2019.
A first-hand account of ideas tried by a competitor at the recent Kaggle competition 'Quora Insincere Questions Classification', with a brief summary of some of the other winning solutions.
- Explainable AI or Halting Faulty Models ahead of Disaster - Mar 27, 2019.
A brief overview of a new method for explainable AI (XAI) called anchors, an introduction to its open-source implementation, and a demonstration of how to use it to explain models predicting the survival of Titanic passengers.
- Good Feature Building Techniques and Tricks for Kaggle - Dec 31, 2018.
A selection of top tips to obtain great results on Kaggle leaderboards, including useful code examples showing how best to use Latitude and Longitude features.
- My secret sauce to be in top 2% of a Kaggle competition - Nov 26, 2018.
A collection of top tips on ways to explore features and build better machine learning models, including feature engineering, identifying noisy features, leakage detection, model monitoring, and more.
- How many data scientists are there and is there a shortage? - Sep 18, 2018.
We examine the famous McKinsey prediction from 2011, look into whether there is a shortage of people with analytical expertise, and estimate how many Data Scientists there are.
- An Introduction to Deep Learning for Tabular Data - May 17, 2018.
This post will discuss a technique that many people don’t even realize is possible: the use of deep learning for tabular data, and in particular, the creation of embeddings for categorical variables.
- How I Used CNNs and Tensorflow and Lost a Silver Medal in Kaggle Challenge - May 8, 2018.
I joined the competition a month before it ended, eager to explore how to use Deep Natural Language Processing (NLP) techniques for this problem. Then came the deception. And I will tell you how I lost my silver medal in that competition.
- To Kaggle Or Not - May 2, 2018.
Kaggle is the best-known competition platform for predictive modeling and analytics. This article looks into the different aspects of Kaggle and the benefits it can bring to data scientists.
- How Do I Get My First Data Science Job? - Apr 2, 2018.
Here are the steps you need to obtain your first job in data science, including details on how to create a good portfolio, key networking tips, getting the right education and managing expectations.
- KDnuggets – Favorite Data Science / Machine Learning Blog - Mar 12, 2018.
In a recent Kaggle Machine Learning and Data Science Survey, KDnuggets was no. 1 among favorite Data Science Blogs, Podcasts, or Newsletters.
- The Doing Part of Learning Data Science - Feb 6, 2018.
Consider this a beginner’s answer to “Studied Basics, What Next?”
- The Art of Learning Data Science - Jan 9, 2018.
A beginner’s account of getting into the comfort zone of learning Data Science.
- Data Science for Laymen: 5 Writers Who Speak Your Language - Dec 28, 2017.
Here are 5 excellent Data Scientists who are also very good at explaining concepts and interacting with you.
- Industry Predictions: Main AI, Big Data, Data Science Developments in 2017 and Trends for 2018 - Dec 19, 2017.
Here is a treasure trove of analysis and predictions from 17 leading companies in AI, Big Data, Data Science, and Machine Learning: What happened in 2017 and what will 2018 bring?
- Top KDnuggets tweets, Nov 08-14: Approaching (Almost) Any NLP Problem on #Kaggle; Choosing an Open Source #MachineLearning Library - Nov 15, 2017.
Also: What is the difference between Bagging and Boosting?; Which #Python package manager should you use?; The Practical Importance of Feature Selection.
- XGBoost: A Concise Technical Overview - Oct 27, 2017.
Interested in learning the concepts behind XGBoost, rather than just using it as a black box? Or are you looking for a concise introduction to XGBoost? Then this article is for you. Includes a Python implementation and links to other basic Python and R code as well.
- XGBoost, a Top Machine Learning Method on Kaggle, Explained - Oct 3, 2017.
Looking to boost your machine learning competition score? Here’s a brief summary and introduction to a powerful and popular tool among Kagglers, XGBoost.
- How to win Kaggle competition based on NLP task, if you are not an NLP expert - Sep 29, 2017.
Here is how we got one of the best results in a Kaggle challenge remarkable for a number of interesting findings and controversies among the participants.
- Python vs R – Who Is Really Ahead in Data Science, Machine Learning? - Sep 12, 2017.
We examine Google Trends, job trends, and more and note that while Python has only a small advantage among current Data Science and Machine Learning related jobs, this advantage is likely to increase in the future.
- Lessons Learned From Benchmarking Fast Machine Learning Algorithms - Aug 16, 2017.
Boosted decision trees are responsible for more than half of the winning solutions in machine learning challenges hosted at Kaggle, and require minimal tuning. We evaluate two popular tree boosting software packages: XGBoost and LightGBM and draw 4 important lessons.
- Improving Zillow Zestimate with 36 Lines of Code - Jul 7, 2017.
We built this project as a quick and easy way to leverage some of the amazing technologies that are being built by the data science community!
- How Feature Engineering Can Help You Do Well in a Kaggle Competition – Part 3 - Jul 4, 2017.
In this last post of the series, I describe how I used more powerful machine learning algorithms for the click prediction problem, as well as the ensembling techniques that took me up to the 19th position on the leaderboard (top 2%).
- How Feature Engineering Can Help You Do Well in a Kaggle Competition – Part 2 - Jun 27, 2017.
In this post, I describe the competition evaluation, the design of my cross-validation strategy, and my baseline models using statistics and tree ensembles.
- How Feature Engineering Can Help You Do Well in a Kaggle Competition – Part I - Jun 8, 2017.
As I scrolled through the leaderboard page, I found my name in the 19th position, which was the top 2% from nearly 1,000 competitors. Not bad for the first Kaggle competition I had decided to put a real effort in!
- Top /r/MachineLearning Posts, March: A Super Harsh Guide to Machine Learning; Is it Gaggle or Koogle?!? - Apr 4, 2017.
A Super Harsh Guide to Machine Learning; Google is acquiring data science community Kaggle; Suggestion by Salesforce chief data scientist; Andrew Ng resigning from Baidu; Distill: An Interactive, Visual Journal for Machine Learning Research
- Bad Data + Good Models = Bad Results - Jan 26, 2017.
No matter how advanced your Machine Learning algorithm is, the results will be bad if the input data is bad. We examine one popular IMDB dataset and discuss how an analyst can deal with such data.
- Going to War with the Giants: Automated Machine Learning with MLJAR - Jan 19, 2017.
The performance of automated machine learning tool MLJAR on Kaggle competition data is presented in comparison with those from other predictive APIs from Amazon, Google, PredicSis and BigML.
- Exclusive: Interview with Jeremy Howard on Deep Learning, Kaggle, Data Science, and more - Jan 14, 2017.
My exclusive interview with rock star Data Scientist Jeremy Howard, on his latest Deep Learning course, what is needed for success in Kaggle, how Enlitic is transforming medical diagnostics, and what Data Scientists should do to create value for their organization.
- First Deep Learning for coders MOOC launched by Jeremy Howard - Dec 21, 2016.
Leading Data Scientist and entrepreneur Jeremy Howard launches a free Deep Learning course that shows end-to-end how to get state of the art results, including a top place in a Kaggle competition.
- Interviews with Data Scientists: Claudia Perlich - Dec 2, 2016.
In this wide-ranging interview, Roberto Zicari talks to leading Data Scientist Claudia Perlich about what they must know about Machine Learning and evaluation, domain knowledge, data blending, and more.
- KDnuggets™ News 16:n41, Nov 16: Top 10 Amazon Books in Data Mining; Intuitive Explanation of Convolutional Neural Nets - Nov 16, 2016.
Also: An Intuitive Explanation of Convolutional Neural Networks; Data Scientists vs Data Analysts - Part 1; How to Rank 10% in Your First Kaggle Competition.
- How to Rank 10% in Your First Kaggle Competition - Nov 9, 2016.
This post presents a pathway to achieving success in Kaggle competitions as a beginner. The path generalizes beyond competitions, however. Read on for insight into succeeding while approaching any data science project.
- Agilience Top Artificial Intelligence, Machine Learning Authorities - Nov 7, 2016.
Agilience developed a new way to find authorities in social media across many fields of interest. In a previous post we reviewed the top authorities in Data Mining and Data Science; in this post we review top authorities in Artificial Intelligence and Machine Learning, which include Vineet Vashishta, Kirk D. Borne, KDnuggets, James Kobielus, Kaggle, and more.
- Agilience Top Data Mining, Data Science Authorities - Nov 4, 2016.
Agilience developed a new way to find authorities in social media across many fields of interest. We review the top authorities in Data Mining and Data Science, which include KDnuggets, Kirk D. Borne, Kaggle, Vincent Granville, and more.
- Approaching (Almost) Any Machine Learning Problem - Aug 18, 2016.
If you're looking for an overview of how to approach (almost) any machine learning problem, this is a good place to start. Read on as a Kaggle competition veteran shares his pipelines and approach to problem-solving.
- MICCAI 2016 Cancer Radiomics Challenge - Aug 9, 2016.
Details on the ongoing MICCAI 2016 Cancer Radiomics Challenge, organized by the University of Texas MD Anderson Cancer Center radiation oncology team, hosted on Kaggle, and being held until September 12th.
- Data Science Automation: Debunking Misconceptions - Aug 2, 2016.
This opinion piece aims to clear up some proposed misconceptions surrounding data science automation.
- Would You Survive the Titanic? A Guide to Machine Learning in Python Part 3 - Jul 27, 2016.
This is the final part of a 3 part introductory series on machine learning in Python, using the Titanic dataset.
- TalkingData Data Science Competition: understand mobile users - Jul 12, 2016.
A unique opportunity to solve complex real-world big data challenges for the China mobile market: predict users' demographic characteristics based on their app usage, geolocation, and mobile device properties.
- KDnuggets™ News 16:n23, Jun 29: Machine Learning Trends & Future of AI; Data Science Kaggle Walkthrough; Regularization in Logistic Regression - Jun 29, 2016.
- Doing Data Science: A Kaggle Walkthrough Part 6 – Creating a Model - Jun 24, 2016.
In the final part of this 6 part series on the process of data science, and applying it to a Kaggle competition, building the predictive models is covered, and multiple algorithms are discussed.
- Doing Data Science: A Kaggle Walkthrough Part 5 – Adding New Data - Jun 17, 2016.
Here is part 5 of the weekly 6 part series on doing data science in the context of a Kaggle competition, which concentrates on adding in new data.
- Doing Data Science: A Kaggle Walkthrough Part 4 – Data Transformation and Feature Extraction - Jun 10, 2016.
Part 4 of this fantastic 6 part series covering the process of data science, and its application to a Kaggle competition, focuses on feature extraction and data transformation.
- Doing Data Science: A Kaggle Walkthrough Part 3 – Cleaning Data - Jun 3, 2016.
This is part three in a fantastic 6 part series covering the process of data science, and the application of the process to a Kaggle competition. In this episode, data cleaning and preparation are covered.
- Doing Data Science: A Kaggle Walkthrough Part 2 – Understanding the Data - May 27, 2016.
This is the second post in a fantastic 6 part series covering the process of data science, and the application of the process to a Kaggle competition. Read on for a great overview of practicing data science.
- KDnuggets™ News 16:n19, May 25: Explain Machine Learning to Software Engineer; 5 Can’t Miss Machine Learning Projects - May 25, 2016.
How to Explain Machine Learning to a Software Engineer; 5 Machine Learning Projects You Can No Longer Overlook; Doing Data Science: A Kaggle Walkthrough Part 1 - Introduction; The Amazing Power of Word Vectors
- Doing Data Science: A Kaggle Walkthrough Part 1 – Introduction - May 19, 2016.
This is the first post in a fantastic 6 part series covering the process of data science, and the application of the process to a Kaggle competition. Very thorough, and very insightful.
- From Science to Data Science, a Comprehensive Guide for Transition - Apr 12, 2016.
An in-depth, multifaceted, and all-around very helpful roadmap for making the switch from 'science' to 'data science,' yet generally useful for data science beginners or anyone looking to get into data science.
- How To Become A Machine Learning Expert In One Simple Step - Mar 29, 2016.
This post looks at perhaps the most important, and often overlooked, step in learning machine learning, an aspect which can make the biggest difference in one's skill set.
- XGBoost: Implementing the Winningest Kaggle Algorithm in Spark and Flink - Mar 24, 2016.
An overview of XGBoost4J, a JVM-based implementation of XGBoost, one of the most successful recent machine learning algorithms in Kaggle competitions, with distributed support for Spark and Flink.
- Doing Data Science: A Kaggle Walkthrough – Cleaning Data - Mar 23, 2016.
Gain insight into the process of cleaning data for a specific Kaggle competition, including a step by step overview.
- The Data Science Game – Student Competition - Mar 17, 2016.
The Data Science Game returns this year, with university students competing for dominance. Details for this iteration and further information are provided here.
- The Machine Learning Problem of The Next Decade - Feb 26, 2016.
How can businesses integrate imperfect machine-learning algorithms into their workflow?
- KDnuggets™ News 16:n03, Jan 27: Secret to winning Kaggle; Better Dataviz; Where Analytics is applied - Jan 27, 2016.
Learning to Code Neural Networks; The secrets to winning Kaggle; 3 Simple Resolutions to Design Better DataViz; Data Scientist - best job in America.
- Anthony Goldbloom gives you the Secret to winning Kaggle competitions - Jan 20, 2016.
Kaggle CEO shares insights on best approaches to win Kaggle competitions, along with a brief explanation of how Kaggle competitions work.
- KDnuggets™ News 15:n42, Dec 29: Where did you apply Analytics? 5 ways Data Scientists keep learning - Dec 29, 2015.
Poll: Industries/Fields where you applied Analytics, Data Mining, Data Science in 2015; 5 Ways Data Scientists Keep Learning After College; Lessons from 2M Machine Learning Models on Kaggle; 10 BI Trends for 2016.
- Tour of Real-World Machine Learning Problems - Dec 26, 2015.
The tour lists 20 interesting real-world machine learning problems for data science enthusiasts to learn by solving.
- Lessons from 2 Million Machine Learning Models on Kaggle - Dec 24, 2015.
Lessons from Kaggle competitions, including why XG Boosting is the top method for structured problems, Neural Networks and deep learning dominate unstructured problems (visuals, text, sound), and 2 types of problems for which Kaggle is suitable.
- 5 Ways Data Scientists Keep Learning After College - Dec 17, 2015.
Taken from the answers experts gave, here is a compiled list of 5 essential actions and attitudes that keep data scientists learning after their degrees.
- The hardest parts of data science - Nov 24, 2015.
The hardest part of data science is not building an accurate model or obtaining good, clean data, but defining feasible problems and coming up with reasonable ways of measuring solutions.
- The 123 Most Influential People in Data Science - Sep 15, 2015.
We used the LittleBird algorithm to build a true Data Science influencer network by measuring how often influencers retweet other influencers. Top influencers include @hmason, @kdnuggets, @kaggle, @peteskomoroch, @mrogati, and @KirkDBorne.
- KDnuggets™ News 15:n21, Jul 1: Top 20 R packages; Using Ensembles in Kaggle; Tutorials and How-Tos - Jul 1, 2015.
Top 20 R packages by popularity; Tutorials, Overviews, How-Tos; Open Source Enabled Interactive Analytics; Using Ensembles in Kaggle Data Science Competitions.
- Top KDnuggets tweets, Jun 22-29: Kaggle Machine Learning Tutorial in R; 50 Smartest Companies – shaping the #technology landscape - Jun 30, 2015.
Free @Kaggle #MachineLearning Tutorial in R - learn how to compete; 50 Smartest Companies - shaping the #technology landscape; Excellent Tutorial on #Sequence #Learning using #Recurrent #Neural #Networks; How a #DataScientist buys a #car.
- Using Ensembles in Kaggle Data Science Competitions – Part 3 - Jun 27, 2015.
Earlier, we showed how to create stacked ensembles with stacked generalization and out-of-fold predictions. Now we'll learn how to implement various stacking techniques.
- Using Ensembles in Kaggle Data Science Competitions – Part 2 - Jun 26, 2015.
Aspiring to be a Top Kaggler? Learn more methods like Stacking & Blending. In the previous post we discussed ensembling models by way of weighting, averaging, and ranks. There is much more to explore in Part 2!
- Using Ensembles in Kaggle Data Science Competitions – Part 1 - Jun 25, 2015.
How to win Machine Learning Competitions? Gain an edge over the competition by learning Model Ensembling. Take a look at Henk van Veen's insights about how to get improved results!
- Top KDnuggets tweets, Jun 16-22: Deep Learning resources from O’Reilly; Free Kaggle Machine Learning Tutorial in R - Jun 23, 2015.
#DeepLearning resources from @OReillyMedia to help you get started; Free @Kaggle #MachineLearning Tutorial in R - learn how to compete in #DataScience; Data Scientists, enjoy your fat salaries while you can; Computational Aesthetics #Algorithm Spots #Beauty That Humans Overlook.
- Top /r/MachineLearning Posts, May: Unreasonable Effectiveness of Recurrent Neural Networks, Time-Lapse Mining - Jun 1, 2015.
The Unreasonable Effectiveness of Recurrent Neural Networks, Time-lapse mining from Net photos, Deep Learning Textbook Part I, Kaggle R Tutorial, and Free Machine Learning ebooks.
- How to Lead a Data Science Contest without Reading the Data - May 17, 2015.
We examine a “wacky” boosting method that lets you climb the public leaderboard without even looking at the data. But there is a catch, so read on before trying to win Kaggle competitions with this approach.
- Should Data Science Really Do That? - May 13, 2015.
Data Science's amazing progress in prediction and analysis is raising important ethical questions, such as: Should that data be collected? Should the collected data be used for that application? Should you be involved?
- Talking Machine – Deep Learning in Speech Recognition - May 2, 2015.
A summary of a Talking Machine episode about deep neural networks in speech recognition with George Dahl, one of Geoffrey Hinton’s students, who defended his Ph.D. last month.
- Kaggle Competition (Facebook recruiting): Human or Robot? - Apr 28, 2015.
Facebook and Kaggle are launching an Engineering competition for 2015. Leaders will earn an opportunity to interview for a software engineer role at Facebook, working on world-class Machine Learning problems.
In this competition, you'll be chasing down robots for an online auction site.
- Top 10 R Packages to be a Kaggle Champion - Apr 21, 2015.
Kaggle top ranker Xavier Conort shares insights on the “10 R Packages to Win Kaggle Competitions”.
- Top /r/MachineLearning Posts, Apr 5-11: Amazon Machine Learning, Numerical Optimization, and Conditional Random Fields - Apr 14, 2015.
Amazon Machine Learning as a Service, Numerical Optimization, Extracting data from NYTimes recipes, Intro to Machine Learning with scikit-learn, and more.
- Top KDnuggets tweets, Apr 2-5: The Data Science ecosystem: Data wrangling useful tools and tips - Apr 6, 2015.
The #datascience ecosystem part 2: Data wrangling useful tools and tips; 10 R Packages to Win Kaggle Competitions; Forrester Wave #BigData Predictive #Analytics Solutions 2015, gainers, losers; How Microsoft uses Big Data to predict traffic jams in advance.
- KDnuggets™ News 15:n09, Mar 25: Deep Learning from Scratch; 10 steps to Kaggle Success; US CDS DJ Patil Cartoon - Mar 25, 2015.
Deep Learning for Text Understanding from Scratch; New Poll: Computing platform; 10 Steps to Success in Kaggle Data; Cartoon: US Chief Data Scientist Most Difficult Challenge; SQL-like Query Language for Real-time Streaming Analytics.
- Top KDnuggets tweets, Mar 09-11: Learning path from noob to Kaggler in Python; 10 steps for success in Kaggle competitions - Mar 12, 2015.
Comprehensive learning path from noob to Kaggler in Python; 10 steps for success in Kaggle competitions; Machine learning packages #Python #Java #BigData #Lua #Clojure #Scala, R; Very useful LeaRning Path on R - Step by Step Guide.
- Feb 2015 Analytics, Big Data, Data Mining Acquisitions and Startups Activity - Mar 12, 2015.
Feb 2015 acquisitions, startups, and company activity in Analytics, Big Data, Data Mining, and Data Science: @Kaggle cuts 1/3 of staff, Infosys buys Panaya, RapidMiner gets $15M, Palantir buys Fancy That, Hitachi buys Pentaho, and more.
- 10 Steps to Success in Kaggle Data Science Competitions - Mar 11, 2015.
The author, ranked in top 10 in five Kaggle competitions, shares his 10 steps for success. These also apply to any well-defined predictive analytics or modeling problem with a closed dataset.
- OpenDataSciCon – Data Science for All: What Do You Need ? - Mar 10, 2015.
“All you need is knowledge.” Open source is what paves the road to that knowledge and the Open Data Science Conference, Boston, May 30-31, 2015.
- Top KDnuggets tweets, Jan 5-11: Data Driven: Creating a Data Culture; Deep Learning in a Nutshell - Jan 12, 2015.
New book: Data Driven: Creating a Data Culture, by top #DataScience experts; #DeepLearning in a Nutshell: what it is, how does it work, and why; Programming languages popularity by US state; 4 ways to identify the best #data #tools.
- Top KDnuggets tweets, Jan 7-8: Programming languages popularity by US state; Machine Learning best practices from Kaggle competitions - Jan 9, 2015.
Programming languages popularity by US state; Why Ayasdi Topological Data Analysis Works - real data frequently is nonlinear; Learning Data Science and Predictive Modeling at Your Own Pace; Great talk: Machine Learning best practices from Kaggle competitions.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Jan 6, 2015 and beyond - Jan 6, 2015.
Hadoop, Top BI Trends, Enter a KDD Cup or Kaggle, YARN, In-Database Analytics Deep Dive, Data Mining: Failure to Launch, and more.
- Enter a KDD Cup or Kaggle Competition. You don’t need to be an expert! - Jan 4, 2015.
Using KDD Cup 2009 as an example, the webinar will show how Salford TreeNet can quickly achieve a top 5 result, and how to quickly build great models even if you are not an expert.
- National Data Science Bowl: Predict Ocean Health - Dec 16, 2014.
Enter the first-ever National Data Science Bowl, with 175K in prizes, and build an algorithm to automate plankton image identification across 100+ classes. Plankton are critically important to the ecosystem, but traditional methods for measuring their populations are time-consuming and cannot scale to large studies.
- FICO Analytics Competition: Helping Santa Helpers - Dec 11, 2014.
FICO is sponsoring a Kaggle competition to optimize Santa's scheduling algorithm. Competitors can win in multiple categories and the best solution using FICO Xpress Optimization Suite will receive a special prize.
- Boston Data Festival Celebrates Big Data Community, Nov 3-8 - Oct 16, 2014.
Celebrate the big data community, see many world-class speakers, and participate in insightful events at this year's Boston Data Festival. The event takes place November 3-8.
- Top KDnuggets tweets, Oct 10-12: 7 Most Data Rich Companies in the World - Oct 13, 2014.
7 Most Data Rich Companies in the World; R and #DataScience Webinar slides - status, why, code examples; Another list of 200+ #BigData thought leaders to follow on Twitter; Popular #BigData predictive apps and APIs.
- Top KDnuggets tweets, Oct 8-9: Clinical data determines only 10% of health; Kaggle hero 100-line Python code - Oct 10, 2014.
IBM #Watson presentation: Clinical data determines only 10% of health; A @Kaggle hero 100-line Python code for online logistic regression; The Winner of Kaggle Criteo Data Science on his Odyssey; For Data Viz lovers: Keynote by Tableau CEO Christian Chabot on "Art of Analytics".
- Kaggle Epilepsy Seizure Prediction Challenge - Aug 28, 2014.
Create a forecasting system for predicting epileptic seizures in this Kaggle challenge to help improve the lives of epilepsy patients and win prizes. Competition ends on November 17.
- OpenML: Share, Discover and Do Machine Learning - Aug 11, 2014.
OpenML is designed to share, organize and reuse data, code and experiments, so that scientists can make discoveries more efficiently. It is an interesting idea to build a network of machine learning.
- Top KDnuggets tweets, May 16-18: Great find – Intro to Data Science, free download; Why code written by scientists gets ugly - May 19, 2014.
Great find! Intro. to Data Science, v2 (170 pages), free download; Why code written by scientists gets ugly; A Statistician's View on #BigData and Data Science - updated; CIOReview Top 100 Most Promising Big Data Companies.
- KDD Cup 2014 – Predicting Excitement at DonorsChoose.org - May 16, 2014.
Predict which DonorsChoose.org projects will be exciting. The 2014 edition of KDD Cup, the first data mining competition, is on Kaggle. Submissions due June 15.
- Top KDnuggets tweets, Mar 24-25: Is a Data Science Certificate sufficient? Kaggle branches beyond competitions - Mar 26, 2014.
My answer to Is a Data Science Certificate sufficient to become a data scientist? Kaggle branches beyond data mining competitions, will build oil and gas vertical solution; Vicarious, developing brain-inspired machine learning, gets $40M from Mark and Elon; OkCupid "Love" Analytics finds best three questions.
- How Many Data Scientists are out there? - Mar 13, 2014.
We examine Indeed, LinkedIn, Kaggle, and other sources to investigate how many data scientists - in name and in function - are out there, and how strong the demand is.
- Introduction to Random Forests® for Beginners – free ebook - Mar 6, 2014.
Random Forests is one of the most powerful and successful machine learning techniques. This free ebook will help beginners leverage the power of Random Forests.
- Kaggle March Machine Learning Mania - Feb 14, 2014.
Can you turn 20 years of historical data into predictions for 2014 NCAA College Basketball Tournament, aka March Madness? Enter this Intel-sponsored tournament - predictions due Mar 19.
- Top KDnuggets tweets, Feb 10-11: Data scientist cartoon – too busy recommending; Julia: One Language to Rule Them All - Feb 12, 2014.
Data scientist cartoon - too busy recommending things ...; Julia: One Programming Language to Rule Them All; Anaconda: free enterprise-ready Python distribution for large-scale data processing; 10 Most Innovative Companies in #BigData: GE, Kaggle, Ayasdi, IBM, Mount Sinai ...
- FastCompany 10 Most Innovative Companies in Big Data - Feb 10, 2014.
Big Data-driven companies can now map your genome, find the best fit for clothing, and improve student grades, but it all comes at the expense of much reduced privacy. Here are the 10 most innovative Big Data companies according to Fast Company.
- Deep Learning Wins Dogs vs Cats competition on Kaggle - Feb 5, 2014.
A deep learning expert wins the Kaggle Dogs vs Cats image competition with an almost perfect result.