- Popular Machine Learning Interview Questions - Jan 20, 2021.
Get ready for your next job interview requiring domain knowledge in machine learning with answers to these eleven common questions.
Bias, Confusion Matrix, Interview Questions, Machine Learning, Overfitting, Variance
- Can you trust AutoML? - Dec 23, 2020.
Automated Machine Learning, or AutoML, tries hundreds or even thousands of different ML pipelines to deliver models that often beat the experts and win competitions. But, is this the ultimate goal? Can a model developed with this approach be trusted without guarantees of predictive performance? The issue of overfitting must be closely considered because these methods can lead to overestimation -- and the Winner's Curse.
Accuracy, AutoML, Cross-validation, Machine Learning, Model Performance, Overfitting
- 6 Common Mistakes in Data Science and How To Avoid Them - Sep 10, 2020.
As a novice or seasoned Data Scientist, your work depends on the data, which is rarely perfect. Properly handling the typical issues with data quality and completeness is crucial, and we review how to avoid six of these common scenarios.
Advice, Data Quality, Data Science, Hyperparameter, Mistakes, Overfitting
- 4 ways to improve your TensorFlow model – key regularization techniques you need to know - Aug 27, 2020.
Regularization techniques are crucial for preventing your models from overfitting and enable them to perform better on your validation and test sets. This guide provides a thorough overview with code of four key approaches you can use for regularization in TensorFlow.
Machine Learning, Overfitting, Regularization, TensorFlow
- Fighting Overfitting in Deep Learning - Dec 27, 2019.
This post outlines an attack plan for fighting overfitting in neural networks.
Deep Learning, Keras, Neural Networks, Overfitting, Python, Regularization, Transfer Learning
- 5 Techniques to Prevent Overfitting in Neural Networks - Dec 6, 2019.
In this article, I will present five techniques to prevent overfitting while training neural networks.
Neural Networks, Overfitting
- Reproducibility, Replicability, and Data Science - Nov 19, 2019.
As cornerstones of scientific processes, reproducibility and replicability ensure results can be verified and trusted. These two concepts are also crucial in data science, and as a data scientist, you must follow the same rigor and standards in your projects.
Best Practices, Data Science, Overfitting, Reproducibility, Trust, Validation
- Generalization in Neural Networks - Nov 18, 2019.
When training a neural network in deep learning, its performance on processing new data is key. Improving the model's ability to generalize relies on preventing overfitting using these important methods.
Complexity, Deep Learning, Dropout, Neural Networks, Overfitting, Regularization, Training Data
- 6 bits of advice for Data Scientists - Sep 25, 2019.
As a data scientist, you can get lost in your daily dives into the data. Follow this advice in your work to be more diligent and more impactful for your organization.
Advice, Data Cleaning, Data Scientist, Metrics, Overfitting, Statistics
- The Hidden Risk of AI and Big Data - Sep 20, 2019.
With recent advances in AI being enabled through access to so much “Big Data” and cheap computing power, there is incredible momentum in the field. Can big data really deliver on all this hype, and what can go wrong?
AI, Big Data, Causation, Correlation, Overfitting, Risks
- Common Machine Learning Obstacles - Sep 9, 2019.
In this blog, Seth DeLand of MathWorks discusses two of the most common obstacles, which relate to choosing the right classification model and eliminating data overfitting.
Cross-validation, Decision Trees, Logistic Regression, Machine Learning, MathWorks, Overfitting, SVM
- Can we trust AutoML to go on full autopilot? - Jul 31, 2019.
We put an AutoML tool to the test on a real-world problem, and the results are surprising. Even with automatic machine learning, you still need expert data scientists.
Automated Machine Learning, AutoML, Overfitting, Time Series
- Careful! Looking at your model results too much can cause information leakage - May 24, 2019.
We are all aware of the issue of overfitting, which is essentially where the model you build fits the training data so perfectly that it does not generalise to the population the data comes from, so that feeding in new data produces very odd, sometimes catastrophic, results.
Cross-validation, Modeling, Overfitting, Validation
- How To Fine Tune Your Machine Learning Models To Improve Forecasting Accuracy - Jan 23, 2019.
We explain how to retrieve estimates of a model's performance using scoring metrics, before taking a look at finding and diagnosing the potential problems of a machine learning algorithm.
Cross-validation, Forecasting, Machine Learning, Overfitting, Time Series
- Why Ice Cream Is Linked to Shark Attacks – Correlation/Causation Smackdown - Jan 19, 2019.
Why are soda and ice cream each linked to violence? This article delivers the final word on what people mean by "correlation does not imply causation."
Causality, Causation, Correlation, Overfitting
- Why Vegetarians Miss Fewer Flights – Five Bizarre Insights from Data - Jan 12, 2019.
A frenzy of number-crunching is churning out a heap of insights that are colorful, sometimes surprising, and often valuable. We explain how this works, and investigate five bizarre discoveries found in data.
Credit Risk, Eric Siegel, Healthcare, Machine Learning, Overfitting, Uber
- The brain as a neural network: this is why we can’t get along - Dec 19, 2018.
This article sets out to answer the question: what insights can we gain about ourselves by thinking of the brain as a machine learning model?
Brain, Confirmation Bias, Neural Networks, Overfitting, Politics
- Labeling Unstructured Text for Meaning to Achieve Predictive Lift - Oct 31, 2018.
In this post, we examine several advanced NLP techniques, including: labeling nouns and noun phrases for meaning, labeling (most often) adverbs and adjectives for sentiment, and labeling verbs for intent.
NLP, Overfitting, Text Mining, Unstructured data
- Improving the Performance of a Neural Network - May 30, 2018.
There are many techniques available that could help us improve the performance of a neural network. Follow along to get to know them and to build your own accurate neural network.
Ensemble Methods, Hyperparameter, Neural Networks, Overfitting, Tips
- 8 Common Pitfalls That Can Ruin Your Prediction - Mar 21, 2018.
A good prediction can help your work and make it easier. But how can you be sure that your prediction is good? Here are some common pitfalls that you should avoid.
Advice, Data Science, Outliers, Overfitting, Predictive Analytics
- Regularization in Machine Learning - Jan 10, 2018.
Regularization is a technique that helps to avoid overfitting and also make a predictive model more understandable.
Machine Learning, Overfitting, Regularization
- 4 Common Data Fallacies That You Need To Know - Dec 5, 2017.
In this post you will find a list of the common data fallacies that lead to incorrect conclusions and poor decision-making using data. Here you will find great resources and information so that you can always be reminded of these fallacies when you’re working with data.
Causality, Overfitting, Simpson's Paradox
- Understanding overfitting: an inaccurate meme in Machine Learning - Aug 23, 2017.
"Applying cross-validation prevents overfitting" is a popular meme, but it is not actually true; it is more of an urban legend. We examine what is true and how overfitting is different from overtraining.
Cross-validation, Machine Learning, Overfitting
- Making Predictive Models Robust: Holdout vs Cross-Validation - Aug 11, 2017.
The validation step helps you find the best parameters for your predictive model and prevent overfitting. We examine pros and cons of two popular validation strategies: the hold-out strategy and k-fold.
Cross-validation, Dataiku, Overfitting
- The Truth About Bayesian Priors and Overfitting - Jul 25, 2017.
Many of the considerations we will run through will be directly applicable to your everyday life of applying Bayesian methods to your specific domain.
Bayesian, Overfitting
- How to Lie with Data - Apr 20, 2017.
We expect data scientists to be objective, but intentionally or not, they can produce results that mislead. We examine three common types of “lies” that Data Scientists should be aware of.
Confirmation Bias, Data Visualization, Mistakes, Overfitting
- Proxy Indicators: beware of spurious claims - Mar 16, 2017.
Beware of online and market research studies which can lead to false or spurious claims. We examine several notable examples including Google Street View and Argentina inflation.
Argentina, Fake News, Google, Market Research, Overfitting
- 17 More Must-Know Data Science Interview Questions and Answers, Part 2 - Feb 22, 2017.
The second part of 17 new must-know Data Science Interview questions and answers covers overfitting, ensemble methods, feature selection, ground truth in unsupervised learning, the curse of dimensionality, and parallel algorithms.
Algorithms, Data Science, Ensemble Methods, Feature Engineering, Feature Selection, High-dimensional, Interview Questions, Overfitting, Unsupervised Learning
- 17 More Must-Know Data Science Interview Questions and Answers - Feb 15, 2017.
17 new must-know Data Science Interview questions and answers include lessons from the failure to predict the 2016 US Presidential election and the Super Bowl LI comeback, understanding bias and variance, why fewer predictors might be better, and how to make a model more robust to outliers.
Anomaly Detection, Bias, Classification, Data Science, Donald Trump, Interview Questions, Outliers, Overfitting, Variance
- Sound Data Science: Avoiding the Most Pernicious Prediction Pitfall - Jan 5, 2017.
Data science and predictive analytics can provide huge value, but they can mislead and backfire if not used with fail-safe measures. The author gives examples of such problems and provides guidelines to avoid them.
Advice, Data Science, Model Performance, Overfitting, Predictive Analytics, Statistical Modeling
- 4 Reasons Your Machine Learning Model is Wrong (and How to Fix It) - Dec 21, 2016.
This post presents some common scenarios where a seemingly good machine learning model may still be wrong, along with a discussion of how to evaluate these issues by assessing metrics of bias vs. variance and precision vs. recall.
Bias, Overfitting, Variance
- Data Science Basics: 3 Insights for Beginners - Sep 22, 2016.
For data science beginners, 3 elementary issues are given overview treatment: supervised vs. unsupervised learning, decision tree pruning, and training vs. testing datasets.
Algorithms, Beginners, Datasets, Overfitting, Supervised Learning, Unsupervised Learning
- A Neat Trick to Increase Robustness of Regression Models - Aug 22, 2016.
Read this take on the validity of choosing a different approach to regression modeling. Why isn't L1 norm used more often?
CleverTap, Linear Regression, Outliers, Overfitting, Regression
- Troubleshooting Neural Networks: What is Wrong When My Error Increases? - May 13, 2016.
An overview of some of the things that could lead to an increased error rate in neural network implementations.
Deep Learning, Neural Networks, Overfitting
- The Mirage of a Citizen Data Scientist - Mar 1, 2016.
The term "citizen data scientist" has been irritating me recently. I explain why I think it is both a bad term and a bad idea, and what we need instead.
Citizen Data Scientist, Data Analyst, Data Scientist, Gartner, Overfitting
- 21 Must-Know Data Science Interview Questions and Answers, part 2 - Feb 20, 2016.
Second part of the answers to 20 Questions to Detect Fake Data Scientists, including controlling overfitting, experimental design, tall and wide data, understanding the validity of statistics in the media, and more.
Anomaly Detection, Data Science, Data Visualization, Overfitting, Recommender Systems
- Big Idea To Avoid Overfitting: Reusable Holdout to Preserve Validity in Adaptive Data Analysis - Aug 17, 2015.
Big Data makes it all too easy to find spurious "patterns" in data. A new approach helps avoid overfitting by using 2 key ideas: validation should not reveal any information about the holdout data, and a small amount of noise should be added to any validation result.
Holdout, Model Performance, Overfitting, P-value, Vitaly Feldman
- Surprising Random Correlations - May 14, 2015.
An interesting demo showing how easy it is to find surprising correlations in real data. Is German unemployment rate related to Apple Stock? Is 10-year Treasury rate related to price of Red Winter Wheat? You will be surprised.
Correlation, Overfitting, Quandl, Random
- Data Science 101: Preventing Overfitting in Neural Networks - Apr 17, 2015.
Overfitting is a major problem for Predictive Analytics and especially for Neural Networks. Here is an overview of key methods to avoid overfitting, including regularization (L2 and L1), Max norm constraints and Dropout.
Neural Networks, Nikhil Buduma, Overfitting, Regularization
- 7 common mistakes when doing Machine Learning - Mar 7, 2015.
In statistical modeling, there are various algorithms to build a classifier, and each algorithm makes a different set of assumptions about the data. For Big Data, it pays off to analyze the data upfront and then design the modeling pipeline accordingly.
Machine Learning, Mistakes, Overfitting, Regression, SVM
- 10 things statistics taught us about big data analysis - Feb 10, 2015.
There are 10 ideas in applied statistics that are relevant for big data analysis, focusing on prediction accuracy, interactive analysis and more.
Best Practices, Big Data, Overfitting, Statistics
- 11 Clever Methods of Overfitting and how to avoid them - Jan 2, 2015.
Overfitting is the bane of Data Science in the age of Big Data. John Langford reviews "clever" methods of overfitting, including traditional, parameter tweak, brittle measures, bad statistics, human-loop overfitting, and gives suggestions and directions for avoiding overfitting.
Cross-validation, John Langford, Overfitting
- Big Data accelerates medical research? Or not? - Oct 26, 2014.
Take a look at how big data in healthcare brings big opportunities, but along with those opportunities comes great risk if statistics aren't carefully applied to those large datasets.
Big Data, Healthcare, Overfitting, Research
- The Cardinal Sin of Data Mining and Data Science: Overfitting - Jun 14, 2014.
Overfitting leads to the public losing trust in research findings, many of which turn out to be false. We examine some famous examples, including "the decline effect" and Miss America age, and suggest approaches for avoiding overfitting.
Dean Abbott, John Ioannidis, Kirk D. Borne, Overfitting, S&P 500