- Can you trust AutoML? - Dec 23, 2020.
Automated Machine Learning, or AutoML, tries hundreds or even thousands of different ML pipelines to deliver models that often beat the experts and win competitions. But, is this the ultimate goal? Can a model developed with this approach be trusted without guarantees of predictive performance? The issue of overfitting must be closely considered because these methods can lead to overestimation -- and the Winner's Curse.
- Here’s what you need to look for in a model server to build ML-powered services - Sep 15, 2020.
More applications are being infused with machine learning while MLOps processes and best practices are becoming well established. Critical to this software and these systems are the servers that run the models, which should offer key capabilities to support productionizing machine learning at enterprise scale.
- Top KDnuggets tweets, Aug 5-11: Unselfie: Translating Selfies to Neutral-pose Portraits in the Wild - Aug 12, 2020.
Unselfie: Translating Selfies to Neutral-pose Portraits in the Wild; How to Evaluate the Performance of Your Machine Learning Model; Deep Learning Most Important Ideas - an excellent review
- Applying Occam’s razor to Deep Learning - Jan 10, 2020.
Finding a deep learning model that performs well is an exciting feat. But might there be other -- less complex -- models that perform just as well for your application? A simple complexity measure based on the statistical physics concept of Cascading Periodic Spectral Ergodicity (cPSE) can help us be computationally efficient by favoring the least complex model during model selection.
- The Ultimate Guide to Model Retraining - Dec 16, 2019.
Once you have deployed your machine learning model into production, differences in real-world data will result in model drift. So, retraining and redeploying will likely be required. In other words, deployment should be treated as a continuous process. This guide defines model drift and how to identify it, and includes approaches to enable model training.
- Machine Learning 101: The What, Why, and How of Weighting - Nov 26, 2019.
Weighting is a technique for improving models. In this article, learn more about what weighting is, why you should (and shouldn’t) use it, and how to choose optimal weights to minimize business costs.
- From Data Pre-processing to Optimizing a Regression Model Performance - Jul 19, 2019.
All you need to know about data pre-processing, and how to build and optimize a regression model using the Backward Elimination method in Python.
- How do you teach physics to machine learning models? - May 21, 2019.
How to integrate physics-based models (math-based methods that explain the world around us) into machine learning models to reduce their computational complexity.
- Comparing Machine Learning Models: Statistical vs. Practical Significance - Jan 18, 2019.
Is model A or B more accurate? Hmm… In this blog post, I’d love to share my recent findings on model comparison.
- 5 Machine Learning Projects You Should Not Overlook, June 2018 - Jun 12, 2018.
Here is a new installment of 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out!
- Train your Deep Learning Faster: FreezeOut - Aug 3, 2017.
We explain another novel method for much faster training of Deep Learning models by freezing the intermediate layers, and show that it has little or no effect on accuracy.
- Train your Deep Learning model faster and sharper: Snapshot Ensembling — M models for the cost of 1 - Aug 2, 2017.
We explain a novel Snapshot Ensembling method for increasing accuracy of Deep Learning models while also reducing training time.
- The Top Predictive Analytics Pitfalls to Avoid - Jan 23, 2017.
Predictive modelling and machine learning are significantly contributing to business, but they can be very sensitive to data and changes in it, which makes it very important to use proper techniques and avoid pitfalls in building data science models.
- Sound Data Science: Avoiding the Most Pernicious Prediction Pitfall - Jan 5, 2017.
Data science and predictive analytics can provide huge value, but they can mislead and backfire if not used with fail-safe measures. The author gives examples of such problems and provides guidelines to avoid them.
- Continuous improvement for IoT through AI / Continuous learning - Nov 25, 2016.
In reality, especially for IoT, an analytics model does not keep delivering results with the same accuracy forever once it is built. Data patterns change over time, which makes it essential to learn from new data and improve or recalibrate the models to get correct results. This article explains this phenomenon of continuous improvement in analytics for IoT.
- Understanding the Bias-Variance Tradeoff: An Overview - Aug 8, 2016.
A model's ability to minimize bias and its ability to minimize variance are often thought of as two opposing ends of a spectrum. Understanding these two types of error is critical to diagnosing model results.
- How to Compute the Statistical Significance of Two Classifiers Performance Difference - Mar 30, 2016.
To determine whether a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect at least as extreme as the one measured, given that the null hypothesis is true. Here we demonstrate how to use it to assess the performance difference between two models.
- Big Idea To Avoid Overfitting: Reusable Holdout to Preserve Validity in Adaptive Data Analysis - Aug 17, 2015.
Big Data makes it all too easy to find spurious "patterns" in data. A new approach helps avoid overfitting by using 2 key ideas: validation should not reveal any information about the holdout data, and a small amount of noise should be added to any validation result.
- Overcoming Overfitting with the reusable holdout: Preserving validity in adaptive data analysis - Aug 12, 2015.
Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. We demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis.
- How to Lead a Data Science Contest without Reading the Data - May 17, 2015.
We examine a “wacky” boosting method that lets you climb the public leaderboard without even looking at the data. But there is a catch, so read on before trying to win Kaggle competitions with this approach.
- Failing Optimally – Data Science’s Measurement Problem - Mar 4, 2015.
Data science has a measurement problem. Simple metrics may not address complex situations. But complex metrics present myriad problems.
- Interpreting Model Performance with Cost Functions - Jan 13, 2014.
Cost functions are critical for the correct assessment of performance of data mining and predictive models. This series goes deep into the statistical properties and mathematical understanding of each cost function and explores their similarities and differences.