- Essential Math for Data Science: Integrals And Area Under The Curve - Nov 25, 2020.
In this article, you’ll learn about integrals and the area under the curve using the practical data science example of the area under the ROC curve used to compare the performances of two machine learning models.
- KDnuggets™ News 20:n04, Jan 29: AutoML: If you try it, you’ll like it more; The Data Science Interview Study Guide - Jan 29, 2020.
AutoML Poll results: if you try it, you'll like it more; The Data Science Interview Study Guide; What Do Data Scientists in Europe Do & How Much Are They Worth?; 2 Questions for a Junior Data Scientist
- The 5 Most Useful Techniques to Handle Imbalanced Datasets - Jan 22, 2020.
This post explains the various techniques you can use to handle imbalanced datasets.
- Classify A Rare Event Using 5 Machine Learning Algorithms - Jan 15, 2020.
Which algorithm works best for unbalanced data? Are there any tradeoffs?
- Pro Tips: How to deal with Class Imbalance and Missing Labels - Nov 20, 2019.
Your spectacularly-performing machine learning model could be subject to the common culprits of class imbalance and missing labels. Learn how to handle these challenges with techniques that remain open areas of new research for addressing real-world machine learning problems.
- KDnuggets™ News 19:n19, May 15: Data Scientist – Best Job of the Year!; How (not) to use Machine Learning for time series forecasting - May 15, 2019.
"Please, explain." Interpretability of machine learning models; How to fix an Unbalanced Dataset; Data Science Poem; Customer Churn Prediction Using Machine Learning; A Complete Exploratory Data Analysis and Visualization for Text
- How to fix an Unbalanced Dataset - May 8, 2019.
We explain several alternative ways to handle imbalanced datasets, including different resampling and ensembling methods with code examples.
- Three techniques to improve machine learning model performance with imbalanced datasets - Jun 5, 2018.
The primary objective of this project was to handle the data imbalance issue. In the following subsections, I describe three techniques I used to overcome the data imbalance problem.
- Applying Deep Learning to Real-world Problems - Jun 30, 2017.
In this blog post, I share three learnings that are important to us at Merantix when applying deep learning to real-world problems. I hope that these ideas are helpful for other people who plan to use deep learning in their business.
- KDnuggets™ News 17:n22, Jun 7: 7 Steps to Mastering Data Preparation with Python; Why Does Deep Learning Not Have a Local Minimum? - Jun 7, 2017.
7 Steps to Mastering Data Preparation with Python; Why Does Deep Learning Not Have a Local Minimum?; 7 Techniques to Handle Imbalanced Data; Which Machine Learning Algorithm Should I Use?; Is Regression Analysis Really Machine Learning?
- 7 Techniques to Handle Imbalanced Data - Jun 1, 2017.
This blog post introduces seven techniques that are commonly applied in domains like intrusion detection or real-time bidding, because the datasets are often extremely imbalanced.
- The Best Metric to Measure Accuracy of Classification Models - Dec 7, 2016.
Measuring the accuracy of a model for a classification problem (categorical output) is complex and time-consuming compared to regression problems (continuous output). Let’s look at the key testing metrics, with an example, for a classification problem.
- KDnuggets™ News 16:n16, May 4: How to Remove Duplicates from Large Data; Datasets over Algorithms; When Automation goes too far - May 4, 2016.
How to Remove Duplicates in Large Datasets; The Development of Classification as a Learning Machine; Datasets Over Algorithms; Cartoon: When Automation Goes Too Far, and more.
- Dealing with Unbalanced Classes, SVMs, Random Forests®, and Decision Trees in Python - Apr 29, 2016.
An overview of dealing with unbalanced classes, and implementing SVMs, Random Forests, and Decision Trees in Python.