- Alternative Feature Selection Methods in Machine Learning - Dec 24, 2021.
Feature selection methodologies go beyond filter, wrapper and embedded methods. In this article, I describe 3 alternative algorithms to select predictive features based on a feature importance score.
Data Preparation, Feature Selection, Machine Learning, Python
- Feature Selection: Where Science Meets Art - Dec 14, 2021.
From heuristic to algorithmic feature selection techniques for data science projects.
Data Preprocessing, Feature Selection, Machine Learning, Statistics
- Be Wary of Automated Feature Selection — Chi Square Test of Independence Example - Aug 5, 2021.
When data scientists use the chi-square test for feature selection, they merely follow the ritualistic “If your p-value is low, the null hypothesis must go”. The automated function they use behaves no differently.
Automated Data Science, Automated Machine Learning, Feature Selection, Statistics
- From Scratch: Permutation Feature Importance for ML Interpretability - Jun 30, 2021.
Use permutation feature importance to discover which features in your dataset are useful for prediction — implemented from scratch in Python.
Feature Selection, Interpretability, Machine Learning, Python
- Feature Selection – All You Ever Wanted To Know - Jun 10, 2021.
Although your data set may contain a lot of information about many different features, selecting only the "best" of these to be considered by a machine learning model can mean the difference between a model that performs well--with better performance, higher accuracy, and more computational efficiency--and one that falls flat. The process of feature selection guides you toward working with only the data that may be the most meaningful, and to accomplish this, a variety of feature selection types, methodologies, and techniques exist for you to explore.
Feature Engineering, Feature Selection, Machine Learning
- This Data Visualization is the First Step for Effective Feature Selection - Jun 8, 2021.
Understanding the most important features to use is crucial for developing a model that performs well. Knowing which features to consider requires experimentation, and proper visualization of your data can help clarify your initial selections. The scatter pairplot is a great place to start.
Data Visualization, Feature Selection, Statistics, Stocks
- What makes a song popular? Analyzing Top Songs on Spotify - Apr 16, 2021.
With so many great (and not-so-great) songs out there, it can be hard to find those that match your musical preferences. Follow along with this ML model-building project to explore the extensive song data available on Spotify and design a recommendation engine that could help you discover your next favorite artist!
Beatles, Data Analysis, Data Exploration, Feature Selection, Music, Spotify
- Why Automated Feature Selection Has Its Risks - Apr 13, 2021.
Theoretical relevance of features must not be ignored.
Automation, Data Science, Feature Selection
- 4 Machine Learning Concepts I Wish I Knew When I Built My First Model - Mar 9, 2021.
Diving into building your first machine learning model will be an adventure -- one in which you will learn many important lessons the hard way. However, by following these four tips, your first and subsequent models will be put on a path toward excellence.
Feature Selection, Gradio, Hyperparameter, Machine Learning, Metrics, Python
- Feature Ranking with Recursive Feature Elimination in Scikit-Learn - Oct 19, 2020.
This article covers using scikit-learn to obtain the optimal number of features for your machine learning project.
Feature Selection, Machine Learning, Python, scikit-learn
- How I Consistently Improve My Machine Learning Models From 80% to Over 90% Accuracy - Sep 23, 2020.
Data science work typically requires a big lift near the end to increase the accuracy of any model developed. These five recommendations will help improve your machine learning models and help your projects reach their target goals.
Accuracy, Ensemble Methods, Feature Engineering, Feature Selection, Hyperparameter, Machine Learning, Missing Values, Tips
- Getting Started with Feature Selection - Aug 25, 2020.
For machine learning, more data is always better. What about more features of data? Not necessarily. This beginners' guide with code examples for selecting the most useful features from your data will jump start you toward developing the most effective and efficient learning models.
Beginners, Data Preparation, Feature Selection
- The Architecture Used at LinkedIn to Improve Feature Management in Machine Learning Models - May 11, 2020.
The new typed feature schema streamlined the reusability of features across thousands of machine learning models.
Feature Engineering, Feature Selection, LinkedIn, Machine Learning
- Interpretability: Cracking open the black box, Part 2 - Dec 11, 2019.
The second part in a series on leveraging techniques to take a look inside the black box of AI, this guide considers post-hoc interpretation that is useful when the model is not transparent.
Explainability, Explainable AI, Feature Selection, Interpretability, Python
- 5 Great New Features in Latest Scikit-learn Release - Dec 10, 2019.
From not sweating missing values, to determining feature importance for any estimator, to support for stacking, and a new plotting API, here are 5 new features of the latest release of Scikit-learn which deserve your attention.
Data Preparation, Data Preprocessing, Ensemble Methods, Feature Selection, Gradient Boosting, K-nearest neighbors, Machine Learning, Missing Values, Python, scikit-learn, Visualization
- An Eight-Step Checklist for An Analytics Project - Nov 6, 2019.
Follow these eight headings of an audit sheet that business analysts should address before submitting the results of their analytics project. One recommended approach is to rewrite each step as a question, answer it, and then attach it to your project.
Analytics, Checklist, Deployment, Feature Selection, Statistics
- KDnuggets™ News 19:n41, Oct 30: Feature Selection: Beyond feature importance?; Time Series Analysis Using KNIME and Spark - Oct 30, 2019.
This week in KDnuggets: Feature Selection: Beyond feature importance?; Time Series Analysis: A Simple Example with KNIME and Spark; 5 Advanced Features of Pandas and How to Use Them; How to Measure Foot Traffic Using Data Analytics; Introduction to Natural Language Processing (NLP); and much, much more!
Apache Spark, Data Analytics, Feature Selection, Knime, NLP, Pandas, Python, scikit-learn, Time Series
- Feature Selection: Beyond feature importance? - Oct 24, 2019.
In this post, you will see three different techniques for performing feature selection on your datasets and learn how to build an effective predictive model.
Feature Selection, Machine Learning
- Proptech and the proper use of technology for house sales prediction - Aug 22, 2019.
Using the ATTOM dataset, we extracted data on sales transactions in the USA, loans, and estimated values of property. We developed an optimal prediction model from correlations in the time and status of ownership as well as the time of the year of sales fluctuations.
Feature Selection, Predictive Analytics, Real Estate
- Feature selection by random search in Python - Aug 6, 2019.
Feature selection is one of the most important tasks in machine learning. Learn how to use a simple random search in Python to get good results in less time.
Collinearity, Cross-validation, Feature Selection, Python, Random
- Opening Black Boxes: How to leverage Explainable Machine Learning - Aug 1, 2019.
A machine learning model that predicts some outcome provides value. One that explains why it made the prediction creates even more value for your stakeholders. Learn how Interpretable and Explainable ML technologies can help while developing your model.
Explainable AI, Feature Selection, LIME, Machine Learning, SHAP, XAI
- The Hitchhiker’s Guide to Feature Extraction - Jun 3, 2019.
Check out this collection of tricks and code for Kaggle and everyday work.
Feature Engineering, Feature Extraction, Feature Selection, Kaggle, Python
- 7 Steps to Mastering Intermediate Machine Learning with Python — 2019 Edition - Jun 3, 2019.
This is the second part of this new learning path series for mastering machine learning with Python. Check out these 7 steps to help master intermediate machine learning with Python!
7 Steps, Classification, Cross-validation, Dimensionality Reduction, Feature Engineering, Feature Selection, Image Classification, K-nearest neighbors, Machine Learning, Modeling, Naive Bayes, numpy, Pandas, PCA, Python, scikit-learn, Transfer Learning
- A Quick Guide to Feature Engineering - Feb 11, 2019.
Feature engineering plays a key role in machine learning, data mining, and data analytics. This article provides a general definition for feature engineering, together with an overview of the major issues, approaches, and challenges of the field.
Feature Engineering, Feature Extraction, Feature Selection
- Implementing Automated Machine Learning Systems with Open Source Tools - Oct 25, 2018.
What if you want to implement an automated machine learning pipeline of your very own, or automate particular aspects of a machine learning pipeline? Rest assured that there is no need to reinvent any wheels.
Automated Machine Learning, Feature Engineering, Feature Selection, Hyperparameter, Machine Learning, Open Source
- Step Forward Feature Selection: A Practical Example in Python - Jun 18, 2018.
When it comes to disciplined approaches to feature selection, wrapper methods are those that marry the feature selection process to the type of model being built, evaluating feature subsets to compare the model performance they yield and then selecting the best-performing subset.
Feature Selection, Machine Learning, Python
- How (dis)similar are my train and test data? - Jun 7, 2018.
This article examines a scenario where your machine learning model can fail.
Data Science, Datasets, Feature Selection, Machine Learning, Training Data
- Multi-objective Optimization for Feature Selection - Dec 5, 2017.
By having the model analyze the important signals, we can focus on the right set of attributes for optimization. As a side effect, fewer attributes also mean that you can train your models faster, making them less complex and easier to understand.
Feature Selection, RapidMiner
- Evolutionary Algorithms for Feature Selection - Nov 29, 2017.
Feature selection is a very important technique in machine learning. In this post we discuss one of the most common optimization algorithms for multi-modal fitness landscapes - evolutionary algorithms.
Evolutionary Algorithm, Feature Selection, RapidMiner
- Automated Feature Engineering for Time Series Data - Nov 20, 2017.
We introduce a general framework for developing time series models, generating features and preprocessing the data, and exploring the potential to automate this process in order to apply advanced machine learning algorithms to almost any time series problem.
Automated Machine Learning, Data Preparation, Feature Engineering, Feature Selection, Time Series
- Basic Concepts of Feature Selection - Nov 15, 2017.
Feature selection is a key part of data science, but is it still relevant in the age of support vector machines (SVMs) and deep learning? Yes, absolutely. We explain why.
Feature Engineering, Feature Selection, RapidMiner
- KDnuggets™ News 17:n23, Jun 14: The Practice of Machine Learning, Data Science Implementation, and Feature Selection - Jun 14, 2017.
A Practical Guide to Machine Learning; Your Checklist to Get Data Science Implemented in Production; The Practical Importance of Feature Selection; Machine Learning in Real Life: Tales from the Trenches.
Feature Selection, Machine Learning, Production, Python
- The Practical Importance of Feature Selection - Jun 12, 2017.
Feature selection is useful on a variety of fronts: it is the best weapon against the Curse of Dimensionality; it can reduce overall training times; and it is a powerful defense against overfitting, increasing generalizability.
Feature Selection, Machine Learning, Rubens Zimbres
- Must-Know: Why it may be better to have fewer predictors in Machine Learning models? - Apr 4, 2017.
There are a few reasons why it might be a better idea to have fewer predictor variables rather than having many of them. Read on to find out more.
Feature Selection, Interview Questions, Machine Learning, Modeling
- Kanri Distance Calculator™ – patented solution applying power of Big Data to an Individual - Mar 21, 2017.
Kanri's combination of patented statistical and process methods provides a powerful ability to evaluate large data, telling users the exact distance from target and the variable contributions for each participant. Free trial and 88% KDnuggets discount for the first 100 buyers.
Feature Selection, Kanri, Metrics
- 17 More Must-Know Data Science Interview Questions and Answers, Part 2 - Feb 22, 2017.
The second part of 17 new must-know Data Science Interview questions and answers covers overfitting, ensemble methods, feature selection, ground truth in unsupervised learning, the curse of dimensionality, and parallel algorithms.
Algorithms, Data Science, Ensemble Methods, Feature Engineering, Feature Selection, High-dimensional, Interview Questions, Overfitting, Unsupervised Learning
- Identifying Variables That Might Be Better Predictors - Feb 2, 2017.
This blog serves to expand on the approach that the data science team uses to identify (and quantify) which variables and metrics are better predictors of performance.
Data Science, Feature Selection, Prediction, Predictive Analytics
- Data Analytics Models in Quantitative Finance and Risk Management - Dec 13, 2016.
We review how key data science algorithms, such as regression, feature selection, and Monte Carlo, are used in financial instrument pricing and risk management.
Data Analytics, Feature Selection, Finance, Regression, Risk Modeling
- Clustering Key Terms, Explained - Oct 18, 2016.
Getting started with Data Science or need a refresher? Clustering is among the most used tools of Data Scientists. Check out these 10 Clustering-related terms and their concise definitions.
Clustering, Explained, Feature Selection, K-means, Key Terms
- Data Mining Tip: How to Use High-cardinality Attributes in a Predictive Model - Aug 29, 2016.
High-cardinality nominal attributes can be difficult to include in predictive models. There are, however, a few ways to include them, which are put forward here.
Feature Engineering, Feature Selection, Predictive Models
- MDL Clustering: Unsupervised Attribute Ranking, Discretization, and Clustering - Aug 26, 2016.
MDL Clustering is a free software suite for unsupervised attribute ranking, discretization, and clustering based on the Minimum Description Length principle and built on the Weka Data Mining platform.
Clustering, Feature Selection, Java, Unsupervised Learning, Weka
- Approaching (Almost) Any Machine Learning Problem - Aug 18, 2016.
If you're looking for an overview of how to approach (almost) any machine learning problem, this is a good place to start. Read on as a Kaggle competition veteran shares his pipelines and approach to problem-solving.
Advice, Feature Selection, Kaggle, Machine Learning, Modeling
- Contest 2nd Place: Automating Data Science - Aug 3, 2016.
This post discusses some considerations, options, and opportunities for automating aspects of data science and machine learning. It is the second place recipient (tied) in the recent KDnuggets blog contest.
Algorithms, Automated, Automated Data Science, Feature Selection, Machine Learning
- And the Winner is… Stepwise Regression - Aug 1, 2016.
This post evaluates several methods for automating the feature selection process in large-scale linear regression models and shows that, for marketing applications, the winner is stepwise regression.
Automated Data Science, Feature Selection, Linear Regression, Machine Learning, Predictive Analytics
- Nutrition & Principal Component Analysis: A Tutorial - Jun 16, 2016.
A great overview of Principal Component Analysis (PCA), with an example application in the field of nutrition.
Algobeans, Feature Selection, Food, Nutrition, PCA
- Data Science of Variable Selection: A Review - Jun 7, 2016.
There are as many approaches to selecting features as there are statisticians since every statistician and their sibling has a POV or a paper on the subject. This is an overview of some of these approaches.
Algorithms, Big Data, Feature Selection, Statistics
- scikit-feature: Open-Source Feature Selection Repository in Python - Mar 3, 2016.
scikit-feature is an open-source feature selection repository in Python, with around 40 popular algorithms from feature selection research. It is developed by the Data Mining and Machine Learning Lab at Arizona State University.
Data Mining, Data Science, Feature Extraction, Feature Selection, Machine Learning, Python