# Regression (92)

**11 Most Practical Data Science Skills for 2022** - Oct 19, 2021.

While the field of data science continues to evolve with exciting new progress in analytical approaches and machine learning, there remains a core set of skills that are foundational for all general practitioners and specialists, especially those who want to be employable with full-stack capabilities.

**How causal inference lifts augmented analytics beyond flatland** - Aug 27, 2021.

In our quest to better understand and predict business outcomes, traditional predictive modeling tends to fall flat. However, causal inference techniques along with business analytics approaches can unravel what truly changes your KPIs.

**30 Most Asked Machine Learning Questions Answered** - Aug 3, 2021.

There is always a lot to learn in machine learning. Whether you are new to the field or a seasoned practitioner ready for a refresher, understanding these key concepts will keep your skills honed in the right direction.

**Time Series Forecasting with PyCaret Regression Module** - Apr 21, 2021.

PyCaret is an alternative low-code library that can replace hundreds of lines of code with just a few. See how to use PyCaret's Regression Module for Time Series Forecasting.

**Data Science 101: Normalization, Standardization, and Regularization** - Apr 20, 2021.

Normalization, standardization, and regularization all sound similar. However, each plays a unique role in your data preparation and model building process, so you must know when and how to use these important procedures.

**Deep Learning Is Becoming Overused** - Mar 29, 2021.

Understanding the data is the first port of call.

**Metric Matters, Part 2: Evaluating Regression Models** - Mar 23, 2021.

In this second part of a review of the many options available for choosing metrics to evaluate machine learning models, learn how to select the most appropriate metric for your analysis of regression models.

**Learning from machine learning mistakes** - Mar 19, 2021.

Read this article and discover how to find the weak spots of a regression model.

**All Machine Learning Algorithms You Should Know in 2021** - Jan 4, 2021.

Many machine learning algorithms exist that range from simple to complex in their approach, and together provide a powerful library of tools for analyzing and predicting patterns from data. If you are learning for the first time or reviewing techniques, then these intuitive explanations of the most popular machine learning models will help you kick off the new year with confidence.

**Simple & Intuitive Ensemble Learning in R** - Dec 2, 2020.

Read about metaEnsembleR, a fully automated R package for heterogeneous ensemble meta-learning (classification and regression).

**Simple Python Package for Comparing, Plotting & Evaluating Regression Models** - Nov 25, 2020.

This package aims to help users plot evaluation-metric graphs for widely used regression model metrics with a single line of code, so the models can be compared at a glance. It also significantly lowers the barrier for practitioners to evaluate different machine learning algorithms, even in an amateur fashion, by applying them to their everyday predictive regression problems.

**Learn to build an end to end data science project** - Nov 11, 2020.

Appreciating the process you must work through for any Data Science project is valuable before you land your first job in this field. With a well-honed strategy, such as the one outlined in this example project, you will remain productive and consistently deliver valuable machine learning models.

**How to Explain Key Machine Learning Algorithms at an Interview** - Oct 19, 2020.

While preparing for interviews in Data Science, it is essential to clearly understand a range of machine learning models -- with a concise explanation for each at the ready. Here, we summarize various machine learning models by highlighting the main points to help you communicate complex models.

**R squared Does Not Measure Predictive Capacity or Statistical Adequacy** - Jul 31, 2020.

The fact that R-squared shouldn't be used for deciding if you have an adequate model is counter-intuitive and is rarely explained clearly. This demonstration gives an overview of how R-squared goodness-of-fit works in regression analysis and correlations, while showing why it is not a measure of statistical adequacy and so should not suggest anything about future predictive performance.

**A Complete Guide To Survival Analysis In Python, part 3** - Jul 30, 2020.

Concluding this three-part series covering a step-by-step review of statistical survival analysis, we look at a detailed example implementing the Kaplan-Meier fitter based on different groups, a Log-Rank test, and Cox Regression, all with examples and shared code.

**Model Evaluation Metrics in Machine Learning** - May 28, 2020.

A detailed explanation of model evaluation metrics to evaluate a classification machine learning model.

**5 Great New Features in Scikit-learn 0.23** - May 15, 2020.

Check out 5 new features of the latest Scikit-learn release, including the ability to visualize estimators in notebooks, improvements to both k-means and gradient boosting, some new linear model implementations, and sample weight support for a pair of existing regressors.

**Beginners Guide to the Three Types of Machine Learning** - Nov 13, 2019.

The following article is an introduction to classification and regression (known as supervised learning) and to unsupervised learning (which in the context of machine learning applications often refers to clustering), and includes a walkthrough in the popular Python library scikit-learn.

**Designing Your Neural Networks** - Nov 4, 2019.

Check out this step-by-step walkthrough of some of the more confusing aspects of neural nets to guide you to making smart decisions about your neural network architecture.

**How Bad is Multicollinearity?** - Sep 17, 2019.

For some people anything below 60% is acceptable, while for certain others even a correlation of 30% to 40% is considered too high, because one variable may just end up exaggerating the performance of the model or completely messing up parameter estimates.

**From Data Pre-processing to Optimizing a Regression Model Performance** - Jul 19, 2019.

All you need to know about data pre-processing, and how to build and optimize a regression model using the Backward Elimination method in Python.

**How do you check the quality of your regression model in Python?** - Jul 2, 2019.

Linear regression is rooted strongly in the field of statistical learning and therefore the model must be checked for the ‘goodness of fit’. This article shows you the essential steps of this task in a Python ecosystem.

**Separating signal from noise** - Jun 4, 2019.

When we are building a model, we are making the assumption that our data has two parts, signal and noise. Signal is the real pattern, the repeatable process that we hope to capture and describe. The noise is everything else that gets in the way of that.

**Choosing Between Model Candidates** - May 29, 2019.

Models are useful because they allow us to generalize from one situation to another. When we use a model, we’re working under the assumption that there is some underlying pattern we want to measure, but it has some error on top of it.

**Top Data Science and Machine Learning Methods Used in 2018, 2019** - Apr 29, 2019.

Once again, the most used methods are Regression, Clustering, Visualization, Decision Trees/Rules, and Random Forests. The greatest relative increases this year are overwhelmingly Deep Learning techniques, while SVD, SVMs and Association Rules show the greatest decline.

**7 Steps to Mastering Basic Machine Learning with Python — 2019 Edition** - Jan 29, 2019.

With a new year upon us, I thought it would be a good time to revisit the concept and put together a new learning path for mastering machine learning with Python. With these 7 steps you can master basic machine learning with Python!

**KDnuggets™ News 18:n30, Aug 8: Iconic Data Visualisation; Data Scientist Interviews Demystified; Simple Statistics in Python** - Aug 8, 2018.

Also: Selecting the Best Machine Learning Algorithm for Your Regression Problem; From Data to Viz: how to select the right chart for your data; Only Numpy: Implementing GANs and Adam Optimizer using Numpy; Programming Best Practices for Data Science.

**Autoregressive Models in TensorFlow** - Aug 6, 2018.

This article investigates autoregressive models in TensorFlow, including autoregressive time series and predictions with the actual observations.

**Selecting the Best Machine Learning Algorithm for Your Regression Problem** - Aug 1, 2018.

This post should then serve as a great aid in selecting the best ML algorithm for your regression problem!

**Deep Quantile Regression** - Jul 3, 2018.

Most Deep Learning frameworks currently focus on giving a best estimate as defined by a loss function. Occasionally something beyond a point estimate is required to make a decision. This is where a distribution would be useful. This article focuses purely on inferring quantiles.

**Data Science Predicting The Future** - Jun 19, 2018.

In this article we will expand on the knowledge learnt from the last article - The What, Where and How of Data for Data Science - and consider how data science is applied to predict the future.

**Choosing the Right Metric for Evaluating Machine Learning Models – Part 1** - Apr 27, 2018.

Each machine learning model is trying to solve a problem with a different objective using a different dataset, and hence it is important to understand the context before choosing a metric.

**Ten Machine Learning Algorithms You Should Know to Become a Data Scientist** - Apr 11, 2018.

It's important for data scientists to have a broad range of knowledge, keeping themselves updated with the latest trends. With that being said, we take a look at the top 10 machine learning algorithms every data scientist should know.

**Logistic Regression: A Concise Technical Overview** - Feb 16, 2018.

Interested in learning the concepts behind Logistic Regression (LogR)? Looking for a concise introduction to LogR? This article is for you. Includes a Python implementation and links to an R script as well.

**Which Machine Learning Algorithm be used in year 2118?** - Feb 9, 2018.

So what were the answers popping into your head? Random forest, SVM, K-means, KNN or even Deep Learning? No, for the answer, we turn to the Lindy Effect.

**Topological Data Analysis for Data Professionals: Beyond Ayasdi** - Jan 16, 2018.

We review recent developments and tools in topological data analysis, including applications of persistent homology to psychometrics and a recent extension of piecewise regression, called Morse-Smale regression.

**Top KDnuggets tweets, Dec 06-12: Top #DataScience and #MachineLearning Methods Used in 2017; Geoff Hinton Capsule Networks – a new way for machines to see** - Dec 13, 2017.

Also: The first international #beauty contest decided by #AI #algorithm sparked controversy; 4 Common #Data Fallacies That You Need To Know; Using #DeepLearning to Solve Real World Problems; Best Online Masters in #DataScience and #Analytics.

**Top Data Science and Machine Learning Methods Used in 2017** - Dec 11, 2017.

The most used methods are Regression, Clustering, Visualization, Decision Trees/Rules, and Random Forests; Deep Learning is used by only 20% of respondents; we also analyze which methods are most "industrial" and most "academic".

**5 Tricks When A/B Testing Is Off The Table** - Dec 8, 2017.

Sometimes you cannot do A/B testing, but that does not mean you have to fly blind: there is a range of econometric methods that can illuminate the causal relationships at play.

**How Bayesian Networks Are Superior in Understanding Effects of Variables** - Nov 9, 2017.

Bayes Nets have remarkable properties that make them better than many traditional methods in determining variables’ effects. This article explains the principal advantages.

**3 different types of machine learning** - Nov 1, 2017.

In this extract from “Python Machine Learning”, top data scientist Sebastian Raschka explains the 3 main types of machine learning: Supervised, Unsupervised and Reinforcement Learning. Use code PML250KDN to save 50% off the book cost.

**Top 6 errors novice machine learning engineers make** - Oct 30, 2017.

What common mistakes do beginners make when working on machine learning or data science projects? Here we present a list of the most common errors.

**It Only Takes One Line of Code to Run Regression** - Oct 19, 2017.

I learned how important it is to understand data before running algorithms, how important it is to know the context and the industry before jumping on getting insights, how it is very easy to make models but tough to get them to work for you, and finally, how it only takes one line of code to run linear regression on your dataset.

**Learn Generalized Linear Models (GLM) using R** - Oct 11, 2017.

In this article, we aim to discuss various GLMs that are widely used in the industry. We focus on: a) log-linear regression, b) interpreting log-transformations, and c) binary logistic regression.

**Learn How to Make Machine Learning Work (webinars every Tue in October, Live or on-demand)** - Sep 28, 2017.

To fully use machine learning, we first need to understand both the potential benefits and the techniques to create data-driven models. In this webinar series, we will show you how to easily and automatically apply complex algorithms to data in real world applications.

**KDnuggets™ News 17:n22, Jun 7: 7 Steps to Mastering Data Preparation with Python; Why Does Deep Learning Not Have a Local Minimum?** - Jun 7, 2017.

7 Steps to Mastering Data Preparation with Python; Why Does Deep Learning Not Have a Local Minimum?; 7 Techniques to Handle Imbalanced Data; Which Machine Learning Algorithm Should I Use?; Is Regression Analysis Really Machine Learning?

**Is Regression Analysis Really Machine Learning?** - Jun 5, 2017.

What separates "traditional" applied statistics from machine learning? Is statistics the foundation on top of which machine learning is built? Is machine learning a superset of "traditional" statistics? Do these 2 concepts have a third unifying concept in common? So, in that vein... is regression analysis actually a form of machine learning?

**Machine Learning Crash Course: Part 1** - May 24, 2017.

This post, the first in a series of ML tutorials, aims to make machine learning accessible to anyone willing to learn. We’ve designed it to give you a solid understanding of how ML algorithms work as well as provide you with the knowledge to harness them in your projects.

**The Data Science of Steel, or Data Factory to Help Steel Factory** - Apr 25, 2017.

Applying Machine Learning to steel production is really hard! Here are some lessons from Yandex researchers on how to balance the need for findings to be accurate, useful, and understandable at the same time.

**Building Regression Models in R using Support Vector Regression** - Mar 8, 2017.

The article studies the advantage of Support Vector Regression (SVR) over Simple Linear Regression (SLR) models for predicting real values, using the same basic idea as Support Vector Machines (SVM) use for classification.

**The Gentlest Introduction to Tensorflow – Part 3** - Feb 21, 2017.

This post is the third entry in a series dedicated to introducing newcomers to TensorFlow in the gentlest possible manner. This entry progresses to multi-feature linear regression.

**Webinar: Improve Your Regression with CART and Gradient Boosting, Feb 16** - Feb 13, 2017.

Learn about a powerful tree-based machine learning algorithm called gradient boosting, which often outperforms linear regression, Random Forests, and CART.

**Regression Analysis: A Primer** - Feb 6, 2017.

Despite its popularity, Regression is also misunderstood. Why? The answer might surprise you: There is no such thing as Regression. Rather, there are a large number of statistical methods that are called Regression, all of which are based on a shared statistical foundation.

**Top /r/MachineLearning Posts, December: OpenAI Universe; Deep Learning MOOC For Coders; Musk: Tesla Gets Awesome-er** - Jan 5, 2017.

OpenAI Universe; Deep Learning For Coders—18 hours of lessons for free; Elon Musk on Twitter: Tesla Autopilot vision neural net now working well; Apple to Start Publishing AI Research; Duolingo's "half-life regression" method for modeling human memory.

**Data Science Basics: What Types of Patterns Can Be Mined From Data?** - Dec 14, 2016.

Why do we mine data? This post is an overview of the types of patterns that can be gleaned from data mining, and some real world examples of said patterns.

**Data Analytics Models in Quantitative Finance and Risk Management** - Dec 13, 2016.

We review how key data science algorithms, such as regression, feature selection, and Monte Carlo, are used in financial instrument pricing and risk management.

**The Great Algorithm Tutorial Roundup** - Sep 20, 2016.

This is a collection of tutorials relating to the results of the recent KDnuggets algorithms poll. If you are interested in learning or brushing up on the most used algorithms, as per our readers, look here for suggestions on doing so!

**Top Algorithms and Methods Used by Data Scientists** - Sep 12, 2016.

The latest KDnuggets poll identifies the list of top algorithms actually used by Data Scientists, and finds surprises including the most academic and most industry-oriented algorithms.

**A Primer on Logistic Regression – Part I** - Aug 24, 2016.

Gain an understanding of logistic regression - what it is, and when and how to use it - in this post.

**A Neat Trick to Increase Robustness of Regression Models** - Aug 22, 2016.

Read this take on the validity of choosing a different approach to regression modeling. Why isn't the L1 norm used more often?

**What Statistics Topics are Needed for Excelling at Data Science?** - Aug 2, 2016.

Here is a list of skills and statistical concepts suggested for excelling at data science, roughly in order of increasing complexity.

**Improve Your Regression with Modern Regression Analysis Techniques, July 27, Aug 10 Webinars** - Jul 22, 2016.

This two-part webinar will help you improve your regression using modern regression analysis techniques. July 27 (part 1) and August 10 (part 2).

**A Brief Primer on Linear Regression – Part III** - Jul 5, 2016.

This third part of an introduction to linear regression moves past the topics covered in the first to discuss linearity, normality, outliers, and other topics of interest.

**What is Softmax Regression and How is it Related to Logistic Regression?** - Jul 1, 2016.

An informative exploration of softmax regression and its relationship with logistic regression, and situations in which each would be applicable.

**Regularization in Logistic Regression: Better Fit and Better Generalization?** - Jun 24, 2016.

A discussion on regularization in logistic regression, and how its usage plays into better model fit and generalization.

**A Brief Primer on Linear Regression – Part 2** - Jun 13, 2016.

This second part of an introduction to linear regression moves past the topics covered in the first to discuss linearity, normality, outliers, and other topics of interest.

**A Brief Primer on Linear Regression – Part 1** - Jun 6, 2016.

This introduction to linear regression discusses a simple linear regression model with one predictor variable, and then extends it to the multiple linear regression model with at least two predictors.

**Regression & Correlation for Military Promotion: A Tutorial** - Apr 13, 2016.

A clear and well-written tutorial covering the concepts of regression and correlation, focusing on military commander promotion as a use case.

**Salford Predictive Modeler 8: Faster. More Machine Learning. Better results** - Apr 4, 2016.

Take a giant step forward with SPM 8: download the just-released version 8, try it for yourself, and get better results.

**New Salford Predictive Modeler 8** - Mar 1, 2016.

Salford Predictive Modeler software suite: Faster. More Comprehensive Machine Learning. More Automation. Better results. Take a giant step forward in your data science productivity with SPM 8. Download and try it today!

**Top KDnuggets tweets, Feb 15-21: Is Big Data Still a Thing? 10 types of #regression. Which one to use?** - Feb 24, 2016.

10 types of #regression. Which one to use? Is Big Data Still a Thing? 2016 #BigData Landscape; Demystifying #DeepReinforcement Learning; #TextMining #SouthPark.

**Jan 27 Webinar: 3 Ways to Improve your Regression, Part 2** - Jan 26, 2016.

How to take data science techniques even further to extract actionable insight and take advantage of advanced modeling features. You will walk away with several different methods to turn your ordinary regression into an extraordinary regression!

**3 Ways to Improve your Regression, Jan 20 & 27 Webinars, Hands-on** - Jan 12, 2016.

Instead of proceeding with a mediocre analysis, join us for this 2-part webinar series. We will show you how modern algorithms can take your regression model to the next level and expertly handle your modeling woes.

**What questions can data science answer?** - Jan 1, 2016.

There are only five questions machine learning can answer: Is this A or B? Is this weird? How much/how many? How is it organized? What should I do next? We examine these questions in detail and what they imply for data science.

**Statistical Learning and Data Mining: 10 Hot Ideas for Learning from Data, NYC, Oct 8-9** - Aug 27, 2015.

Taught by top Stanford professors and leading statisticians Trevor Hastie and Robert Tibshirani, this course presents 10 hot ideas for learning from data, and gives a detailed overview of statistical models for data mining, inference and prediction.

**Cloud Machine Learning Wars: Amazon vs IBM Watson vs Microsoft Azure** - Apr 16, 2015.

Amazon recently announced Amazon Machine Learning, a cloud machine learning solution for Amazon Web Services. Able to pull data effortlessly from RDS, S3 and Redshift, the product could pose a significant threat to Microsoft Azure ML and IBM Watson Analytics.

**Machine Learning 201: Does Balancing Classes Improve Classifier Performance?** - Apr 9, 2015.

The author investigates whether balancing classes improves performance for logistic regression, SVM, and Random Forests, and finds where it helps performance and where it does not.

**7 common mistakes when doing Machine Learning** - Mar 7, 2015.

In statistical modeling, there are various algorithms to build a classifier, and each algorithm makes a different set of assumptions about the data. For Big Data, it pays off to analyze the data upfront and then design the modeling pipeline accordingly.

**Statistical Learning and Data Mining III: 10 Hot Ideas for Learning from Data, Mar 19-20, Palo Alto** - Feb 23, 2015.

Taught by top Stanford professors and leading statisticians Trevor Hastie and Robert Tibshirani, this course presents 10 hot ideas for learning from data, and gives a detailed overview of statistical models for data mining, inference and prediction.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Feb 10 and beyond** - Feb 9, 2015.

Data Mining: Failure to Launch, 3 Ways to Improve your Regression, The Pragmatic Text Miner, Make It Big As a Data Scientist in 2015, Managing Big Data in Production and more.

**Avoiding a Common Mistake with Time Series** - Feb 2, 2015.

We explore a common mistake in analyzing relationships between time series, and show how de-trending helps to avoid this error.

**Fundamental methods of Data Science: Classification, Regression And Similarity Matching** - Jan 12, 2015.

Data classification, regression, and similarity matching underpin many of the fundamental algorithms in data science to solve business problems like consumer response prediction and product recommendation.

**Enter a KDD Cup or Kaggle Competition. You don’t need to be an expert!** - Jan 4, 2015.

The webinar will use the example of KDD Cup 2009 to show how Salford TreeNet can quickly achieve a top 5 result, and how to quickly build great models even if you are not an expert.

**Data Analytics for Business Leaders Explained** - Sep 22, 2014.

Learn about a variety of different approaches to data analytics and their advantages and limitations from a business leader's perspective in part 1 of this post on data analytics techniques.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Sep 16 and beyond** - Sep 16, 2014.

NASA Earth Science, Modern Regression Analysis, Strata + Hadoop NYC preview, Data Mining: Failure To Launch, Data Science with R, Not All Graph Databases Are Created Equal, and more.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Sep 9 and beyond** - Sep 9, 2014.

Modern Regression Analysis, Tamr and Forrester Research, Hadoop for Machine Learning, NASA Earth Science Data, Strata + Hadoop NYC preview, and more.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Sep 2 and beyond** - Sep 1, 2014.

Streaming Analytics, Analytical Lifecycle, Modern Regression Analysis, Hadoop for Machine Learning, NASA Earth Science Data, Strata + Hadoop NYC preview, Ontotext, and more.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Aug 19 and beyond** - Aug 18, 2014.

Data Mining: Failure To Launch, Smart Metering, Hadoop, Data Science at the Command Line, Personalized Healthcare, Modern Regression Analysis Techniques, and more.

**Top KDnuggets tweets, Jul 21-22** - Jul 23, 2014.

Microsoft: Data Scientist; Haskell Data Analysis Cookbook - A practical and concise guide; Large collection of papers on #Security and Machine Learning; Learn Data Science in 12 wks + career coaching.

**DataScience Central competition: Automate jackknife regression** - Apr 3, 2014.

Data Science Central holds a competition to get statisticians more involved in Data Science: create a black-box, automated, easy-to-interpret, sample-based, robust technique called jackknife regression.

**To Fit or Not to Fit Data to a Model** - Jan 23, 2014.

What if Shakespeare were a data scientist? Today's big data necessitates this: let the data define the model.

**SLDM Statistical Learning and Data Mining III – 10 Hot Ideas, Palo Alto, Mar 20-21** - Jan 23, 2014.

Taught by top Stanford professors and leading statisticians Trevor Hastie and Robert Tibshirani, this course presents 10 hot ideas for learning from data, and gives a detailed overview of statistical models for data mining, inference and prediction.