# Tag: Regression (80)

**R squared Does Not Measure Predictive Capacity or Statistical Adequacy**- Jul 31, 2020.

The fact that R-squared shouldn't be used for deciding if you have an adequate model is counter-intuitive and is rarely explained clearly. This demonstration overviews how R-squared goodness-of-fit works in regression analysis and correlations, while showing why it is not a measure of statistical adequacy and so should not be taken to suggest anything about future predictive performance.

**A Complete Guide To Survival Analysis In Python, part 3**- Jul 30, 2020.

Concluding this three-part series covering a step-by-step review of statistical survival analysis, we look at a detailed example implementing the Kaplan-Meier fitter based on different groups, a Log-Rank test, and Cox Regression, all with examples and shared code.

**Model Evaluation Metrics in Machine Learning**- May 28, 2020.

A detailed explanation of the metrics used to evaluate a classification machine learning model.

**5 Great New Features in Scikit-learn 0.23**- May 15, 2020.

Check out 5 new features of the latest Scikit-learn release, including the ability to visualize estimators in notebooks, improvements to both k-means and gradient boosting, some new linear model implementations, and sample weight support for a pair of existing regressors.

**Beginners Guide to the Three Types of Machine Learning**- Nov 13, 2019.

The following article is an introduction to classification and regression (together known as supervised learning) and to unsupervised learning (which in the context of machine learning applications often refers to clustering), and includes a walkthrough in the popular Python library scikit-learn.

**Designing Your Neural Networks**- Nov 4, 2019.

Check out this step-by-step walkthrough of some of the more confusing aspects of neural nets to guide you to making smart decisions about your neural network architecture.

**How Bad is Multicollinearity?**- Sep 17, 2019.

For some people anything below 60% is acceptable, and for certain others even a correlation of 30% to 40% is considered too high, because one variable may just end up exaggerating the performance of the model or completely messing up parameter estimates.

**From Data Pre-processing to Optimizing a Regression Model Performance**- Jul 19, 2019.

All you need to know about data pre-processing, and how to build and optimize a regression model using the Backward Elimination method in Python.

**How do you check the quality of your regression model in Python?**- Jul 2, 2019.

Linear regression is rooted strongly in the field of statistical learning and therefore the model must be checked for the ‘goodness of fit’. This article shows you the essential steps of this task in a Python ecosystem.

**Separating signal from noise**- Jun 4, 2019.

When we are building a model, we are making the assumption that our data has two parts, signal and noise. Signal is the real pattern, the repeatable process that we hope to capture and describe. The noise is everything else that gets in the way of that.

**Choosing Between Model Candidates**- May 29, 2019.

Models are useful because they allow us to generalize from one situation to another. When we use a model, we’re working under the assumption that there is some underlying pattern we want to measure, but it has some error on top of it.

**Top Data Science and Machine Learning Methods Used in 2018, 2019**- Apr 29, 2019.

Once again, the most used methods are Regression, Clustering, Visualization, Decision Trees/Rules, and Random Forests. The greatest relative increases this year are overwhelmingly Deep Learning techniques, while SVD, SVMs and Association Rules show the greatest decline.

**7 Steps to Mastering Basic Machine Learning with Python — 2019 Edition**- Jan 29, 2019.

With a new year upon us, I thought it would be a good time to revisit the concept and put together a new learning path for mastering machine learning with Python. With these 7 steps you can master basic machine learning with Python!

**KDnuggets™ News 18:n30, Aug 8: Iconic Data Visualisation; Data Scientist Interviews Demystified; Simple Statistics in Python**- Aug 8, 2018.

Also: Selecting the Best Machine Learning Algorithm for Your Regression Problem; From Data to Viz: how to select the right chart for your data; Only Numpy: Implementing GANs and Adam Optimizer using Numpy; Programming Best Practices for Data Science.

**Autoregressive Models in TensorFlow**- Aug 6, 2018.

This article investigates autoregressive models in TensorFlow, including autoregressive time series and predictions with the actual observations.

**Selecting the Best Machine Learning Algorithm for Your Regression Problem**- Aug 1, 2018.

This post should then serve as a great aid in selecting the best ML algorithm for your regression problem!

**Deep Quantile Regression**- Jul 3, 2018.

Most Deep Learning frameworks currently focus on giving a best estimate as defined by a loss function. Occasionally something beyond a point estimate is required to make a decision. This is where a distribution would be useful. This article will purely focus on inferring quantiles.

**Data Science Predicting The Future**- Jun 19, 2018.

In this article we will expand on the knowledge learnt from the last article - The What, Where and How of Data for Data Science - and consider how data science is applied to predict the future.

**Choosing the Right Metric for Evaluating Machine Learning Models – Part 1**- Apr 27, 2018.

Each machine learning model is trying to solve a problem with a different objective using a different dataset, and hence it is important to understand the context before choosing a metric.

**Ten Machine Learning Algorithms You Should Know to Become a Data Scientist**- Apr 11, 2018.

It's important for data scientists to have a broad range of knowledge, keeping themselves updated with the latest trends. With that being said, we take a look at the top 10 machine learning algorithms every data scientist should know.

**Logistic Regression: A Concise Technical Overview**- Feb 16, 2018.

Interested in learning the concepts behind Logistic Regression (LogR)? Looking for a concise introduction to LogR? This article is for you. Includes a Python implementation and links to an R script as well.

**Which Machine Learning Algorithm be used in year 2118?**- Feb 9, 2018.

So what were the answers popping into your head? Random forest, SVM, K-means, kNN, or even Deep Learning? No, for the answer, we turn to the Lindy Effect.

**Topological Data Analysis for Data Professionals: Beyond Ayasdi**- Jan 16, 2018.

We review recent developments and tools in topological data analysis, including applications of persistent homology to psychometrics and a recent extension of piecewise regression, called Morse-Smale regression.

**Top KDnuggets tweets, Dec 06-12: Top #DataScience and #MachineLearning Methods Used in 2017; Geoff Hinton Capsule Networks – a new way for machines to see**- Dec 13, 2017.

Also: The first international #beauty contest decided by #AI #algorithm sparked controversy; 4 Common #Data Fallacies That You Need To Know; Using #DeepLearning to Solve Real World Problems; Best Online Masters in #DataScience and #Analytics.

**Top Data Science and Machine Learning Methods Used in 2017**- Dec 11, 2017.

The most used methods are Regression, Clustering, Visualization, Decision Trees/Rules, and Random Forests; Deep Learning is used by only 20% of respondents; we also analyze which methods are most "industrial" and most "academic".

**5 Tricks When A/B Testing Is Off The Table**- Dec 8, 2017.

Sometimes you cannot do A/B testing, but that does not mean you have to fly blind: there is a range of econometric methods that can illuminate the causal relationships at play.

**How Bayesian Networks Are Superior in Understanding Effects of Variables**- Nov 9, 2017.

Bayes Nets have remarkable properties that make them better than many traditional methods in determining variables’ effects. This article explains the principal advantages.

**3 different types of machine learning**- Nov 1, 2017.

In this extract from “Python Machine Learning”, top data scientist Sebastian Raschka explains the 3 main types of machine learning: Supervised, Unsupervised and Reinforcement Learning. Use code PML250KDN to save 50% off the book cost.

**Top 6 errors novice machine learning engineers make**- Oct 30, 2017.

What common mistakes do beginners make when working on machine learning or data science projects? Here we present a list of the most common such errors.

**It Only Takes One Line of Code to Run Regression**- Oct 19, 2017.

I learned how important it is to understand data before running algorithms, how important it is to know the context and the industry before jumping into getting insights, how it is very easy to make models but tough to get them to work for you, and finally, how it only takes one line of code to run linear regression on your dataset.

**Learn Generalized Linear Models (GLM) using R**- Oct 11, 2017.

In this article, we aim to discuss various GLMs that are widely used in the industry. We focus on: a) log-linear regression b) interpreting log-transformations and c) binary logistic regression.

**Learn How to Make Machine Learning Work (webinars every Tue in October, Live or on-demand)**- Sep 28, 2017.

To fully use machine learning, we first need to understand both the potential benefits and the techniques to create data-driven models. In this webinar series, we will show you how to easily and automatically apply complex algorithms to data in real world applications.

**KDnuggets™ News 17:n22, Jun 7: 7 Steps to Mastering Data Preparation with Python; Why Does Deep Learning Not Have a Local Minimum?**- Jun 7, 2017.

7 Steps to Mastering Data Preparation with Python; Why Does Deep Learning Not Have a Local Minimum?; 7 Techniques to Handle Imbalanced Data; Which Machine Learning Algorithm Should I Use?; Is Regression Analysis Really Machine Learning?

**Is Regression Analysis Really Machine Learning?**- Jun 5, 2017.

What separates "traditional" applied statistics from machine learning? Is statistics the foundation on top of which machine learning is built? Is machine learning a superset of "traditional" statistics? Do these 2 concepts have a third unifying concept in common? So, in that vein... is regression analysis actually a form of machine learning?

**Machine Learning Crash Course: Part 1**- May 24, 2017.

This post, the first in a series of ML tutorials, aims to make machine learning accessible to anyone willing to learn. We’ve designed it to give you a solid understanding of how ML algorithms work as well as provide you with the knowledge to harness them in your projects.

**The Data Science of Steel, or Data Factory to Help Steel Factory**- Apr 25, 2017.

Applying Machine Learning to steel production is really hard! Here are some lessons from Yandex researchers on how to balance the need for findings to be accurate, useful, and understandable at the same time.

**Building Regression Models in R using Support Vector Regression**- Mar 8, 2017.

The article studies the advantage of Support Vector Regression (SVR) over Simple Linear Regression (SLR) models for predicting real values, using the same basic idea as Support Vector Machines (SVM) use for classification.

**The Gentlest Introduction to Tensorflow – Part 3**- Feb 21, 2017.

This post is the third entry in a series dedicated to introducing newcomers to TensorFlow in the gentlest possible manner. This entry progresses to multi-feature linear regression.

**Webinar: Improve Your Regression with CART and Gradient Boosting, Feb 16**- Feb 13, 2017.

Learn about a powerful tree-based machine learning algorithm called gradient boosting, which often outperforms linear regression, Random Forests, and CART.

**Regression Analysis: A Primer**- Feb 6, 2017.

Despite the popularity of Regression, it is also misunderstood. Why? The answer might surprise you: There is no such thing as Regression. Rather, there are a large number of statistical methods that are called Regression, all of which are based on a shared statistical foundation.

**Top /r/MachineLearning Posts, December: OpenAI Universe; Deep Learning MOOC For Coders; Musk: Tesla Gets Awesome-er**- Jan 5, 2017.

OpenAI Universe; Deep Learning For Coders—18 hours of lessons for free; Elon Musk on Twitter: Tesla Autopilot vision neural net now working well; Apple to Start Publishing AI Research; Duolingo's "half-life regression" method for modeling human memory.

**Data Science Basics: What Types of Patterns Can Be Mined From Data?**- Dec 14, 2016.

Why do we mine data? This post is an overview of the types of patterns that can be gleaned from data mining, and some real world examples of said patterns.

**Data Analytics Models in Quantitative Finance and Risk Management**- Dec 13, 2016.

We review how key data science algorithms, such as regression, feature selection, and Monte Carlo, are used in financial instrument pricing and risk management.

**The Great Algorithm Tutorial Roundup**- Sep 20, 2016.

This is a collection of tutorials relating to the results of the recent KDnuggets algorithms poll. If you are interested in learning or brushing up on the most used algorithms, as per our readers, look here for suggestions on doing so!

**Top Algorithms and Methods Used by Data Scientists**- Sep 12, 2016.

The latest KDnuggets poll identifies the top algorithms actually used by Data Scientists and finds surprises, including the most academic and most industry-oriented algorithms.

**A Primer on Logistic Regression – Part I**- Aug 24, 2016.

Gain an understanding of logistic regression - what it is, and when and how to use it - in this post.

**A Neat Trick to Increase Robustness of Regression Models**- Aug 22, 2016.

Read this take on the validity of choosing a different approach to regression modeling. Why isn't the L1 norm used more often?

**What Statistics Topics are Needed for Excelling at Data Science?**- Aug 2, 2016.

Here is a list of skills and statistical concepts suggested for excelling at data science, roughly in order of increasing complexity.

**Improve Your Regression with Modern Regression Analysis Techniques, July 27, Aug 10 Webinars**- Jul 22, 2016.

This two-part webinar will help you improve your regression using modern regression analysis techniques. July 27 (part 1) and August 10 (part 2).

**A Brief Primer on Linear Regression – Part III**- Jul 5, 2016.

This third part of an introduction to linear regression moves past the topics covered in the first to discuss linearity, normality, outliers, and other topics of interest.

**What is Softmax Regression and How is it Related to Logistic Regression?**- Jul 1, 2016.

An informative exploration of softmax regression and its relationship with logistic regression, and situations in which each would be applicable.

**Regularization in Logistic Regression: Better Fit and Better Generalization?**- Jun 24, 2016.

A discussion on regularization in logistic regression, and how its usage plays into better model fit and generalization.

**A Brief Primer on Linear Regression – Part 2**- Jun 13, 2016.

This second part of an introduction to linear regression moves past the topics covered in the first to discuss linearity, normality, outliers, and other topics of interest.

**A Brief Primer on Linear Regression – Part 1**- Jun 6, 2016.

This introduction to linear regression discusses a simple linear regression model with one predictor variable, and then extends it to the multiple linear regression model with at least two predictors.

**Machine Learning Key Terms, Explained**- May 25, 2016.

An overview of 12 important machine learning concepts, presented in a no-frills, straightforward definition style.

**Regression & Correlation for Military Promotion: A Tutorial**- Apr 13, 2016.

A clear and well-written tutorial covering the concepts of regression and correlation, focusing on military commander promotion as a use case.

**Salford Predictive Modeler 8: Faster. More Machine Learning. Better results**- Apr 4, 2016.

Take a giant step forward with SPM 8: download the just-released version 8, try it for yourself, and get better results.

**New Salford Predictive Modeler 8**- Mar 1, 2016.

Salford Predictive Modeler software suite: Faster. More Comprehensive Machine Learning. More Automation. Better results. Take a giant step forward in your data science productivity with SPM 8. Download and try it today!

**Top KDnuggets tweets, Feb 15-21: Is Big Data Still a Thing? 10 types of #regression. Which one to use?**- Feb 24, 2016.

10 types of #regression. Which one to use? Is Big Data Still a Thing? 2016 #BigData Landscape; Demystifying #DeepReinforcement Learning; #TextMining #SouthPark.

**Jan 27 Webinar: 3 Ways to Improve your Regression, Part 2**- Jan 26, 2016.

How to take data science techniques even further to extract actionable insight and take advantage of advanced modeling features. You will walk away with several different methods to turn your ordinary regression into an extraordinary regression!

**3 Ways to Improve your Regression, Jan 20 & 27 Webinars, Hands-on**- Jan 12, 2016.

Instead of proceeding with a mediocre analysis, join us for this 2-part webinar series. We will show you how modern algorithms can take your regression model to the next level and expertly handle your modeling woes.

**What questions can data science answer?**- Jan 1, 2016.

There are only five questions machine learning can answer: Is this A or B? Is this weird? How much/how many? How is it organized? What should I do next? We examine these questions in detail and what they imply for data science.

**Statistical Learning and Data Mining: 10 Hot Ideas for Learning from Data, NYC, Oct 8-9**- Aug 27, 2015.

Taught by top Stanford professors and leading statisticians Trevor Hastie and Robert Tibshirani, this course presents 10 hot ideas for learning from data, and gives a detailed overview of statistical models for data mining, inference and prediction.

**Cloud Machine Learning Wars: Amazon vs IBM Watson vs Microsoft Azure**- Apr 16, 2015.

Amazon recently announced Amazon Machine Learning, a cloud machine learning solution for Amazon Web Services. Able to pull data effortlessly from RDS, S3 and Redshift, the product could pose a significant threat to Microsoft Azure ML and IBM Watson Analytics.

**Machine Learning 201: Does Balancing Classes Improve Classifier Performance?**- Apr 9, 2015.

The author investigates whether balancing classes improves performance for logistic regression, SVM, and Random Forests, and finds where it helps and where it does not.

**7 common mistakes when doing Machine Learning**- Mar 7, 2015.

In statistical modeling, there are various algorithms to build a classifier, and each algorithm makes a different set of assumptions about the data. For Big Data, it pays off to analyze the data upfront and then design the modeling pipeline accordingly.

**Statistical Learning and Data Mining III: 10 Hot Ideas for Learning from Data, Mar 19-20, Palo Alto**- Feb 23, 2015.

Taught by top Stanford professors and leading statisticians Trevor Hastie and Robert Tibshirani, this course presents 10 hot ideas for learning from data, and gives a detailed overview of statistical models for data mining, inference and prediction.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Feb 10 and beyond**- Feb 9, 2015.

Data Mining: Failure to Launch, 3 Ways to Improve your Regression, The Pragmatic Text Miner, Make It Big As a Data Scientist in 2015, Managing Big Data in Production and more.

**Avoiding a Common Mistake with Time Series**- Feb 2, 2015.

We explore a common mistake in analyzing relationships between time series, and show how de-trending helps to avoid this error.

**Fundamental methods of Data Science: Classification, Regression And Similarity Matching**- Jan 12, 2015.

Data classification, regression, and similarity matching underpin many of the fundamental algorithms in data science to solve business problems like consumer response prediction and product recommendation.

**Enter a KDD Cup or Kaggle Competition. You don’t need to be an expert!**- Jan 4, 2015.

Using KDD Cup 2009 as an example, the webinar will show how Salford TreeNet can quickly achieve a top 5 result, and how to quickly build great models even if you are not an expert.

**Data Analytics for Business Leaders Explained**- Sep 22, 2014.

Learn about a variety of different approaches to data analytics and their advantages and limitations from a business leader's perspective in part 1 of this post on data analytics techniques.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Sep 16 and beyond**- Sep 16, 2014.

NASA Earth Science, Modern Regression Analysis, Strata + Hadoop NYC preview, Data Mining: Failure To Launch, Data Science with R, Not All Graph Databases Are Created Equal, and more.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Sep 9 and beyond**- Sep 9, 2014.

Modern Regression Analysis, Tamr and Forrester Research, Hadoop for Machine Learning, NASA Earth Science Data, Strata + Hadoop NYC preview, and more.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Sep 2 and beyond**- Sep 1, 2014.

Streaming Analytics, Analytical Lifecycle, Modern Regression Analysis, Hadoop for Machine Learning, NASA Earth Science Data, Strata + Hadoop NYC preview, Ontotext, and more.

**Upcoming Webcasts on Analytics, Big Data, Data Science – Aug 19 and beyond**- Aug 18, 2014.

Data Mining: Failure To Launch, Smart Metering, Hadoop, Data Science at the Command Line, Personalized Healthcare, Modern Regression Analysis Techniques, and more.

**Top KDnuggets tweets, Jul 21-22**- Jul 23, 2014.

Microsoft: Data Scientist; Haskell Data Analysis Cookbook - A practical and concise guide; Large collection of papers on #Security and Machine Learning; Learn Data Science in 12 wks + career coaching.

**DataScience Central competition: Automate jackknife regression**- Apr 3, 2014.

Data Science Central holds a competition to get statisticians more involved in Data Science: create a black-box, automated, easy-to-interpret, sample-based, robust technique called jackknife regression.

**To Fit or Not to Fit Data to a Model**- Jan 23, 2014.

What if Shakespeare were a data scientist? Today's big data necessitates letting the data define the model.

**SLDM Statistical Learning and Data Mining III – 10 Hot Ideas, Palo Alto, Mar 20-21**- Jan 23, 2014.

Taught by top Stanford professors and leading statisticians Trevor Hastie and Robert Tibshirani, this course presents 10 hot ideas for learning from data, and gives a detailed overview of statistical models for data mining, inference and prediction.