# Tag: Statistics (241)

**The Best Free Data Science eBooks: 2020 Update**- Sep 30, 2020.

The author has updated their list of best free data science books for 2020. Read on to see what books you should grab.**Causal Inference: The Free eBook**- Sep 25, 2020.

Here's another free eBook for those looking to up their skills. If you are seeking a resource that exhaustively treats the topic of causal inference, this book has you covered.**What is Simpson’s Paradox and How to Automatically Detect it**- Sep 18, 2020.

Looking at data one way can tell one story, but sometimes looking at it another way will tell the opposite story. Understanding this paradox and why it happens is essential, and new tools are available to help automatically detect this tricky issue in your datasets.**Statistics with Julia: The Free eBook**- Sep 14, 2020.

This free eBook is a draft copy of the upcoming Statistics with Julia: Fundamentals for Data Science, Machine Learning and Artificial Intelligence. Interested in learning Julia for data science? This might be the best intro out there.**Modern Data Science Skills: 8 Categories, Core Skills, and Hot Skills**- Sep 8, 2020.

We analyze the results of the Data Science Skills poll, including 8 categories of skills, 13 core skills that over 50% of respondents have, the emerging/hot skills that data scientists want to learn, and what is the top skill that Data Scientists want to learn.**Book Chapter: The Art of Statistics: Learning from Data**- Sep 3, 2020.

Get a free book chapter from "The Art of Statistics: Learning from Data" by leading researcher Sir David John Spiegelhalter. This excerpt takes a forensic look at data surrounding the victims of the UK's most prolific serial killer and shows how a simple search for patterns reveals critical details.**Which methods should be used for solving linear regression?**- Sep 2, 2020.

As a foundational set of algorithms in any machine learning toolbox, linear regression can be solved with a variety of approaches. Here, we discuss, with code examples, four methods and demonstrate how they should be used.**These Data Science Skills will be your Superpower**- Aug 20, 2020.

Learning data science means learning the hard skills of statistics, programming, and machine learning. To complete your training, a broader set of soft skills will round out your capabilities as an effective and successful professional Data Scientist.**KDnuggets™ News 20:n32, Aug 19: The List of Top 10 Data Science Lists; Data Science MOOCs with Substance**- Aug 19, 2020.

The List of Top 10 Lists in Data Science; Going Beyond Superficial: Data Science MOOCs with Substance; Introduction to Statistics for Data Science; Content-Based Recommendation System using Word Embeddings; How Natural Language Processing Is Changing Data Analytics**Hypothesis Test for Real Problems**- Aug 14, 2020.

Hypothesis tests are significant for evaluating answers to questions concerning samples of data.**Introduction to Statistics for Data Science**- Aug 12, 2020.

Statistics is foundational for Data Science and a crucial skill to master for any practitioner. This advanced introduction reviews, with examples, the fundamental concepts of inferential statistics by illustrating the differences between Point Estimators and Confidence Interval Estimates.**R squared Does Not Measure Predictive Capacity or Statistical Adequacy**- Jul 31, 2020.

The fact that R-squared shouldn't be used for deciding if you have an adequate model is counter-intuitive and is rarely explained clearly. This demonstration overviews how R-squared goodness-of-fit works in regression analysis and correlations, while showing why it is not a measure of statistical adequacy, so should not suggest anything about future predictive performance.**A Complete Guide To Survival Analysis In Python, part 3**- Jul 30, 2020.

Concluding this three-part series covering a step-by-step review of statistical survival analysis, we look at a detailed example implementing the Kaplan-Meier fitter based on different groups, a Log-Rank test, and Cox Regression, all with examples and shared code.**Essential Resources to Learn Bayesian Statistics**- Jul 28, 2020.

If you are interested in becoming better at statistics and machine learning, then some time should be invested in diving deeper into Bayesian Statistics. While the topic is more advanced, applying these fundamentals to your work will advance your understanding and success as an ML expert.**Demystifying Statistical Significance**- Jul 17, 2020.

With more professionals from a wide range of less technical fields diving into statistical analysis and data modeling, these experimental techniques can seem daunting. To help with these hurdles, this article clarifies some misconceptions around p-values, hypothesis testing, and statistical significance.**Before Probability Distributions**- Jul 16, 2020.

Why do we use probability distributions, and why do they matter?**A Complete Guide To Survival Analysis In Python, part 2**- Jul 14, 2020.

Continuing with the second of this three-part series covering a step-by-step review of statistical survival analysis, we look at a detailed example implementing the Kaplan-Meier fitter theory as well as the Nelson-Aalen fitter theory, both with examples and shared code.**A Complete Guide To Survival Analysis In Python, part 1**- Jul 7, 2020.

This three-part series covers a review with step-by-step explanations and code for how to perform statistical survival analysis used to investigate the time some event takes to occur, such as patient survival during the COVID-19 pandemic, the time to failure of engineering products, or even the time to closing a sale after an initial customer contact.**The 8 Basic Statistics Concepts for Data Science**- Jun 24, 2020.

Understanding the fundamentals of statistics is a core capability for becoming a Data Scientist. Review these essential ideas that will be pervasive in your work and raise your expertise in the field.**4 Free Math Courses to do and Level up your Data Science Skills**- Jun 22, 2020.

Just as there is no Data Science without data, there's no science in data without mathematics. Strengthening your foundational skills in math will level you up as a data scientist and enable you to perform with greater expertise.**Overview of data distributions**- Jun 10, 2020.

With so many types of data distributions to consider in data science, how do you choose the right one to model your data? This guide will overview the most important distributions you should be familiar with in your work.**KDnuggets™ News 20:n23, Jun 10: Largest Dataset you analyzed? If you start statistics all over again, where would you start? GPT-3**- Jun 10, 2020.

#BlackLivesMatter. In this issue: If you had to start statistics all over again, where would you start? New Poll: What was the largest dataset you analyzed? Another Great NLP Course from Stanford; Naive Bayes: Everything you need to know; GPT-3 - a giant leap for Deep Learning and NLP?**If you had to start statistics all over again, where would you start?**- Jun 5, 2020.

If you are just diving into learning statistics, then where do you begin? Find insight from those who have tread in these waters before, and see what they might have done differently along their personal journeys in statistics.**STIPS – Statistical Thinking for Industrial Problem Solving – A free online statistics course**- Jun 2, 2020.

This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.**Appropriately Handling Missing Values for Statistical Modelling and Prediction**- May 22, 2020.

Many statisticians in industry agree that blindly imputing the missing values in your dataset is a dangerous move and should be avoided without first understanding why the data is missing in the first place.**Looking Normal(ly Distributed)**- May 20, 2020.

This article investigates when some probability distributions look normal "enough" for a statistical test.**Evidence Counterfactuals for explaining predictive models on Big Data**- May 18, 2020.

Big Data generated by people -- such as social media posts, mobile phone GPS locations, and browsing history -- provide enormous prediction value for AI systems. However, explaining how these models predict with the data remains challenging. This interesting explanation approach considers how a model would behave if it didn't have the original set of data to work with.**Were 21% of New York City residents really infected with the novel coronavirus?**- May 6, 2020.

Understanding the types of statistical bias that pop up in popular media and reporting is especially important during this pandemic where the data -- and our global response to the data -- directly impact people's lives.**Statistical Thinking for Industrial Problem Solving – a free online statistics course**- May 5, 2020.

This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.**A Concise Course in Statistical Inference: The Free eBook**- Apr 27, 2020.

Check out this freely available book, All of Statistics: A Concise Course in Statistical Inference, and learn the probability and statistics needed for success in data science.**Should Data Scientists Model COVID19 and other Biological Events**- Apr 22, 2020.

Biostatisticians use statistical techniques that your current everyday data scientists have probably never heard of. This is a great example where lack of domain knowledge exposes you as someone who does not know what they are doing and is merely hopping on a trend.**Statistical Thinking for Industrial Problem Solving – a free online statistics course**- Apr 9, 2020.

This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.**Free online statistics course – Improve your analytics knowledge**- Mar 26, 2020.

This online course is available – for free – to anyone interested in using data to solve problems better.**Data Science Curriculum for self-study**- Feb 26, 2020.

Are you asking the question, "how do I become a Data Scientist?" This list recommends the best essential topics to gain an introductory understanding for getting started in the field. After learning these basics, keep in mind that doing real data science projects through internships or competitions is crucial to acquiring the core skills necessary for the job.**Statistical Thinking for Industrial Problem Solving: a free online course.**- Jan 13, 2020.

**Statistical Thinking for Industrial Problem Solving: a free online course**- Dec 3, 2019.

**An Eight-Step Checklist for An Analytics Project**- Nov 6, 2019.

Follow these eight headings of an audit sheet that business analysts should address before submitting the results of their analytics project. One recommended approach is to rewrite each step as a question, answer it, and then attach it to your project.**KDnuggets™ News 19:n42, Nov 6: 5 Statistical Traps Data Scientists Should Avoid; 10 Free Must-Read Books on AI**- Nov 6, 2019.

Learn about statistical fallacies Data Scientists should avoid; New and quite amazing Deep Learning capabilities FB has been quietly open-sourcing; Top Machine Learning tools for Developers; How to build a Neural Network from scratch and more.**Probability Learning: Maximum Likelihood**- Nov 5, 2019.

The maths behind Bayes will be better understood if we first cover the theory and maths underlying another fundamental method of probabilistic machine learning: Maximum Likelihood. This post will be dedicated to explaining it.**5 Statistical Traps Data Scientists Should Avoid**- Oct 30, 2019.

Here are five statistical fallacies — data traps — which data scientists should be aware of and definitely avoid.**How to Become a (Good) Data Scientist – Beginner Guide**- Oct 16, 2019.

A guide covering the things you should learn to become a data scientist, including the basics of business intelligence, statistics, programming, and machine learning.**An Overview of Density Estimation**- Oct 14, 2019.

Density estimation is estimating the probability density function of the population from the sample. This post examines and compares a number of approaches to density estimation.**Statistical Thinking for Industrial Problem Solving: a free online course**- Oct 2, 2019.

**6 bits of advice for Data Scientists**- Sep 25, 2019.

As a data scientist, you can get lost in your daily dives into the data. Consider following this advice in your work to be more diligent and more impactful for your organization.**Beta Distribution: What, When & How**- Sep 25, 2019.

This article covers the beta distribution, and explains it using baseball batting averages.**Which Data Science Skills are core and which are hot/emerging ones?**- Sep 17, 2019.

We identify two main groups of Data Science skills: A: 13 core, stable skills that most respondents have and B: a group of hot, emerging skills that most do not have (yet) but want to add. See our detailed analysis.**How Bad is Multicollinearity?**- Sep 17, 2019.

For some people, anything below 60% is acceptable, while for others even a correlation of 30% to 40% is considered too high, because one variable may end up exaggerating the performance of the model or completely messing up parameter estimates.**What’s the difference between analytics and statistics?**- Sep 6, 2019.

From asking the best questions about data to answering those questions with certainty, understanding the value of these two seemingly different professions is clarified when you see how they should work together.**Statistical Modelling vs Machine Learning**- Aug 14, 2019.

At times it may seem Machine Learning can be done these days without a sound statistical background but those people are not really understanding the different nuances. Code written to make it easier does not negate the need for an in-depth understanding of the problem.**What is Poisson Distribution?**- Aug 14, 2019.

A solid overview of the Poisson distribution, starting from why it is needed, how it stacks up to binomial distribution, deriving its formula mathematically, and more.**Statistical Thinking for Industrial Problem Solving (STIPS) – a free online course.**- Aug 2, 2019.

**P-values Explained By Data Scientist**- Jul 30, 2019.

This article is designed to give you a full picture, from constructing a hypothesis test to understanding the p-value and using it to guide our decision-making process.**Annotated Heatmaps of a Correlation Matrix in 5 Simple Steps**- Jul 9, 2019.

A heatmap is a graphical representation of data in which data values are represented as colors. That is, it uses color in order to communicate a value to the reader. This is a great tool to assist the audience towards the areas that matter the most when you have a large volume of data.**How do you check the quality of your regression model in Python?**- Jul 2, 2019.

Linear regression is rooted strongly in the field of statistical learning and therefore the model must be checked for the ‘goodness of fit’. This article shows you the essential steps of this task in a Python ecosystem.**Top KDnuggets Tweets, Jun 12 – 18: The biggest mistake while learning #Python for #datascience; 5 practical statistical concepts for data scientists**- Jun 19, 2019.

Also: Resources for developers transitioning into data science; Best Data Visualization Techniques for small and large data; Top Data Science and Machine Learning Methods Used in 2018, 2019**KDnuggets™ News 19:n23, Jun 19: Useful Stats for Data Scientists; Python, TensorFlow & R Winners in Latest Job Report**- Jun 19, 2019.

This week on KDnuggets: 5 Useful Statistics Data Scientists Need to Know; Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS; How to Learn Python for Data Science the Right Way; The Machine Learning Puzzle, Explained; Scalable Python Code with Pandas UDFs; and much more!**5 Useful Statistics Data Scientists Need to Know**- Jun 14, 2019.

A data scientist should know how to effectively use statistics to gain insights from data. Here are five useful and practical statistical concepts that every data scientist must know.**All Models Are Wrong – What Does It Mean?**- Jun 12, 2019.

During your adventures in data science, you may have heard “all models are wrong.” Let’s unpack this famous quote to understand how we can still make models that are useful.**Top 10 Statistics Mistakes Made by Data Scientists**- Jun 7, 2019.

The following are some of the most common statistics mistakes made by data scientists. Check this list often to make sure you are not making any of these while applying statistics to data science.**Statistical Thinking for Industrial Problem Solving (STIPS): a free online course.**- Jun 4, 2019.

**Separating signal from noise**- Jun 4, 2019.

When we are building a model, we are making the assumption that our data has two parts, signal and noise. Signal is the real pattern, the repeatable process that we hope to capture and describe. The noise is everything else that gets in the way of that.**What Does a Lady Tasting Tea Have to Do with Science?**- May 31, 2019.

Design of Experiments (DOE) is a statistical concept used to find the cause-and-effect relationships. Surprisingly, an experiment arising from a casual conversation about tea-drinking is one of the first examples of an experiment designed using statistical ideas.**Probability Mass and Density Functions**- May 21, 2019.

This content is part of a series about Chapter 3, on probability, from the Deep Learning Book by Goodfellow, I., Bengio, Y., and Courville, A. (2016). It aims to provide intuitions/drawings/python code on mathematical theories and is constructed as my understanding of these concepts.**Modeling 101**- May 13, 2019.

In the past couple of decades, innovation in statistics and machine learning has been increasing at a rapid pace and we are now able to do things unimaginable when I began my career.**Naive Bayes: A Baseline Model for Machine Learning Classification Performance**- May 7, 2019.

We can use Pandas to conduct Bayes Theorem and Scikitlearn to implement the Naive Bayes Algorithm. We take a step-by-step approach to understanding Bayes and implementing the different options in Scikitlearn.**Statistical Thinking for Industrial Problem Solving (STIPS) – a free online course.**- May 3, 2019.

**How to correctly select a sample from a huge dataset in machine learning**- May 1, 2019.

We explain how choosing a small, representative dataset from a large population can improve model training reliability.**Statistical Thinking for Industrial Problem Solving (STIPS) – a free online course**- Apr 5, 2019.

**Spatio-Temporal Statistics: A Primer**- Apr 5, 2019.

Marketing scientist Kevin Gray asks University of Missouri Professor Chris Wikle about Spatio-Temporal Statistics and how it can be used in science and business.**Wake Forest University: Teaching Professor/Professor of the Practice in Statistics/Analytics [Winston-Salem, NC]**- Mar 18, 2019.

The Wake Forest University School of Business is seeking qualified candidates for a Teaching Professor/Professor of the Practice in Statistics/Analytics. This individual will be expected to teach graduate courses in areas such as Data Analysis & Business Modeling, Data Mining & Machine Learning, and Forecasting.**The 7 Myths of Data Anonymisation**- Mar 12, 2019.

Anonymisation has always been seen as a necessary evil rather than a helpful tool. That’s why plenty of myths have arisen around that technology over the years.**Beating the Bookies with Machine Learning**- Mar 8, 2019.

We investigate how to use a custom loss function to identify fair odds, including a detailed example using machine learning to bet on the results of a darts match and how this can assist you in beating the bookmaker.**Statistical Thinking for Industrial Problem Solving – a free online course**- Feb 6, 2019.

**From Good to Great Data Science, Part 1: Correlations and Confidence**- Feb 5, 2019.

With the aid of some hospital data, part one describes how just a little inexperience in statistics could result in two common mistakes.**The Essential Data Science Venn Diagram**- Feb 4, 2019.

A deeper examination of the interdisciplinary interplay involved in data science, focusing on automation, validity and intuition.**Southern Illinois University Edwardsville: Director of the Center for Predictive Analytics/(Associate) Professor of Mathematics and Statistics [Edwardsville, IL]**- Jan 4, 2019.

Southern Illinois University Edwardsville (SIUE) is establishing the Center for Predictive Analytics (C-PAN), and is seeking an innovative, visionary director for the center who will provide centralized leadership in establishing research and educational initiatives across academic units at SIUE.**Introduction to Statistics for Data Science**- Dec 17, 2018.

This tutorial helps explain the central limit theorem, covering populations and samples, sampling distribution, intuition, and contains a useful video so you can continue your learning.**A comprehensive list of Machine Learning Resources: Open Courses, Textbooks, Tutorials, Cheat Sheets and more**- Dec 7, 2018.

A thorough collection of useful resources covering statistics, classic machine learning, deep learning, probability, reinforcement learning, and more.**The 5 Basic Statistics Concepts Data Scientists Need to Know**- Nov 13, 2018.

Today, we’re going to look at 5 basic statistics concepts that data scientists need to know and how they can be applied most effectively!**Quantum Machine Learning: A look at myths, realities, and future projections**- Nov 5, 2018.

An overview of quantum computing and quantum algorithm design, including current state of the hardware and algorithm design within the existing systems.**How I Learned to Stop Worrying and Love Uncertainty**- Oct 24, 2018.

This is a written version of Data Scientist Adolfo Martínez’s talk at Software Guru’s DataDay 2017. There is a link to the original slides (in Spanish) at the top of this post.**University of San Francisco: Assistant Professor, Tenure Track, Mathematics and Statistics [San Francisco, CA]**- Oct 17, 2018.

The University of San Francisco invites applications for a tenure-track Assistant Professor position to begin August 2019. We seek well-qualified candidates in the areas of applied mathematics or statistics, with a focus on the extraction of knowledge from data.**Mindstrong Health: Sr Data Scientist / Machine Learning, Statistics, Coding [Palo Alto, CA]**- Oct 17, 2018.

Mindstrong Health is seeking a Sr Data Scientist in Palo Alto, CA, who is passionate about our mission, committed to excellence and excited to build a company that will address one of the greatest health challenges of our time.**Unfolding Naive Bayes From Scratch**- Sep 25, 2018.

Whether you are a beginner in Machine Learning or you have been trying hard to understand the Super Natural Machine Learning Algorithms and you still feel that the dots do not connect somehow, this post is definitely for you!**Machine Learning Cheat Sheets**- Sep 11, 2018.

Check out this collection of machine learning concept cheat sheets based on Stanford CS 229 material, including supervised and unsupervised learning, neural networks, tips & tricks, probability & stats, and algebra & calculus.**5 Things to Know About A/B Testing**- Sep 7, 2018.

This article presents 5 things to know about A/B testing, from appropriate sample sizes, to statistical confidence, to A/B testing usefulness, and more.**Essential Math for Data Science: ‘Why’ and ‘How’**- Sep 6, 2018.

It always pays to know the machinery under the hood (even at a high level) rather than being just the guy behind the wheel with no knowledge about the car.**What on earth is data science?**- Sep 4, 2018.

An overview and discussion around data science, covering the history behind the term, data mining, statistical inference, machine learning, data engineering and more.**Basic Statistics in Python: Probability**- Aug 21, 2018.

At the most basic level, probability seeks to answer the question, "What is the chance of an event happening?" To calculate the chance of an event happening, we also need to consider all the other events that can occur.**Interpreting a data set, beginning to end**- Aug 20, 2018.

Detailed knowledge of your data is key to understanding it! We review several important methods to understand the data, including summary statistics with visualization, embedding methods like PCA and t-SNE, and Topological Data Analysis.**Top KDnuggets tweets, Aug 1-14: Basic Statistics in Python; Essential Command Line Tools for Data Scientists**- Aug 15, 2018.

Basic Statistics in Python: Descriptive Statistics; Top 12 Essential Command Line Tools for Data Scientists; WTF is a Tensor?!?; How GOAT Taught a Machine to Love Sneakers;**KDnuggets™ News 18:n30, Aug 8: Iconic Data Visualisation; Data Scientist Interviews Demystified; Simple Statistics in Python**- Aug 8, 2018.

Also: Selecting the Best Machine Learning Algorithm for Your Regression Problem; From Data to Viz: how to select the right chart for your data; Only Numpy: Implementing GANs and Adam Optimizer using Numpy; Programming Best Practices for Data Science**Basic Statistics in Python: Descriptive Statistics**- Aug 1, 2018.

This article covers defining statistics, descriptive statistics, measures of central tendency, and measures of spread. This article assumes no prior knowledge of statistics, but does require at least a general knowledge of Python.**What is Normal?**- Jul 31, 2018.

I saw an article recently that referred to the normal curve as the data scientist's best friend. We examine myths around the normal curve, including - is most data normally distributed?**Causation in a Nutshell**- Jul 20, 2018.

Every move we make, every breath we take, and every heartbeat is an effect that is caused. Even apparent randomness may just be something we cannot explain.**Explaining the 68-95-99.7 rule for a Normal Distribution**- Jul 19, 2018.

This post explains how those numbers were derived in the hope that they can be more interpretable for your future endeavors.**Why Data Scientists Love Gaussian**- Jun 26, 2018.

The Gaussian distribution model, often identified with its iconic bell-shaped curve and also referred to as the Normal distribution, is so popular mainly for three reasons.**Every time someone runs a correlation coefficient on two time series, an angel loses their wings**- Jun 18, 2018.

We all know correlation doesn’t equal causality at this point, but when working with time series data, correlation can lead you to come to the wrong conclusion.**Statistics, Causality, and What Claims are Difficult to Swallow: Judea Pearl debates Kevin Gray**- Jun 15, 2018.

While KDnuggets takes no side, we present the informative and respectful back and forth as we believe it has value for our readers. We hope that you agree.**A Better Stats 101**- Jun 12, 2018.

Statistics encourages us to think systemically and recognize that variables normally do not operate in isolation, and that an effect usually has multiple causes. Some call this multivariate thinking. Statistics is particularly useful for uncovering the Why.**The Statistics of Gang Violence**- Jun 6, 2018.

For Carlos Carcach, Professor & Director, Center for Public Policy at the Escuela Superior de Economía y Negocios (ESEN) in Santa Tecla, El Salvador, gangs are an object of intellectual curiosity and the subject of his research.**Football World Cup 2018 Predictions: Germany vs Brazil in the final, and more**- Jun 5, 2018.

Looking ahead to the FIFA World Cup that kicks off this month (14th June), we have created the official KDnuggets predictions.**The Book of Why**- Jun 1, 2018.

Judea Pearl has made noteworthy contributions to artificial intelligence, Bayesian networks, and causal analysis. These achievements notwithstanding, Pearl holds some views many statisticians may find odd or exaggerated.**Frequentists Fight Back**- May 24, 2018.

Frequentist methods are sometimes described as “classical”, though most have only appeared in recent decades and new ones are under development as you read this. Whatever we call it, this branch of statistics is very much alive.**24houranswers: Analytics / Data Science / Math / Statistics Tutors**- May 9, 2018.

Seeking qualified Ph.D. students or faculty members for the position of Tutor/Instructor to provide one-on-one lectures to the needs of our students in Applied Analytics, Computer Science, Applied Math and Statistics, and more.**Skewness vs Kurtosis – The Robust Duo**- May 4, 2018.

Kurtosis and Skewness are very close relatives of the “data normalized statistical moment” family – Kurtosis being the fourth and Skewness the third moment, and yet they are often used to detect very different phenomena in data. At the same time, it is typically recommendable to analyse the outputs of both together to gather more insight and understand the nature of the data better.**Key Algorithms and Statistical Models for Aspiring Data Scientists**- Apr 16, 2018.

This article provides a summary of key algorithms and statistical techniques commonly used in industry, along with a short resource related to these techniques.**Descriptive Statistics: The Mighty Dwarf of Data Science – Crest Factor**- Apr 6, 2018.

No other means of data description is more comprehensive than Descriptive Statistics, and with the ever-increasing volumes of data and the era of low-latency decision-making needs, its relevance will only continue to increase.**Descriptive Statistics: The Mighty Dwarf of Data Science**- Mar 20, 2018.

No other means of data description is more comprehensive than Descriptive Statistics, and with the ever-increasing volumes of data and the era of low-latency decision-making needs, its relevance will only continue to increase.**Madrid Advanced Statistics and Data Mining Summer School**- Mar 19, 2018.

The courses cover topics such as Neural Networks and Deep Learning, Bayesian Networks, Big Data with Apache Spark, Bayesian Inference, Text Mining and Time Series. Each course has theoretical and practical classes, the latter done with R or Python.**Multiscale Methods and Machine Learning**- Mar 19, 2018.

We highlight recent developments in machine learning and Deep Learning related to multiscale methods, which analyze data at a variety of scales to capture a wider range of relevant features. We give a general overview of multiscale methods, examine recent successes, and compare with similar approaches.**A Few Statistics Tips for Marketers**- Mar 6, 2018.

Statistics can help good marketers become better marketers. Here are a few things they should know about stats.**Histogram 202: Tips and Tricks for Better Data Science**- Feb 15, 2018.

We show how to make an ideal histogram, share some tips, and give examples. Let's dive into the world of binning.**Propensity Score Matching in R**- Jan 18, 2018.

Propensity scores are an alternative method to estimate the effect of receiving treatment when random assignment of treatments to subjects is not feasible.**How Not To Lie With Statistics**- Jan 11, 2018.

Darrell Huff's classic How to Lie with Statistics is perhaps more relevant than ever. In this short article, I revisit this theme from some different angles.**Robust Algorithms for Machine Learning**- Dec 11, 2017.

This post mentions some of the advantages of implementing robust, non-parametric methods into our Machine Learning frameworks and models.**5 Tricks When A/B Testing Is Off The Table**- Dec 8, 2017.

Sometimes you cannot do A/B testing, but it does not mean we have to fly blind - there is a range of econometric methods that can illuminate the causal relationships at play.**KDnuggets™ News 17:n45, Nov 29: New Poll: Data Science Methods Used? Deep Learning Specialization: 21 Lessons Learned**- Nov 29, 2017.

Also: The 10 Statistical Techniques Data Scientists Need to Master; Did Spark Really Kill Hadoop? A Framework for Textual Data Science.**You have created your first Linear Regression Model. Have you validated the assumptions?**- Nov 15, 2017.

Linear Regression is an excellent starting point for Machine Learning, but it is a common mistake to focus just on the p-values and R-Squared values while determining validity of model. Here we examine the underlying assumptions of a Linear Regression, which need to be validated before applying the model.**The 10 Statistical Techniques Data Scientists Need to Master**- Nov 15, 2017.

The author presents 10 statistical techniques which a data scientist needs to master. Build up your toolbox of data science tools by having a look at this great overview post.**How Bayesian Networks Are Superior in Understanding Effects of Variables**- Nov 9, 2017.

Bayes Nets have remarkable properties that make them better than many traditional methods in determining variables’ effects. This article explains the principal advantages.**Conjoint Analysis: A Primer**- Nov 1, 2017.

Conjoint is another of those things everyone talks about but many are confused about…**Monty Hall chooses the final exit door**- Oct 7, 2017.

Monty Hall, the game show host, died last week. He was the host of the popular show "Let's Make a Deal", where contestants try to guess which one of 3 doors hides a valuable prize.**Statistical Mistakes Even Scientists Make**- Oct 3, 2017.

Scientists are all experts in statistics, right? Wrong.**30 Essential Data Science, Machine Learning & Deep Learning Cheat Sheets**- Sep 22, 2017.

This collection of data science cheat sheets is not a cheat sheet dump, but a curated list of reference materials spanning a number of disciplines and tools.**How To Lie With Numbers**- Sep 21, 2017.

It takes less effort to lie without numbers, but there are now more numbers and more ways to lie with them than ever before. Poor Reverend Bayes, who understood the true meaning of "evidence".**Vital Statistics You Never Learned… Because They’re Never Taught**- Aug 29, 2017.

Marketing scientist Kevin Gray asks Professor Frank Harrell about some important things we often get wrong about statistics.**Machine Learning vs. Statistics: The Texas Death Match of Data Science**- Aug 23, 2017.

Throughout its history, Machine Learning (ML) has coexisted with Statistics uneasily, like an ex-boyfriend accidentally seated with the groom’s family at a wedding reception: both uncertain where to lead the conversation, but painfully aware of the potential for awkwardness.**Data Science Primer: Basic Concepts for Beginners**- Aug 11, 2017.

This collection of concise introductory data science tutorials cover topics including the difference between data mining and statistics, supervised vs. unsupervised learning, and the types of patterns we can mine from data.**Analytically Speaking Featuring Pedro Saraiva, July 12**- Jul 7, 2017.

Former academician and now Portugal MP Pedro Saraiva says that Parliaments and societies will improve if more people with a good statistical background become MPs. Learn about the paradoxes and issues in statistics and politics.