- 5 Must-Read Data Science Papers (and How to Use Them) - Oct 20, 2020.
Keeping ahead of the latest developments in a field is key to advancing your skills and your career. Five foundational ideas from recent data science papers are highlighted here, with tips on how to leverage these advancements in your work and stay on top of the machine learning game.
- Hypothesis Test for Real Problems - Aug 14, 2020.
Hypothesis tests are significant for evaluating answers to questions concerning samples of data.
- Demystifying Statistical Significance - Jul 17, 2020.
With more professionals from a wide range of less technical fields diving into statistical analysis and data modeling, these experimental techniques can seem daunting. To help with these hurdles, this article clarifies some misconceptions around p-values, hypothesis testing, and statistical significance.
- P-values Explained By Data Scientist - Jul 30, 2019.
This article is designed to give you a full picture, from constructing a hypothesis test to understanding the p-value and using it to guide the decision-making process.
- Comparing Machine Learning Models: Statistical vs. Practical Significance - Jan 18, 2019.
Is model A or B more accurate? Hmm… In this blog post, I’d love to share my recent findings on model comparison.
- Data Scientist Interviews Demystified - Aug 2, 2018.
We look at typical questions in a data science interview, examine the rationale for such questions, and hope to demystify the interview process for recent graduates and aspiring data scientists.
- How to Compare Apples and Oranges ? : Part III - Jul 6, 2016.
In the previous article, we looked at techniques to compare categorical variables with the help of an example. In this article, we shall look at techniques to compare mixed types of variables, i.e., numerical and categorical variables together.
- How to Compare Apples and Oranges, Part 2 – Categorical Variables - Jun 21, 2016.
In the previous article, we looked at some of the ways to compare different numerical variables. In this article, we shall look at techniques to compare categorical variables with the help of an example.
- Do You Need Big Data or Smart Data? Part 1 - Jun 1, 2016.
Analyzing Big Data without paying attention to its characteristics and objective can be detrimental; correct and effective sampling can be the fix. Read on to transform your Big Data into Smart Data.
- The “Thinking” Part of “Thinking Like A Data Scientist” - Apr 26, 2016.
People have a tendency to blindly trust claims from any source that they deem credible, whether or not it conflicts with their own experiences or common sense. Basic stats - common sense = dangerous conclusions viewed as fact.
- Top KDnuggets tweets, Mar 16-21: After 150 Years, ASA Says “NO” to p-values; Free Resources to Learn #MachineLearning - Mar 22, 2016.
After 150 Years, ASA Says "NO" to p-values; Using Deep Q-Network to Learn to Play Flappy Bird; Why we work so hard: The problem is overworked professionals are NOT miserable; Free Resources to Learn #MachineLearning.
- After 150 Years, the ASA Says No to p-values - Mar 15, 2016.
The ASA has recently taken a position against p-values. Read the overview and opinion of a well-respected statistician to gain additional insight.
- When Good Advice Goes Bad - Mar 14, 2016.
Consider these 4 examples of good statistical advice which, when misused, can go bad.
- OpenText Data Driven Digest Aug 21: College Majors, Hacking Glory, Innovation Performance - Aug 25, 2015.
The simple beauty of X-Y coordinates belies the power they hold; indeed, many of the best data visualizations created today rely on, and build upon, the Cartesian plane concept to show complex data sets. Here are three examples.
- Big Idea To Avoid Overfitting: Reusable Holdout to Preserve Validity in Adaptive Data Analysis - Aug 17, 2015.
Big Data makes it all too easy to find spurious "patterns" in data. A new approach helps avoid overfitting by using 2 key ideas: validation should not reveal any information about the holdout data, and a small amount of noise should be added to any validation result.
- Overcoming Overfitting with the reusable holdout: Preserving validity in adaptive data analysis - Aug 12, 2015.
Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. We demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis.
- Top KDnuggets tweets, Jan 26-27: Sample Machine Learning solutions with R on Azure ML Marketplace - Jan 28, 2015.
Sample #MachineLearning solutions with R on #Azure ML Marketplace; xkcd explains P-Values: from Highly Significant to cr*p; Why you should learn R first for #datascience; Useful: 14 Data #Visualization Tools to Tell Better Stories.
- Year in Review: Top KDnuggets tweets in November - Dec 27, 2014.
P-values, the "gold standards" of statistical validity, are not as reliable; Nate Silver on 3 Keys to Great Information Design; Keep this #Python Cheat Sheet handy when learning to code; 8 Steps to Becoming a #DataScientist.
- Top KDnuggets tweets last week: P-values, the “gold standards” of statistical validity, are not as reliable - Nov 10, 2014.
P-values, the "gold standards" of statistical validity, are not as reliable; He Tweeted, She Tweeted: A Study on Romantic Breakups; A population density map of France from phone calls; Demystifying #DataScience.
- Top KDnuggets tweets, Apr 7-8: Beware of P values – they are not reliable; The Data Scientist Toolbox online course - Apr 9, 2014.
Data scientists beware: P values are not reliable; The Data Scientist Toolbox course - online at Coursera; Beyond the Science of Data Science; EU recommends changing copyright law to enable scientific text data mining.