

Essays On Statistics Denial

Statistics denial comes in waves as areas of application discover and rediscover the potential of data insights. We examine the statistics denial myths and where they come from.

By Randy Bartlett, Blue Sigma Analytics.

UFO sightings, where do they all come from?  Astronomers spend an inordinate amount of time scouring the skies.  However, as far as we know, no credentialed astronomer has ever reported a UFO.  None, zilch.

Statistics denial myths, where do they all come from?  We have always had them, owing to an environment deeply rooted in deterministic thinking (no room for uncertainty) and low statistics literacy.  Most of us are uncomfortable with uncertainty and lack the training to deal with it.  Also, statistics denial comes in waves as areas of application discover and rediscover the potential of data insights.

The current wave of denying and mischaracterizing the role, breadth, and depth of applied statistics is associated with promotional hype around today's Big Data 'information rush.'  During the gold rush of 1849, it was the merchants who made most of the money ... and none of them knew anything about mining for gold.  Among the most successful was Samuel Brannan, a tireless self-promoter, shopkeeper, and newspaper publisher, who purchased all the prospecting supplies available in San Francisco and resold them at a substantial profit.

Today's information rush is exemplified by the great promise of overflowing observational data, hyper communications, and the approaching Internet of Things.  The promotional hype initially comes from journals, self-glorifying books, and vendors, all with a certain perspective that is not informed by practical experience; publishers are unable to discern qualifications.  This creates misinformation stampedes, with energized statistics deniers writing amplifying blogs, presentation decks, etc., which further mischaracterize and even adulterate statistics.  The downstream echoes talk everyone into believing their own hyped fabrications.  Two of the problems are that (1) selling good statistics practice can be less lucrative than cutting some serious corners; and (2) promoting services, workshops, data-analysis results, etc. is easier when not encumbered by competently wielding and accurately depicting statistics.

To harness the data, we need to follow best practices (Best Statistical Practice) for extracting and leveraging information, as characterized by Deming et al. (see 'Out Of The Crisis').  We provide a complementary and more riveting, problem-based definition of statistics in the May/June 2015 issue of Analytics Magazine.  Even if many of us adhere to best practice, we need to be prepared for a coming flood of statistical malfeasance.  We should expect more follies (Google Flu Trends, fiber and colon cancer, Potti Gate, et al.) and more financial debacles (AIG, Fannie Mae, Moody's, Fitch Ratings, S&P Ratings, et al.).

It is our ambition to disrupt the 'message repetition,' which aims to establish these myths in the grand tradition of mass delusions and hysterias: Y2K, the dot-com new world order of 2000, Black-Scholes (won a Nobel), the housing bubble, witch burning, and UFO sightings.  We want you, gentle reader, to be wary of experts chosen by the loudness of their voices and of stories about complex technology expressed in two sentences.  We trust that people are smart; if they can find the information, they will not let stories about little green men get in their way of making a little green.

The 'information rush' is producing a sense of urgency, a great deal of opportunity, and spectacular breakthroughs coming from everywhere.  Meanwhile, the combination of low statistics literacy and overzealous promotional hype is facilitating dysfunctional data analysis, which is more detrimental than UFO sightings.

Mischaracterizations of statistics are a problem when they adulterate statistics or obstruct best practice (Best Statistical Practice).

We sure could use Deming right now.  Many of us who consume or produce data analysis hang out in the new LinkedIn group: About Data Analysis.

Bio: Randy Bartlett, PhD, CAP®, PSTAT® is a Statistician/Statistical Data Scientist; Author of 'A Practitioner’s Guide To Business Analytics'.

