This humorous clip, Biologist talks to statistician, featured in the newsletter of the American Statistical Association, Amstat News, takes aim at the statistical ignorance of scientists. Statistics Done Wrong (Reinhart) is a pithy book which digs more deeply into confusion about statistics.
Uncertainty: The Soul of Modeling, Probability & Statistics (Briggs) is hard-hitting and takes on both scientists and statisticians. Vital Statistics You Never Learned...Because They're Never Taught is a short interview with highly-regarded statistician Frank Harrell, author of the influential book Regression Modeling Strategies. I would urge any marketing researcher, statistician or data scientist to take a few minutes to read this interview.
Many of the statistics textbooks I have read also give examples of all-too-common statistical malpractice. There is also management science...The Halo Effect (Rosenzweig) is a highly informative and non-technical book that shreds many of the books cited as bibles by business gurus. The author exposes flawed reasoning and rudimentary statistical errors made by the authors of these bibles.
Knowing math or being a skillful programmer by no means guarantees a good understanding of statistics. They are different skill sets. There are also the sad cases of fraud, and a good scientist with a solid grasp of statistics is not certain to behave ethically. Egos, ambition, reputations and funding - including government funding - are sources of dishonesty. How To Lie With Numbers exposes some of the tricks used to deceive us, and was inspired in part by the conduct of scientists.
Like many animal species, humans organize themselves into hierarchies, and deference to authority is part of our make-up. However, scientists are also human, and ignorance and misunderstandings about statistics are not hard to find in the scientific community, nor are shoddy research and questionable ethics. Here are some examples:
- Scientists can be very sloppy with data...misreading it, mislabeling it, failing to clean it properly, merging it incorrectly, failing to archive it...it's a pretty long list.
- Poor understanding of probability theory and over-reliance on a handful of probability distributions, such as the normal.
- Ignorance of sampling theory and sampling methods. Generalizing from a small self-selected sample to a large heterogeneous population is one example. Misunderstandings about weighting data are also common. Treating data as if they were a simple random sample when a complex sampling method had been used is a third example.
- Poor grasp of inferential statistics, e.g., confusing statistical significance with practical significance. Another example is conducting significance testing on population data. For instance, if we have 50 years of quarterly GDP data for Country A, these 200 data points are the population data for that time period and country, not a sample from a population. Conducting a t-test to see whether a linear trend is statistically different from zero, for example, would be meaningless in this case.
- Capitalizing on chance, for example, by hunting for significant differences (p-hacking) and not adjusting for the number of significance tests that have been conducted.
- Achieving statistical significance is typically necessary for papers to be accepted by academic journals, and a serious consequence of this is publication bias. Introduction to Meta-Analysis (Borenstein et al.) and Methods of Meta-Analysis (Schmidt and Hunter) were eye-openers for me.
- Drawing dramatic conclusions from a single study that has not been replicated or even cross-validated. Small-scale qualitative studies in which statistics cannot be used at all still grab headlines.
- Superficial understanding of Bayesian statistics, nonparametric statistics, psychometrics, and latent variable models.
- Poor understanding of methods for analyzing time-series and longitudinal data, as well as spatial statistics and multi-level and mixed models.
- Overlooking tools such as Support Vector Machines and Artificial Neural Networks, and concepts such as boosting and bagging, which are prevalent in data mining and predictive analytics and have untapped potential in many fields of science.
- Not paying enough attention to the right-hand side of Generalized Linear Model equations, for example, by neglecting interaction terms.
- Imposing linearity on a model when quantile regression, regression splines, generalized additive models or other methods would be more sensible for a particular set of data.
- Not understanding (or conveniently ignoring) important statistical assumptions. Regression analysis has been so universally abused that I feel it deserves a public holiday.
- Ignoring measurement error. Statistical significance tests assume no measurement error and, what's more worrying, measurement error can throw off our interpretations of statistical models.
- Neglecting regression to the mean...a very old and very dangerous mistake!
- Categorizing continuous variables, which is done out of various motivations. One is a misguided attempt to meet statistical assumptions. Another is that it is a way to cook results. For example, some "effects" are really proxies for age or are heavily moderated by age. Continuous age is sometimes intentionally grouped into wide age ranges so that its effect is diminished. In this way, an unscrupulous researcher can say the effect they're attempting to establish was "significant even after controlling for age."
- HARKing (Hypothesizing After the Results are Known). Translation: Coming up with hypotheses after having had a long peek at the data. This appears so widespread that it falls into the "Everyone does it, so it's OK" category.
- Cherry-picking data, or subsets of data, that support a hypothesis does not appear to be rare, nor does modeling data until it "confesses" that a hypothesis is right. "Adjusting" data until it supports a hypothesis also is not unheard of. There is now more data and more ways to cook it than ever.
- Confusing cause with effect, and misunderstanding causal mechanisms in general, are not unusual. See this interview with Harvard professor Tyler VanderWeele for a snapshot of causal analysis, which I feel is the next frontier in analytics.
- Hiding behind peer review. There are thousands of "academic" publications, but few journal reviewers are statisticians. Peer review is sometimes really chum review in disguise.
- Treating simulated data as if they were actual data, and interpreting computer simulations as if they were experiments that used real data.
- Calling a rough estimate based more on assumptions than data a "finding." This is so common in the academic literature that we seldom notice it. Stochastic models are sometimes misinterpreted as deterministic models, as well.
- Trying to squeeze blood from a stone...sparse data are sparse, and the less informative the data, the more the researcher must "fill in the blanks." There are many (often complex) ways to work with sparse data, but all increase the amount of subjectivity that enters the modeling process. This, in turn, provides additional leeway for unethical scientists.
- Questionable use of meta-analysis and propensity score analysis. These are long topics!
- Providing little information regarding overall model adequacy. For instance, we may be informed that variables X2, X5 and X8 were “highly statistically significant,” in accordance with the author’s hypothesis. However, we notice that the sample size was large and the effect sizes of these variables were small. What was the model's adjusted R-squared, for example? What about model assumptions?
- Not keeping up with recent developments in statistics and not interacting with professional statisticians. This is a root cause of many of the problems listed above.
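To make one item on this list concrete, here is a minimal sketch of capitalizing on chance. It simulates studies in which no true effect exists, runs many significance tests per study, and counts how often at least one test comes out "significant." The numbers (20 tests per study, 30 observations per group, 1,000 simulated studies) and the function names are my own illustrative assumptions, and a simple z-test with known variance stands in for whatever test a researcher might actually use.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def null_p_value(rng, n=30):
    """p-value from a z-test comparing two groups with NO true difference.

    Both groups are drawn from N(0, 1) and the variance is treated as known,
    so any "significant" result here is a false alarm by construction.
    """
    mean_a = sum(rng.gauss(0, 1) for _ in range(n)) / n
    mean_b = sum(rng.gauss(0, 1) for _ in range(n)) / n
    z = (mean_a - mean_b) / math.sqrt(2 / n)
    return two_sided_p(z)


def false_alarm_rate(n_tests, alpha, n_studies=1000, seed=1):
    """Share of simulated studies with at least one p-value below alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        if any(null_p_value(rng) < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_studies


n_tests, alpha = 20, 0.05  # illustrative values, not from any real study

uncorrected = false_alarm_rate(n_tests, alpha)
bonferroni = false_alarm_rate(n_tests, alpha / n_tests)  # Bonferroni adjustment
expected = 1 - (1 - alpha) ** n_tests  # theoretical rate for independent tests

print(f"At least one false positive, unadjusted: {uncorrected:.2f}")
print(f"Theoretical rate for {n_tests} independent tests: {expected:.2f}")
print(f"At least one false positive, Bonferroni: {bonferroni:.2f}")
```

With 20 independent tests at the 0.05 level, the theoretical chance of at least one false positive is 1 - 0.95^20, or roughly 64%, and the simulated unadjusted rate should land close to that. Bonferroni is only one of several adjustments and is used here because it is the simplest to show.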
Setting ethical issues aside, how can sloppy research happen in science? Though this may surprise you, most disciplines require only minimal coursework in research methods and statistics, even at the doctoral level, because of the sheer number of required courses on the subject matter itself. Given that at least some of this subject matter rests on shaky statistical ground, I find this ironic.
Another cause for concern is that courses on statistics are frequently taught by non-statisticians, and one consequence is that idiosyncratic varieties of bad habits are passed on from generation to generation within some disciplines. Even in the Internet Age, there is considerable inbreeding in many disciplines, as noted earlier. Academic statisticians are often unaware of these errors or not in a position to respond when they are aware. This is not a purist's complaint - in some fields, human life is at stake.
However, to be fair, statistics is a difficult subject to teach and is often taught badly. Coursework tends to be heavily concentrated on theory and mechanics, and a great deal of material must be covered in a short space of time. Students often are not shown how to apply statistics to real-world problems and some flounder once they graduate.
All this said, it's important to recognize that good science is tough, and that designing and conducting sound research is challenging. Your religious beliefs are your personal business, but I would urge you not to worship scientists...or statisticians. Humans are human.
Original. Reposted with permission.