New York Times, By BENEDICT CAREY, January 10, 2011
They should have seen it coming. In recent weeks, editors at a respected psychology journal have been taking heat from fellow scientists for deciding to accept a research report that claims to show the existence of extrasensory perception.
The report, to be published this year in The Journal of Personality and Social Psychology, is not likely to change many minds. And the scientific critiques of the research methods and data analysis of its author, Daryl J. Bem (and the peer reviewers who urged that his paper be accepted), are not winning over many hearts.
Yet the episode has inflamed one of the longest-running debates in science. For decades, some statisticians have argued that the standard technique used to analyze data in much of social science and medicine overstates many study findings - often by a lot. As a result, these experts say, the literature is littered with positive findings that do not pan out: "effective" therapies that are no better than a placebo; slight biases that do not affect behavior; brain-imaging correlations that are meaningless.
By incorporating statistical techniques that are now widely used in other sciences - genetics, economic modeling, even wildlife monitoring - social scientists can correct for such problems, saving themselves (and, ahem, science reporters) time, effort and embarrassment.
The statistical approach that has dominated the social sciences for almost a century is called significance testing. The idea is straightforward. A finding from any well-designed study - say, a correlation between a personality trait and the risk of depression - is considered "significant" if the probability of seeing a result at least that strong by chance alone, assuming no real effect exists, is less than 5 percent.
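As a minimal sketch of how such a test works - with hypothetical numbers, not data from any study discussed here - consider asking whether a coin is fair after observing 61 heads in 100 flips. The p-value is the probability that a fair coin would produce a count at least that far from 50, and the finding is declared "significant" if that probability falls below the 5 percent cutoff:

```python
# Two-sided significance test for a coin-flip experiment.
# Numbers are illustrative only, chosen to show the 5 percent cutoff in action.
from math import comb

def binomial_p_value(heads: int, flips: int) -> float:
    """Probability, under a fair coin, of a head count at least as far
    from flips/2 as the observed count (a two-sided exact test)."""
    deviation = abs(heads - flips / 2)
    total = 2 ** flips
    return sum(
        comb(flips, k)                 # number of sequences with k heads
        for k in range(flips + 1)
        if abs(k - flips / 2) >= deviation
    ) / total

p = binomial_p_value(61, 100)   # 61 heads in 100 flips (hypothetical)
significant = p < 0.05          # the conventional 5 percent cutoff
```

Here the p-value comes out a bit under 0.04, so the result clears the conventional bar - which is exactly the kind of borderline "positive finding" the critics in this article worry about.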
This arbitrary cutoff makes sense when the effect being studied is a large one - for example, when measuring the so-called Stroop effect.