John Tukey “Badmandments”
"Badmandments" from the great statistician John Tukey: NEVER plan any analysis before seeing data; DON'T consult with a statistician until after collecting data; LARGE enough samples always tell the truth.
By Gregory Piatetsky, Nov 3, 2013.
This morning I saw a tweet from @EdwardTufte mentioning John W. Tukey's "badmandments", and wanted to share them with KDnuggets readers.
John W. Tukey (1915-2000) was a great American mathematician and statistician, best known for co-developing the FFT algorithm, inventing the box plot, and coining the term "bit".
He was also an eloquent writer, and he came up with "BADMANDMENTS" (for non-English speakers: it is a pun on, and the opposite of, the word COMMANDMENT).
Follow these "Badmandments" rigorously, and wrong scientific results will be assured.
These badmandments are part of The Collected Works of John W. Tukey: Philosophy and Principles, Volume 3, starting on page 197.
Many of these "badmandments" are just as relevant to today's data scientists. Here are some for your enjoyment, and for not following:
The Great Badmandment restated:
ONLY THREE ACTIONS IN SCIENCE ARE SAFE: TO BE GUIDED BY THEORY, any theory; TO BE SIMPLE, and to do NOTHING, absolutely nothing
1. THERE IS NO ANALYSIS LIKE UNTO CROSS-TABULATION
2. BE EXACTLY WRONG, RATHER THAN APPROXIMATELY RIGHT
3. THE ONE AND ONLY PROPER USE OF STATISTICS IS FOR SANCTIFICATION (we used statistics, our work above criticism!)
4. BEWARE EMPIRICISM, IT ISN'T SCIENTIFIC
5. AT ALL COSTS BE RIGID AND SERIOUS; FOLLOW THE STRAIGHT AND NARROW WAY TO ITS INEVITABLE END
- 91. NEVER plan any analysis before seeing the DATA.
- 92. DON'T consult with a statistician until after collecting your data.
- 94. LARGE enough samples always tell the truth (I wonder what Tukey would have thought of Big Data?)
- 96. NEVER try to find out if your population is meaningfully divided into two or more subpopulations.
- 97. ANY one regression (model) will tell you what you want to know, don't even think of looking at MORE.
- 100. The significance level tells you the probability that your result is WRONG.
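Badmandment 100 is worth a quick back-of-the-envelope check. As a minimal sketch, assume (these numbers are purely illustrative, not Tukey's) that 10% of the hypotheses being tested describe real effects, that the tests have 80% power, and that the significance level is 0.05. The hypothetical helper `prob_null_given_significant` below applies Bayes' rule to estimate what share of the significant results are actually false positives:

```python
def prob_null_given_significant(prior_true, power, alpha):
    """Share of significant results that are false positives.

    prior_true: assumed fraction of tested hypotheses with a real effect
    power:      assumed probability of detecting a real effect
    alpha:      significance level (false-alarm rate when there is no effect)
    """
    false_pos = alpha * (1.0 - prior_true)  # no effect, yet significant
    true_pos = power * prior_true           # real effect, detected
    return false_pos / (false_pos + true_pos)


# Illustrative inputs only; change them to match your own field.
share = prob_null_given_significant(prior_true=0.1, power=0.8, alpha=0.05)
print(f"{share:.2f}")  # prints 0.36
```

Under these assumed inputs, about 36% of the significant results are wrong, not 5%. The significance level is the chance of a false alarm when there is no effect; it is not the chance that a given significant result is wrong.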
You can see more on Google Books: john tukey badmandments.