arXiv.org and the 24-Hour Research Cycle
ArXiv.org gives researchers the ability to instantly publish research, free of peer review and the publication cycle. This capability offers both advantages and pitfalls. We should warily eye the 24-7 news cycle as a cautionary tale for how this could go wrong.
Historically, academic articles have been published in peer-reviewed journals. Albeit imperfect, peer review benefits the authors, the readers, and eventually the public at large. Authors benefit from critical feedback from a knowledgeable cadre of world experts. Reviewers catch errors in writing and analysis, providing valuable feedback and guarding against future embarrassment. Readers of the primary literature benefit from higher quality papers. And the public benefits indirectly because a healthy peer-review process provides some, albeit insufficient, restrictions on what the mainstream press can represent as scientific knowledge.1
Unfortunately, scientific journals are slow. Long cycles of review and rebuttal can wedge over a year between any paper's submission and its publication. As a result, fast-moving fields like computer science have gravitated towards peer-reviewed conference proceedings as the preferred mode of publication. Conferences tend to offer faster response times and binary outcomes (accept/reject) rather than lengthy revision cycles. However, in a fast-moving field like machine learning, even the months it may take to publish a work in a conference, assuming the paper is accepted, represent far too long a wait. Over the course of months or a year, other researchers may independently invent or steal an idea. Competing work may diminish an idea's relevance. A less paranoid justification for more expedient publication is that in some fields ideas may be immediately useful to other researchers. Thus any delay can confer negative consequences on the research community.
As a result, the computer science community increasingly embraces arXiv.org, an internet-based pre-print repository developed in 1991 for sharing physics papers. Increasingly mainstream, arXiv.org is used heavily by mathematicians, physicists, computer scientists, astronomers, and statisticians, among others, who post papers in LaTeX or PDF formats in advance of publication. The site is immensely useful, especially for facilitating collaboration across large research communities in fast-developing fields. I personally have relied on it, both for publishing and accessing papers. In addition to facilitating rapid, long-range collaboration, arXiv provides the considerable benefit of establishing the precedence of ideas. Another considerable benefit is that arXiv is free to all, opening state-of-the-art scientific research to the world beyond the pay-walls of the academic press.
But these benefits come with considerable pitfalls. Absent peer review, and rushing to establish precedence, authors publish work that may be incomplete, incorrect, and unfiltered by the critical eye of a reviewer. Audiences receive papers sooner, but perhaps with lower expectation of quality. Perhaps most alarming, the popular media, including reputable publications like the MIT Tech Review, apply the same naive "if it's in a paper, it must be true" approach to arXiv papers as they do to peer-reviewed publications.
24-hour News Cycle: A Cautionary Tale
The online and television news media supply stories because they are demanded, because they will be clicked on, not because the content is true or has intellectual depth. Regardless of what journalism schools teach, the economics of online and television programming incentivize quality reporting only insofar as consumers are more likely to click and watch. Even reputable sites pepper content with links from Outbrain directing users to articles about wardrobe malfunctions, worthless financial advice, and miracle drugs. A search for positive or negative news on Bitcoin's value or Tesla's stock price will always return a plethora of articles. This is not because anything worth discussing has taken place but because it is an easy recipe for boosting clicks.
ArXiv.org, for all its usefulness, exposes a tremendous weakness in our ability to digest the scientific literature and translate it for the public. ArXiv.org didn't create this weakness. We've always struggled to differentiate for the public which research is trustworthy and which should be taken with a grain of salt. The capricious views of nutritionists on the healthfulness of red wine are placed alongside verified mathematical theorems in the category of "a study from ____ shows that _____ is true". Still, the move to open-access, non-reviewed outlets requires that we rapidly develop stronger defenses against misinformation.
MIT Tech Review Misuses the University's Imprimatur
On May 29th, an article was posted to the arXiv.org pre-print server claiming to surpass human performance on the IQ test. Two weeks later, the MIT Technology Review amplified the findings in an uncredited blog post. The Tech Review did not bother to attribute the article to an author, and yet the nameless writer confidently assessed the work as "impressive".
Once the imprimatur of MIT was bestowed on the story, it rapidly spread. A Google search for "deep learning IQ" reveals many thousands of articles parroting the MIT Tech Review article. Most articles that show on the first page of search results copy the title, "Deep Learning Machine Beats Humans in IQ Test", verbatim. It seems most sites bothered neither to read the paper nor even to feign an original interpretation.
I did read the source paper. Afterwards, I was convinced that few of those writing about it had. The human baseline for comparison consisted of Amazon Mechanical Turk workers, paid to answer questions as quickly as possible. On some tasks, the authors claim that humans with advanced graduate degrees (levels of education self-reported by Mechanical Turk workers) barely perform better than random. Further, the paper contained a number of sensational claims, including "we could be a further step closer to the true human intelligence". This is in reference to a highly specialized system for identifying synonyms and antonyms, and solving analogies. This sensationalism would likely have been pruned by a functioning peer review process. And it seems likely a competent reviewer would have demanded a stronger validation. On KDnuggets, I published a critical assessment. And while the article does show up second on the aforementioned query, the 24-7 news cycle has moved on, and for the most part, the narrative is fixed. Intelligent computers have eradicated the IQ test, some ads have been clicked, and society moves on to the next screaming headline.
In Pursuit of a More Robust Process
Anyone can publish a document that looks strongly like a paper to the arXiv. I could post a 10-page paper to arXiv tomorrow claiming to prove that P=NP, so long as the LaTeX compiled and no human in the loop spotted the incredibly tall claim. Most likely it would post to arXiv by the end of the week. In many ways this is a beautiful thing. If I were correct in my claim, even if I were devoid of any means, position, or institutional affiliation, my findings would reach the relevant audience. But if the paper were rubbish, I should hope it would not be trumpeted from the mountaintops.
Newspapers do not take the words of politicians at face value. When Mitt Romney says he can simultaneously cut taxes and balance the budget, journalists pounce. And when corporations insist that their data-gathering practices pose no threat to consumer privacy, journalists balk. Yet for some reason, our dialogue on science is badly broken. Papers are trusted implicitly, likely not even read. Abstracts are picked over for the most sensational claims and these are trumpeted uncritically, even by respected publications like the MIT Tech Review. Research papers are treated as religious documents, brought down from the heavens, not critically picked over by knowledgeable writers. Of course, there exists a paucity of qualified science writers. But this is no justification for the current state of affairs. Worse, the eruptions of attention bestowed upon sensational work encourage scientists to publish sensational claims rather than factual ones.
Even in a world where all papers are peer-reviewed, our conversations about science are thin and balanced precariously on trust and fragile assumptions. But in a world of unreviewed, open-access science, we need better filters. ArXiv provides tremendous upside. Researchers can collaborate at a weekly pace, not a yearly one. But it also challenges us to similarly innovate in the way we review content, share it among scientists, and translate it for public consumption.
1 The badly broken relationship between the mainstream press and primary scientific literature deserves a post all to itself. But such a conversation is outside the immediate scope of this post.
Zachary Chase Lipton is a PhD student in the Computer Science Engineering department at the University of California, San Diego. Funded by the Division of Biomedical Informatics, he is interested in both theoretical foundations and applications of machine learning. In addition to his work at UCSD, he has interned at Microsoft Research Labs.
- Deep Learning and the Triumph of Empiricism
- Not So Fast: Questioning Deep Learning IQ Results
- The Myth of Model Interpretability
- (Deep Learning’s Deep Flaws)’s Deep Flaws
- Data Science’s Most Used, Confused, and Abused Jargon
- Differential Privacy: How to make Privacy and Data Mining Compatible