GigaOM, By Derrick Harris, May 24, 2012
Academics are upset that companies like Google, Microsoft and Facebook have the keys to the underlying data, but they won't share. The issue may be a small war between universities and giant companies, but it's part of the larger debate over consumer privacy versus the social good. As with many things, though, perhaps the answer lies in a grand compromise.
... It's tough to argue with privacy
The argument for openness is clear and powerful - access to source data is key to validating one's claims, even more so if data really has become so important as to be called the fourth paradigm of science - but the argument that releasing data would potentially violate consumer privacy is difficult to overcome if you're looking at the debate objectively. Google and Facebook, in particular, are locked into decrees with the Federal Trade Commission that have them subject to privacy audits for the next two decades. They're also constantly under the public microscope for privacy violations both real and perceived. Legally and publicly, they can't afford too many more gaffes.
... If you ask a privacy expert today, you'll likely hear that this situation is common thanks to our newfound proficiencies in analytics. It's one reason that rules about protecting users' personally identifiable information aren't as meaningful as they once were. During a recent conversation with NYU Stern School of Business professor Arun Sundararajan about intent-based privacy, he called anonymity a "gray area."
"Targeting has gotten to the point where firms can know who you are without knowing your name," he told me.
A world divided?
If there isn't some sort of compromise, it's conceivable we could see a divided research space with industry on one side and academia on the other, each playing by its own rules. In that scenario, industry wins. Markoff quotes a letter to the journal Nature from HP Labs social media director and industry-research critic Bernardo Huberman that sums up this particular concern:
"If this trend continues," [Huberman] wrote, "we'll see a small group of scientists with access to private data repositories enjoy an unfair amount of attention in the community at the expense of equally talented researchers whose only flaw is the lack of right 'connections' to private data."
ACM SIGKDD Chair and ChoozOn CTO Usama Fayyad told me last year during a discussion at the International Conference on Knowledge Discovery and Data Mining that industry has already taken the lead in big data research for the same reasons Huberman cited as concerns in the social science realm.