Ethics of Big Data: avoid creepiness

A new book, Ethics of Big Data, by a consultant and a doctor of philosophy, argues that as personal data becomes increasingly public, creators of big data will increasingly face ethical decision points.

... Facebook has a Data Science Team that is mining its stunningly large data trove to find societal insights, and insights that help its bottom line too. The Data Science Team, according to a recent article by Technology Review, has been able to determine a person's relationship status based on the type of songs he or she likes (breakups tend to elicit more ballad Likes), or the mood of, say, an entire country (Chile's gross national happiness flagged during a 2010 earthquake).

... In June the Wall Street Journal reported that Orbitz utilized data analytics to determine that Mac users will pay higher prices for hotel rooms than will their PC-using counterparts. ... Orbitz is not alone: organizations as diverse as Google News (in the way it filters the news) and landlords (who can set rents by using revenue management software) have used analytics to manipulate customer segments, be that the news customers read or the lease agreements they sign.

Amidst all this data mining there is something else to consider: the creepiness factor.

According to the forthcoming book, Ethics of Big Data (O'Reilly Media, 2012), by former Capgemini principal consultant Kord Davis and doctor of philosophy Doug Patterson, as personal data becomes increasingly public, creators of big data will increasingly face ethical decision points.

"A lot of times technologists realize there is something particularly interesting that can be done with a new bit of correlating data - provide this new feature or service to customers. And product people get very excited about it; it has potential economic benefits," says Davis in a webcast called "An Introduction to Ethics of Big Data." "But then someone in the back of the room says, 'yes, but that's kind of creepy, right? Are you sure we should do that?'"

In this scenario individuals revert to their own moral code, according to Davis, since there is no common vocabulary or framework for the ethical use of big data.

Davis and Patterson's four-question framework offers an important first step toward a common ground for discussing related issues:

Identity: Is offline existence identical to online existence?
Privacy: Who should control access to data? Davis points out that three data points can identify 87 percent of Americans: gender, birth date and zip code.
Ownership: Who owns data, can we transfer the rights of it, and what are the obligations of people who generate and use that data?
Reputation: What is important about reputation, says Davis, is the realization that the number of digital conversations and interactions that take place, and that we can participate in, fragments our ability to manage reputation.
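
The privacy point above, that gender, birth date and zip code together can single out most people, can be checked on any dataset by counting how many records have a combination of those fields that nobody else shares. What follows is only a minimal sketch in Python, not anything from the book or the webcast: the function name unique_fraction, the field names and the four toy records are all made up for illustration.

```python
from collections import Counter


def unique_fraction(records, keys=("gender", "birth_date", "zip_code")):
    """Return the share of records whose combination of `keys` appears exactly once."""
    if not records:
        return 0.0
    # Count how many records share each combination of the chosen fields.
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    # A record is re-identifiable here if its combination is held by it alone.
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Hypothetical toy records, invented for this example only.
people = [
    {"gender": "F", "birth_date": "1980-02-14", "zip_code": "02139"},
    {"gender": "F", "birth_date": "1980-02-14", "zip_code": "02139"},
    {"gender": "M", "birth_date": "1975-07-01", "zip_code": "02139"},
    {"gender": "F", "birth_date": "1991-11-30", "zip_code": "94110"},
]

print(unique_fraction(people))  # 0.5: two of the four records are unique
```

On real data the same count shows how quickly a few ordinary fields narrow a population; passing fewer keys (for example only "gender") should make the fraction drop.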
