Netflix cancelled the follow-up Netflix Prize due to privacy concerns.
With so much personal information online, do you think that it is still possible for companies like Netflix to anonymize and release large datasets (for research and competitions)? [73 votes total]
| Yes, anonymization is still possible - needs new approaches (41) | 56% |
| No, it is no longer possible to fully anonymize large datasets (18) | 25% |
| Not sure (7) | 10% |
| Don't care about privacy or anonymization (7) | 10% |
David Read, Privacy - To What Extent
I think a little more granularity is needed when considering privacy. I don't think that complete anonymity is important. For me the key concern is what information can be discovered. From the Netflix example, if people's health backgrounds were being uncovered that would be an issue. That they were found on IMDB doesn't seem interesting since the individuals weren't concerned with that aspect of their lives being anonymous - else why post in a non-anonymous fashion on IMDB?
Chris Clifton, Possible, but is it the right answer?
Asking whether anonymization is possible is only part of the question. I would say it is possible (although not necessarily easy), but it is still an open question whether anonymized data retains sufficient utility.
I think it is important to determine what we want from the data, and what other approaches can be used to learn what we want rather than releasing anonymized data.