KDnuggets Home » News » 2011 » Mar » Publications » Link Prediction by De-anonymization: Winning Social Network Challenge  ( < Prev | 11:n07 | Next > )

Link Prediction by De-anonymization: Winning Social Network Challenge


 
  
De-anonymization can be used to game machine-learning contests by simply "looking up" the attributes of de-anonymized users instead of predicting them.


Link Prediction by De-anonymization: How We Won the Kaggle Social Network Challenge

33bits.org, by Arvind Narayanan, Mar 9, 2011

The title of this post is also the title of a new paper of mine with Elaine Shi and Ben Rubinstein.

A brief de-anonymization history. As early as the first version of my Netflix de-anonymization paper with Vitaly Shmatikov back in 2006, a colleague suggested that de-anonymization can in fact be used to game machine-learning contests by simply "looking up" the attributes of de-anonymized users instead of predicting them. We off-handedly threw a paragraph into our paper discussing this possibility, and a New Scientist writer seized on it as an angle for her article.

...
The Kaggle contest. Kaggle is a platform for machine learning competitions. They ran the IJCNN social network challenge to promote research on link prediction. The contest dataset was created by crawling an online social network (later revealed to be Flickr) and partitioning the obtained edge set into a large training set and a smaller test set of edges augmented with an equal number of fake edges. The challenge was to predict which edges were real and which were fake. Node identities in the released data were obfuscated.
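To make the dataset construction concrete, here is a minimal Python sketch of that kind of split. It is an illustration only, not the organizers' actual procedure: the function name `make_link_prediction_split`, the `test_fraction` parameter, the treatment of edges as directed pairs, and the uniform sampling of fake edges are all assumptions introduced here.

```python
import random

def make_link_prediction_split(edges, nodes, test_fraction=0.1, seed=0):
    """Split edges into a training set and a labeled test set.

    The test set holds real held-out edges (label 1) plus an equal
    number of fake edges (label 0) that do not occur in the graph.
    Assumption: edges are directed (u, v) pairs and fake edges are
    sampled uniformly at random; the real contest may have differed.
    """
    rng = random.Random(seed)
    edges = list(edges)
    rng.shuffle(edges)

    n_test = int(len(edges) * test_fraction)
    real_test = edges[:n_test]   # held-out real edges
    train = edges[n_test:]       # everything else is released for training

    existing = set(edges)
    nodes = list(nodes)
    fake_test = set()
    # Rejection-sample node pairs that are not real edges.
    # Note: this loop assumes the graph is sparse enough to find them.
    while len(fake_test) < n_test:
        u, v = rng.choice(nodes), rng.choice(nodes)
        if u != v and (u, v) not in existing:
            fake_test.add((u, v))

    test = [(e, 1) for e in real_test] + [(e, 0) for e in sorted(fake_test)]
    rng.shuffle(test)  # hide the real/fake ordering
    return train, test
```

A contestant would see `train` and the test edges without their labels. Obfuscating node identities, as the contest did, would amount to relabeling every node ID before release.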

There are many, many anonymized databases out there; I come across a new one every other week. I pick a de-anonymization project if it will advance the art significantly (yes, de-anonymization is still partly an art), or if it is fun. The Kaggle contest was a bit of both, and so when my collaborators invited me to join them, it was too juicy to pass up.

The Kaggle contest is actually much better suited to gaming through de-anonymization than the Netflix Prize would have been.
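The "looking up" idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method from the paper: it supposes that some obfuscated node IDs have already been matched to real accounts (the hypothetical `anon_to_real` mapping) and that a separate crawl of the real network is available as a set of directed edges (the hypothetical `crawled_edges`). The function name and the `fallback` score are also made up for this example.

```python
def score_test_edges(test_edges, anon_to_real, crawled_edges, fallback=0.5):
    """Score test edges by lookup where both endpoints are de-anonymized.

    anon_to_real:  dict mapping obfuscated node IDs to real identities
                   (hypothetical; assumed to come from a prior matching step).
    crawled_edges: set of (real_u, real_v) pairs from an independent crawl
                   (hypothetical).
    fallback:      score used when either endpoint is still anonymous; in
                   practice a learned predictor could fill this gap.
    """
    scores = []
    for u, v in test_edges:
        real_u = anon_to_real.get(u)
        real_v = anon_to_real.get(v)
        if real_u is None or real_v is None:
            scores.append(fallback)  # cannot look up: fall back to a guess
        else:
            # Both endpoints known: read the answer off the crawled graph.
            scores.append(1.0 if (real_u, real_v) in crawled_edges else 0.0)
    return scores
```

For example, with `anon_to_real = {1: "alice", 2: "bob", 3: "carol"}` and `crawled_edges = {("alice", "bob")}`, the test edge `(1, 2)` is scored as real, `(2, 3)` as fake, and `(3, 4)` gets the fallback score because node 4 was never matched.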

Read more.

