KDnuggets Home » News » 2016 » Sep » News, Features » NYC Taxi Hackathon – find privacy risks in public taxi datasets ( 16:n34 )

NYC Taxi Hackathon – find privacy risks in public taxi datasets


The NYC TLC has been a pioneer in sharing big data since 2010, but earlier data releases have been de-anonymized. TLC is considering releasing taxi data again, subject to a new anonymization method. This hackathon is to help test it.



By Jeff Garber, Director of Technology and Innovation, New York City Taxi and Limousine Commission (TLC)

Help TLC provide more public data!

The TLC has been a pioneer in sharing big data since 2010. With over 21,000 medallion taxis and street-hail liveries equipped to capture GPS-enabled trip records, TLC's trip data are valuable research tools for policy makers, scholars, businesses and urban planners.

Earlier data releases included anonymized vehicle and driver identifiers, but in 2014 these identifiers were de-anonymized and published. In response, we stopped releasing them altogether, which limits the data's usefulness.
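
To illustrate why an anonymized identifier can sometimes be reversed, here is a minimal sketch. It assumes identifiers are replaced by an unsalted hash and that the space of valid identifiers is small enough to enumerate. The ID format (digit, letter, digit, digit), the function names, and the choice of SHA-256 are all hypothetical and chosen for illustration; this is not a description of the method TLC used in past releases or of the new method under test.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def pseudonymize(identifier: str) -> str:
    """Replace an identifier with an unsalted hash (assumed weak scheme)."""
    return hashlib.sha256(identifier.encode("ascii")).hexdigest()


def candidate_ids():
    """Enumerate a hypothetical ID format: digit, letter, digit, digit."""
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        yield f"{d1}{letter}{d2}{d3}"


def build_lookup() -> dict:
    """Hash every possible identifier once and map each hash back to its input."""
    return {pseudonymize(candidate): candidate for candidate in candidate_ids()}


# The whole ID space is only 10 * 26 * 10 * 10 = 26,000 values,
# so the full reverse table is built almost instantly.
lookup = build_lookup()

released_value = pseudonymize("5X41")  # what a published record would contain
recovered = lookup[released_value]     # the original identifier, recovered
```

The weakness in this sketch comes from the small, predictable input space rather than from the hash function itself: any deterministic, unkeyed transformation of such identifiers can be reversed the same way.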

TLC is considering releasing this information again, subject to a new anonymization method. This gives us the opportunity to invite civic data experts to help test the method's strength. In addition, we are interested in identifying potential privacy risks associated with geospatial data. To accomplish these goals, the TLC is organizing a month-long online hackathon that challenges our civic hacker community.

For more information on the hackathon, including prizes and how to sign up, please click here.