Data Science vs Crime: Detecting Pickpocket Suspects from Transit Records

A team of US and Chinese researchers has creatively used the massive data collected by automated fare collection systems to identify thieves in public transit systems. The system was tested in Beijing and identified 93% of known pickpockets.

Editor: A recent KDD-2016 paper by a team of US and Chinese researchers examined how to use the Big Data created by automated fare collectors to identify potential pickpockets. This work has received a lot of attention and was even covered in The Economist. I invited one of the researchers to submit a blog post to KDnuggets, and here it is.

By Hui Xiong, Rutgers.

Massive data collected by automated fare collection (AFC) systems provide opportunities for studying both personal travel behaviors and collective mobility patterns in urban areas. In the paper presented at SIGKDD 2016, we creatively leveraged such data to identify thieves in public transit systems.

Indeed, stopping pickpockets in public transit systems is critical for improving passenger satisfaction and public safety. However, it is challenging to tell thieves from regular passengers in practice. The mobility patterns of pickpocket suspects are typically abnormal, but ordinary passengers also make irregular journeys every day, so an unusual trip alone does not indicate a thief. Using data mining and machine learning methods, we developed a suspect detection system that identifies pickpocket suspects based on their daily transit records.

Figure: Beijing travel time, civilians vs. pickpockets.

Specifically, we face two challenges: the first is to extract mobility features for all passengers, and the second is to accurately detect the rare pickpocket suspects. In the suspect detection system, we first extract a number of features from each passenger's daily activities in the transit system. Then, we take a two-step approach to improve the detection accuracy for the very rare suspects. The two-step approach combines the strengths of unsupervised outlier detection and supervised classification models to identify thieves.
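To make the two-step idea concrete, here is a minimal sketch in Python. It is not the paper's method: the three features, the toy records, the labels, the robust z-score outlier rule, and the nearest-centroid classifier are all assumptions chosen for illustration. The article does not say which features or models the system uses.

```python
import math
import statistics

# Hypothetical per-passenger daily features (not the paper's feature set):
# (trips per day, mean trip minutes, distinct stations visited)
features = [
    (2, 35, 2), (2, 40, 2), (3, 30, 3), (2, 32, 2),   # regular commuters
    (4, 28, 3), (2, 45, 2), (3, 38, 3), (2, 36, 2),
    (8, 42, 8), (7, 50, 7),                           # irregular but innocent
    (10, 6, 9), (9, 8, 8),                            # known pickpockets
    (11, 7, 9),                                       # unlabeled passenger
]
# Labels exist only for a few passengers: 1 = known pickpocket, 0 = cleared.
labels = {8: 0, 9: 0, 10: 1, 11: 1}


def robust_scale(rows):
    """Center each feature on its median and divide by its MAD."""
    columns = list(zip(*rows))
    medians = [statistics.median(col) for col in columns]
    mads = [statistics.median([abs(v - m) for v in col]) or 1.0
            for col, m in zip(columns, medians)]
    return [[(v - m) / d for v, m, d in zip(row, medians, mads)] for row in rows]


def outlier_candidates(scaled, threshold=3.0):
    """Step 1 (unsupervised): keep passengers with an extreme deviation."""
    return [i for i, row in enumerate(scaled)
            if max(abs(v) for v in row) > threshold]


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def classify_candidates(scaled, candidates, labels):
    """Step 2 (supervised): nearest-centroid classifier on labeled candidates."""
    centroids = {
        cls: centroid([scaled[i] for i in candidates if labels.get(i) == cls])
        for cls in (0, 1)
    }
    suspects = []
    for i in candidates:
        if i in labels:
            continue  # already labeled; only score unknown passengers
        nearest = min(centroids,
                      key=lambda cls: math.dist(scaled[i], centroids[cls]))
        if nearest == 1:
            suspects.append(i)
    return suspects


scaled = robust_scale(features)
candidates = outlier_candidates(scaled)
suspects = classify_candidates(scaled, candidates, labels)
print("outlier candidates:", candidates)   # [8, 9, 10, 11, 12]
print("new suspects:", suspects)           # [12]
```

In this toy data, step 1 flags both the irregular-but-innocent travelers and the thieves, which is why outlier detection alone is not enough. Step 2 then uses the few available labels to separate the two groups among the flagged candidates.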

Figure 11: Movement patterns of different types of passengers in Beijing (panels: all passengers, visitors, shoppers, and thieves).
Figures from Catch Me If You Can: Detecting Pickpocket Suspects from Large-Scale Transit Records, KDD-2016.

Experimental results demonstrated the effectiveness of our approach. The system can identify 93% of known pickpockets. On average, at least one of every 14 individuals flagged as suspicious is a known pickpocket. Based on the proposed detection approach, we also developed a prototype system with a user-friendly interface for security personnel. Our system will be piloted in Beijing and subsequently in other Chinese cities.
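The two reported figures can be read as recall and a lower bound on precision. The short calculation below works through that arithmetic. The count of 100 known pickpockets is a hypothetical number used only to make the example concrete; the article does not give the actual counts.

```python
# Figures reported in the text.
recall = 0.93                 # share of known pickpockets identified
suspects_per_known_hit = 14   # at most 14 flagged people per known pickpocket

# Lower bound on precision, measured against known pickpockets only.
# Flagged people who are not on the known list may still be thieves.
precision_floor = 1 / suspects_per_known_hit

# Hypothetical ground-truth size, chosen only to illustrate the arithmetic.
known_pickpockets = 100
caught = round(recall * known_pickpockets)
max_flagged = caught * suspects_per_known_hit

print(f"precision >= {precision_floor:.1%}")
print(f"caught {caught} of {known_pickpockets}; at most {max_flagged} people flagged")
```

Under that assumption, security personnel would review at most about 1,300 flagged passengers to find 93 known pickpockets, rather than searching the full ridership.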

Bio: Professor Hui Xiong is a Vice Chair of the Management Science and Information Systems Department and the Director of the Center for Information Assurance at Rutgers, the State University of New Jersey. His general area of research is data and knowledge engineering, with a focus on developing effective and efficient data analysis techniques for emerging data-intensive applications. He has received numerous awards, such as the ICDM-2011 Best Research Paper Award, an IBM ESA Innovation Award (2008), and ACM Distinguished Scientist (2014). He is one of the leading researchers in the field, with over 200 publications, including 9 (!) papers at the KDD-2016 conference.