KDnuggets : News : 2008 : n12 : item25

Publications

From: Vincent Granville
Date: 22 Jun 2008
Subject: Data Mining Techniques Unearth Major Click Fraud Botnet

Using design of experiments techniques and statistical data mining, I identified a botnet generating more than 10 million dollars in click fraud revenue annually, and I worked out the mechanism it used to generate that fraud. I also found that, with very little additional effort and intelligence, its operators could potentially have generated 100 million dollars in fraud revenue while being considerably more difficult to detect.

http://www.analyticbridge.com/group/successstories

The botnet, a low-frequency botnet hitting advertisers no more than twice per day to avoid detection, was initially discovered in a small dataset used for testing purposes. That dataset was part of a designed experiment.

Outlier detection techniques applied to millions of multivariate and compound metrics found that the click-to-conversion ratio was consistently outside a very conservative confidence interval. This triggered an investigation that uncovered many other abnormalities:

  • very low variance
  • very short visits
  • inability to generate fake user agents due to the technology used by the botnet
  • targeting 50% of all advertisers, working with keyword lists
  • relatively good IP distribution, with some government IPs over-represented
  • poor user agent distribution
  • triggering CSS, JS, GIF and JPEG HTTP requests (behaving a little bit like a real human)
  • generating a very large volume of bogus conversions
  • erroneously identified by Alexa and other web analytics companies as real users
  • associated with the largest search engine
  • discovered before Google engineers found it (I am not sure if they ever found it)
  • clicks generated by the botnet were charged as good clicks

After understanding the mechanisms at play, I was able to identify additional botnets associated with other search engines.
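
To make the outlier check on the click-to-conversion ratio more concrete, here is a minimal Python sketch. It is not the method actually used in this investigation, which is not described in detail here. It assumes a known baseline conversion rate, a normal approximation to the binomial, and a deliberately wide threshold (4 standard errors) standing in for the "very conservative confidence interval". The function names, segment names, traffic figures and threshold are all hypothetical.

```python
import math


def conversion_z_score(clicks, conversions, baseline_rate):
    """Standardized distance between a segment's conversion rate and a baseline.

    Uses the normal approximation to the binomial distribution, which is an
    assumption of this sketch and only reasonable for large click counts.
    """
    if clicks <= 0:
        raise ValueError("clicks must be positive")
    if not 0.0 < baseline_rate < 1.0:
        raise ValueError("baseline_rate must be strictly between 0 and 1")
    observed_rate = conversions / clicks
    std_err = math.sqrt(baseline_rate * (1.0 - baseline_rate) / clicks)
    return (observed_rate - baseline_rate) / std_err


def flag_outliers(segments, baseline_rate, z_threshold=4.0):
    """Return (name, z) for segments whose rate falls outside the interval.

    z_threshold=4.0 is a hypothetical, deliberately conservative cutoff.
    """
    flagged = []
    for name, clicks, conversions in segments:
        z = conversion_z_score(clicks, conversions, baseline_rate)
        if abs(z) > z_threshold:
            flagged.append((name, z))
    return flagged


# Made-up traffic segments: (name, clicks, conversions).
segments = [
    ("segment_a", 10000, 205),
    ("segment_b", 10000, 190),
    ("segment_c", 10000, 900),  # implausibly many conversions
]

for name, z in flag_outliers(segments, baseline_rate=0.02):
    print(f"{name}: z = {z:.1f}")
```

A single flag of this kind proves little. The signal reported above was that the ratio stayed outside the interval consistently, so in practice the same check would be repeated across days and across many metrics before starting an investigation.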

Vincent Granville, Ph.D.
Founder and Principal
www.datashaping.com
www.analyticbridge.com


Copyright © 2008 KDnuggets.