The Silicon Jungle is a fictionalized account of data mining and machine learning in today's largest internet companies. It is written for a general audience and is easily accessible to non-specialists. Though it provides a gentle introduction to data mining, its lessons concern data safety, scientific responsibility, and how data can be used constructively or, when handled carelessly, misused.
If you, or faculty in your department, are interested in using this book for undergraduate courses (typical courses use novels by Sinclair, Orwell, Huxley, etc.) or simply as a "what-not-to-do-with-user-data" manual, please contact me; course materials will follow based on interest.
Shumeet Baluja, shumeet at google.com
P.S. Because of my current and previous affiliations, I must emphasize that though the lessons are real, the events are not.
The Silicon Jungle: A Novel of Deception, Power, and Internet Intrigue
Princeton University Press (April 2011)
What happens when a naive intern is granted unfettered access to people's most private thoughts and actions? Young Stephen Thorpe lands a coveted internship at Ubatoo, an Internet empire that provides its users with popular online services, from a search engine and shopping to e-mail and social networking. When Stephen's boss asks him to work on a project with the American Coalition for Civil Liberties, Stephen innocently obliges, believing he is mining Ubatoo's vast databases to protect the ever-growing number of people unfairly targeted in the name of national security.
But nothing is as it seems. Suspicious individuals--do-gooders, voyeurs, government agents, and radicals--surface, doing all they can to access the mass of desires and vulnerabilities gleaned from scouring Ubatoo's wealth of intimate information. Entry into Ubatoo's vaults of personal data need not require technical wizardry--simply knowing how to manipulate a well-intentioned intern may be enough.
Set in today's cutting-edge data mining industry, The Silicon Jungle is a cautionary tale of data mining's promise and peril, and how others can use our online activities for political and personal gain just as easily as for marketing and humanitarian purposes. A timely thriller, The Silicon Jungle raises serious ethical questions about today's technological innovations and how our most confidential activities and minute details can be routinely pieced together into rich profiles that reveal our habits, goals, and secret desires--all ready to be exploited in ways beyond our wildest imaginations.
- "A cerebral, cautionary tale. Credible and scary." Vint Cerf, Google Vice President and Chief Internet Evangelist and one of the "Fathers of the Internet"
- "Clever and prophetic. The Silicon Jungle will be required reading from Silicon Valley to Washington, DC." Marc Rotenberg, Electronic Privacy Information Center
- "This novel will open your eyes to issues of privacy on the internet and to the hazards of placing uncritical, blind trust in the people overseeing this vast enterprise. Baluja tells a story about something that could happen to any of us - if you're even modestly concerned about information privacy, this is an important book to read." Roy Maxion, Carnegie Mellon University
Shumeet Baluja is a senior staff research scientist at Google, focusing on data mining, statistical machine learning, and computer vision. He was formerly the chief technology officer of Jamdat Mobile and chief scientist at Lycos. He holds a PhD in computer science from Carnegie Mellon University and has served as an adjunct faculty member in both the Computer Science Department and the Robotics Institute at CMU.