Briefs

Privacy-preserving data mining via randomization

The New York Times (July 18, 2002) reports on work by IBM researchers Ramakrishnan Srikant and Rakesh Agrawal, who created a data-mining program that masks the individual truthful answers consumers might enter once trust is established, yet still extracts useful demographic information. The program automatically randomizes each value (for example, it perturbs the age a consumer enters by a random amount within a specified range of years) and then makes a series of mathematical guesses to recreate an accurate description of the overall distribution; knowledge of how the initial data was randomized partially forms the basis of those guesses. Programs like this one could lead to greater truthfulness in the answers people volunteer on the Web, provided that they are willing to replace some of their native caution with a bit of goodwill toward a company and its need for data mining.

See http://www.nytimes.com/2002/07/18/technology/circuits/18NEXT.html
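The randomize-then-reconstruct idea described in this brief can be illustrated with a small sketch. This is not the IBM researchers' program. It assumes uniform noise, a fixed histogram over an assumed age range of 18 to 60, and an iterative Bayesian (EM-style) update. The function names `randomize` and `reconstruct`, the parameters, and the synthetic ages are all hypothetical and chosen only for illustration.

```python
import random


def randomize(values, spread, rng):
    """Mask each value by adding uniform noise from [-spread, +spread]."""
    return [v + rng.uniform(-spread, spread) for v in values]


def reconstruct(perturbed, spread, lo, hi, bins=14, iterations=50):
    """Estimate the original distribution from perturbed values only.

    Returns (bin_centers, probabilities). Assumes the noise was uniform on
    [-spread, +spread] and that the original values fall in [lo, hi).
    """
    width = (hi - lo) / bins
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    density = 1.0 / (2.0 * spread)
    # likelihood[i][j]: noise density that maps bin j onto observation i
    likelihood = [
        [density if abs(w - c) <= spread else 0.0 for c in centers]
        for w in perturbed
    ]
    estimate = [1.0 / bins] * bins  # start from a flat guess
    for _ in range(iterations):
        updated = [0.0] * bins
        for row in likelihood:
            denom = sum(l * p for l, p in zip(row, estimate))
            if denom == 0.0:
                continue  # no bin can explain this observation; skip it
            # Bayes update: share this observation among the plausible bins
            for j in range(bins):
                updated[j] += row[j] * estimate[j] / denom
        total = sum(updated)
        estimate = [u / total for u in updated]
    return centers, estimate


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic "true" ages (an assumption), clamped to the 18-60 range.
    ages = [min(59.9, max(18.0, rng.gauss(35, 6))) for _ in range(1000)]
    masked = randomize(ages, spread=10, rng=rng)
    centers, probs = reconstruct(masked, spread=10, lo=18, hi=60)
    for c, p in zip(centers, probs):
        print(f"{c:5.1f}  {'#' * round(p * 100)}")
```

In this sketch only `masked` is passed to `reconstruct`, so the estimate is built from the perturbed answers and the known noise range, never from any individual's true age.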
Copyright © 2002 KDnuggets.