- The Difficulty of Graph Anonymisation - Feb 25, 2021.
Lessons from network science and the difficulty of graph anonymization. A data scientist's take on the difficulty of striking a balance between privacy and utility in anonymizing connected data.
- Breaking Privacy in Federated Learning - Aug 26, 2020.
Despite the benefits of federated learning, there are still ways of breaching a user’s privacy, even without sharing private data. In this article, we’ll review some research papers that discuss how federated learning is exposed to this vulnerability.
- How “Anonymous” is Anonymized Data? - Aug 18, 2020.
As the collection of personal data became widespread over the previous century, the question of data anonymization started to arise. The regulations coming into effect around the world have cemented the importance of the matter.
- Data Anonymization – History and Key Ideas - Oct 17, 2019.
While effective anonymization technology remains elusive, understanding the history of this challenge can guide data science practitioners to address these important concerns through ethical and responsible use of sensitive information.
- What Data You Analyzed – KDnuggets Poll Results and Trends - Apr 26, 2017.
Image/video data analysis is surging, JSON is replacing XML, anonymized data usage is growing in the US and Europe (but not in Asia), and itemsets and Twitter analysis are declining - some of the highlights of the KDnuggets Poll on data types used.
- NYC Taxi Hackathon – find privacy risks in public taxi datasets - Sep 19, 2016.
The NYC TLC has been a pioneer in sharing big data since 2010, but earlier data releases have been de-anonymized. TLC is considering releasing taxi data again, subject to a new anonymization method. This hackathon is to help test it.
- Yahoo Releases the Largest-ever Machine Learning Dataset for Researchers - Jan 18, 2016.
Are you interested in massive amounts of data for research? Yahoo has just released the largest-ever machine learning dataset to the research community.
- Stanford Webinar: Big Data + Electronic Health Records = Better Healthcare, June 18 - May 15, 2015.
We show how to transform unstructured patient notes into a de-identified, temporally ordered, patient-feature matrix, and examine use cases that use the resulting data to improve learning of practice-based evidence in electronic medical records.
- WCAI Research Opportunity: Understanding, Expanding, and Predicting Customer Engagement - Sep 22, 2014.
A new consumer interaction dataset from an international beauty retailer will provide researchers with an opportunity to investigate the interaction between customers and brands. Register for the Oct 17 webinar, and submit proposals by Nov 3.