

What Does GDPR Mean for Machine Learning?


This post investigates how the GDPR, which comes into force at the end of May, will affect machine learning.



By Nathan Sykes.


When the General Data Protection Regulation, or GDPR, takes effect May 25, it will bring about sweeping changes to the regulations surrounding how major organizations collect data. The GDPR is the largest-scale modification to data privacy regulations to take place within the past two decades, and it exists to protect the rights of individual consumers.

The most pertinent question is how the GDPR will affect machine learning and modern artificial intelligence, both of which require constant access to large stores of data. Will it change the way enterprises use machine learning?

Unfortunately, there seems to be a lot of confusion among lawyers, scholars, analysts and regulators, and it’s warranted. The text of the GDPR is vague on this point. According to some interpretations of the regulation, individuals have a “right to explanation” regarding machine learning models and algorithms. In short, if and when they are affected by an automated decision, they have a right to know how and why the model made that decision or carried out a specific action.

This concept is extremely controversial. Many argue against everyone having this right, while others think it’s blatantly obvious and necessary.

Neural Networks, AI and Machine Learning Platforms Are Too Complex

Part of the reason for the conflict has to do with the complexity of machine learning platforms. A model does not simply choose one action and spit it out: many factors go into a single outcome, and there are many possible outcomes to choose from. Think of it like the branches of a tree. Each possible outcome has its own set of branching options and factors to consider.

Will Knight from MIT Technology Review explains it best:

“[The reasoning of neural networks and machine learning platforms is] embedded in the behavior of thousands of simulated neurons, arranged into dozens or even hundreds of intricately interconnected layers.” “[Every time the system is accessed], outputs are fed, in a complex web, to the neurons in the next layer, and so on, until an overall output is produced.”

Essentially, the system is designed to move from layer to layer until it reaches the appropriate output or outcome. And all this information would be difficult to explain and share with the average consumer, just to show why a system made a certain decision.
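To make the layer-to-layer flow concrete, here is a minimal sketch of a forward pass in plain Python. The toy network, its weights and the names `dense_layer`, `forward` and `TOY_LAYERS` are all made up for illustration; real systems have thousands of neurons and learned weights.

```python
import math

def dense_layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then a sigmoid."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs

def forward(inputs, layers):
    """Feed each layer's outputs into the next layer and return the final output."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical toy network: 2 inputs -> 3 hidden neurons -> 1 output.
TOY_LAYERS = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]], [0.0, 0.1, -0.1]),
    ([[1.2, -0.4, 0.6]], [0.05]),
]

score = forward([1.0, 2.0], TOY_LAYERS)
print(score)  # a single value between 0 and 1
```

Even in this tiny example, the final number depends on every weight in every layer, which hints at why a plain-language account of a much larger network is hard to produce.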

Of course, this puts the responsibility upon the shoulders of developers, system administrators, data analysts and network engineers. They’ll need to ensure the systems in place not only comply with the GDPR, but also can provide the relevant information to all affected parties.
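For simpler models, the relevant information can be easier to produce. The sketch below assumes a hypothetical linear scoring model (the weights, feature names and `explain_decision` function are invented for illustration) and breaks one decision into per-feature contributions. This decomposition does not carry over directly to deep neural networks, which is exactly the difficulty described above.

```python
def explain_decision(weights, bias, features, threshold=0.0):
    """Break a linear score into per-feature contributions, largest effect first."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    score = bias + sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return {"approved": score >= threshold, "score": score, "reasons": ranked}

# Hypothetical credit-style model and applicant, for illustration only.
WEIGHTS = {"income_k": 0.04, "missed_payments": -0.9, "years_at_address": 0.1}
applicant = {"income_k": 40, "missed_payments": 3, "years_at_address": 2}

report = explain_decision(WEIGHTS, bias=-0.5, features=applicant)
for name, contribution in report["reasons"]:
    print(f"{name}: {contribution:+.2f}")
print("approved:", report["approved"])
```

A ranked list like this is one possible shape for an explanation; whether it would satisfy the regulation is a legal question the sketch does not answer.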

How to Prepare?

Many companies are completely unprepared for the upcoming regulation. Ovum Research found 52 percent of business executives believe non-compliance with the GDPR will result in fines for their company.

On a more basic level, there are several things you’ll need to do with all data your company collects. For starters, you need a clear understanding of data governance, along with a series of authentication and administrative protocols. You and your teams must understand what data is critical to business operations, who has access to it, how and why it changes and where it’s going.

You must also establish policies and rules that clearly define data access hierarchies, as well as implement and activate real-time monitoring tools. Robust data governance policies and procedures are part of the solution, but they’re not all of it. Data classification is the next hurdle. Many will recognize it as a critical part of data governance, because it simplifies the organization and processing of your information.

Believe it or not, custom machine learning systems can take on a lot of the work these changes require. That is, various tools can automate classification and data management, rather than leaving humans to do the work manually.
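As a rough illustration of automated classification, the sketch below tags fields that appear to contain personal data. It is a rule-based stand-in using regular expressions, not a learned model, and the patterns, labels and `classify_record` function are hypothetical; a production tool would need far broader coverage.

```python
import re

# Hypothetical patterns for two kinds of personal data.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def classify_record(record):
    """Return {field_name: sorted labels of personal data found in the value}."""
    labels = {}
    for field, value in record.items():
        found = sorted(
            name for name, pattern in PATTERNS.items() if pattern.search(str(value))
        )
        if found:
            labels[field] = found
    return labels

record = {
    "contact": "jane.doe@example.com",
    "note": "call +44 20 7946 0958",
    "sku": "A-1001",
}
tags = classify_record(record)
print(tags)  # {'contact': ['email'], 'note': ['phone']}
```

Fields that come back tagged can then be routed to stricter access rules, while untagged fields such as the product code are left alone.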

Of course, as you implement modern automation and hands-off organization processes, you’ll need to be sure security is top-notch. To that end, you’ll want a robust security solution, along with a host of protective, preventive and reactive measures. Without these, you’ll be looking at serious fines and repercussions when the GDPR comes into play.

Bio: Nathan Sykes is a business and technology freelancer and blogger from Pittsburgh, PA. To read his latest articles, check out his blog, Finding an Outlet.
