Let's talk about Ethics in Analytics / Data Science

Is it time for data scientists to go through formal ethics training? The saying “Lies, damned lies, and statistics” suggests that statistics (and data science) can be tweaked to prove any point. Ethics training would help improve the integrity and credibility of the analytics profession.

By Bhasker Gupta, Analytics India Magazine.

It's time that data scientists go through formal ethics training

Lies, damned lies, and statistics

As the saying goes, "Lies, damned lies, and statistics". There is no doubt that statistics can be tweaked any which way to prove a point, and an analyst has a fair amount of subjective latitude in how an analysis is carried out.

There are several reasons why analytics leaves room for exploitation in the hands of its users:

  1. Analytics, like statistics, is not an exact science. Two analysts can work on the same analysis and come up with widely different results.
  2. Standardization has not yet seeped into this area. There can be a dozen ways to solve a single problem, each with different steps to reach a conclusion.
  3. Data scientists have to make decisions at almost every step of the analytics process. No matter how technical or 'by-the-book' a modeling exercise might appear, a fair amount of human judgment and assessment goes into creating an analysis.

Analytics will play an increasingly significant role in today's integrated, global industries, where the individual decisions of analytics professionals can influence decision making at the highest levels to a degree unimagined years ago. There is a substantial risk that a wrong or misjudged model, analysis, or statistic can jeopardize the proper functioning of an organization.

Instruction, rules, and supervision are essential, but they alone cannot prevent lapses. Given all this, it is imperative that ethics be deeply ingrained in today's analytics curriculum. I believe that some of the tenets of a code of ethics and standards in analytics and data science should be:
  • The ethical benchmarks apply regardless of job title, cultural differences, or local laws.
  • Data scientists place the integrity of the analytics profession above their own interests.
  • The profession maintains a governance and standards mechanism that data scientists adhere to.
  • Data scientists maintain and develop their professional competence.
  • Top managers create a strong culture of analytics ethics at their firms, which filters throughout their entire analytics organization.

I don't propose an exact code of ethics for analytics professionals here; this article merely argues that, as an industry, we need one. I regularly come across brilliant training programs curated by industry professionals and veterans, but almost all of them lack any primer on governance, ethics, standards, or even plain "how to document in analytics".

Formally instituted ethics raise the quality of a discipline. They increase trust and transparency, and they help secure sustainable long-term results while safeguarding everyone's interests. I believe it is the right time to incorporate a code of standards in analytics that would act as a guiding star for data scientists facing tough decisions.

Bio: Bhasker Gupta is a Data Science Evangelist, Mentor and Entrepreneur, currently heading Analytics India Magazine and Kruxonomy. He is based in Bengaluru, India.