AI & Machine Learning Black Boxes: The Need for Transparency and Accountability

When something goes wrong, as it inevitably does, discovering the behavior that caused the event can be a daunting task when that behavior is locked away inside a black box where discoverability is virtually impossible.

By Colin Lewis (Robotenomics) and Dagmar Monett (Berlin School of Economics and Law).

The black box in aviation, otherwise known as a flight data recorder, is an extremely secure device designed to provide researchers or investigators with highly factual information about any anomalies that may have led to incidents or mishaps during a flight.

The black box in Artificial Intelligence (AI) or Machine Learning programs1 has taken on the opposite meaning. The latest approach in Machine Learning, where there have been ‘important empirical successes,’2 is Deep Learning, yet there are significant concerns about transparency.

Developers acknowledge that the inner workings of these ‘self-learning machines’ add a further layer of complexity and opacity to machine behavior. Once a Machine Learning algorithm is trained, it can be difficult to understand3 why it gives a particular response to a set of data inputs. This, as we describe below, can be a disadvantage when these algorithms are used in mission-critical tasks.

Furthermore, the fact that Machine Learning algorithms can act in ways unforeseen by their designers raises issues about the ‘autonomy,’ ‘decision-making,’ and ‘responsibility’ capacities of AI. When something goes wrong, as it inevitably does, discovering the behavior that caused the event can be a daunting task when that behavior is locked away inside a black box where discoverability is virtually impossible.

Black box

As Machine Learning algorithms get smarter, they are also becoming more incomprehensible.

Machine Learning algorithms are essentially systems that learn patterns of behavior from collected data to support prediction and informed decision-making. These Machine Learning systems typically process data in two explicit areas as described by Rayid Ghani, Director of the Data Science for Social Good Fellowship, who indicated that4: “the power of data science is typically harnessed in a spectrum with the following two extremes:

  1. Helping humans in discovering new knowledge that can be used to inform decision making,
  2. Through automated predictive models that are plugged into operational systems and operate autonomously.”

To reflect these two extremes of knowledge gathering and automated decision-making, Machine Learning systems typically cluster into two types:

  • Type A applications -- in which model predictions are used to support consequential decisions that can have a profound effect on people’s lives such as medical diagnosis, loan applications, self-driving cars, and policing and prison sentencing; and
  • Type B applications -- in which model predictions are used in settings of lower consequence and large scale, such as which streaming video to watch, the news that gets shown at the top of a news-feed, and search queries.

However, it has been well documented5 that the design and build of these Machine Learning black boxes can lead to bias, unfairness, and discrimination through programmer and data choices. “The irony is that the more we design Artificial Intelligence technology that successfully mimics humans, the more that AI is learning in a way that we do, with all of our biases and limitations.”6
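One simple way to make this kind of bias visible is to compare how often a model produces a favorable outcome for different groups. The sketch below is illustrative only: the function names, the sample decisions, and the 0.8 threshold (borrowed from the "four-fifths" rule of thumb used in US employment guidelines) are our assumptions and are not taken from the studies cited above.

```python
from collections import defaultdict


def favorable_rates(records):
    """Return the share of favorable decisions for each group.

    `records` is an iterable of (group, decision) pairs, where decision
    is True for a favorable outcome (for example, a loan approval).
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, decision in records:
        totals[group] += 1
        if decision:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact_ratio(records):
    """Ratio of the lowest to the highest favorable rate across groups."""
    rates = favorable_rates(records)
    highest = max(rates.values())
    if highest == 0:
        # No group receives a favorable outcome, so the rates do not differ.
        return 1.0
    return min(rates.values()) / highest


# Hypothetical model decisions for two groups, "A" and "B".
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)

ratio = disparate_impact_ratio(decisions)
# Assumed threshold: ratios below 0.8 are flagged for human review.
flagged = ratio < 0.8
print(f"disparate impact ratio: {ratio:.3f}, flagged: {flagged}")
```

A low ratio does not prove discrimination on its own, but it gives an overseer a concrete number to question before a model is trusted with Type A decisions.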

Other issues exist, such as the quality of the data and its processing, or the quality of the algorithms' outcomes and their ethical implications, to name a few. However, executives should be aware of two core areas where problems frequently occur in Machine Learning systems, and should take action to remedy them: 1) Transparency, and 2) Leadership and Governance.

1. Transparency

For people to use the predictions of an analytics model in their decision-making, they must trust the model. To trust a model, they must understand how it makes its predictions, i.e., the model should be interpretable. Most Machine Learning systems currently in operation that are based on deep neural networks are not easily interpretable.
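To illustrate what "interpretable" can mean in practice, consider a plain linear scoring model, where every prediction can be broken down into one contribution per input. This is a minimal sketch under our own assumptions: the feature names, weights, and bias are invented for the example, and the technique applies directly only to linear models, not to the deep neural networks discussed here.

```python
def explain_linear_score(weights, bias, features):
    """Return a linear model's score and the contribution of each feature.

    The score is bias + sum(weight * value), so each feature's
    contribution is simply its weight multiplied by its value.
    """
    contributions = {
        name: weights[name] * value for name, value in features.items()
    }
    score = bias + sum(contributions.values())
    return score, contributions


# Hypothetical credit-scoring weights and one hypothetical applicant.
weights = {"income": 0.5, "debt": -1.5}
bias = 0.25
applicant = {"income": 4.0, "debt": 1.0}

score, contributions = explain_linear_score(weights, bias, applicant)

# Rank features by how strongly they pushed the score up or down.
ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
for name, value in ranked:
    print(f"{name}: {value:+.2f}")
print(f"score: {score:.2f}")
```

A decision-maker can read this breakdown directly. A deep network offers no comparably simple decomposition, which is the interpretability gap described above.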

This opacity can be very damaging for an organization that relies on the AI system. Researchers Taylor et al.7 (2016) have shown that “there are many possible hard-to-detect ways a system’s behavior could differ from the intended behavior of the designer, and at least some of these differences are undesirable.”

Monitoring the behavior of a Machine Learning system may prove difficult without careful design. Executives should strive to ensure that their Machine Learning systems are more transparent, in order to aid an informed overseer by allowing them to evaluate a system’s internal reasons for decisions. As Professor Pedro Domingos writes8: “when a new technology is as pervasive and game-changing as machine learning, it’s not wise to let it remain a black box. Opacity opens the door to error and misuse.”

2. Leadership and Governance

Leaders should seek to enforce strict governance over Machine Learning algorithms, ensuring “value alignment” and “good behavior” in these new machine intelligence systems, especially as they are frequently used in a general capacity to make decisions toward a business objective.

Governance of the systems should incorporate systematic ways to formalize hidden assumptions (inside a black box) and ensure accountability, auditability, and transparency of the internal workings of Machine Learning systems. Furthermore, stricter checks on the selection and robustness of open source Machine Learning algorithms and training data should be uppermost in the minds of developers and management.
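Auditability can start with something as modest as recording every prediction together with the inputs and the model version that produced it. The wrapper below is a sketch under stated assumptions: the class name `AuditedModel`, the record fields, and the toy threshold model are all hypothetical, and a production system would write to durable, tamper-evident storage instead of an in-memory list.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditedModel:
    """Wrap a prediction function and keep a record of every call."""

    def __init__(self, predict_fn, model_version):
        self._predict_fn = predict_fn
        self.model_version = model_version
        self.log = []  # in-memory only; an assumption made for brevity

    def predict(self, inputs):
        output = self._predict_fn(inputs)
        # Serialize with sorted keys so identical inputs give identical hashes.
        payload = json.dumps(inputs, sort_keys=True)
        self.log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "model_version": self.model_version,
            "inputs": inputs,
            "inputs_sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
            "output": output,
        })
        return output


# Hypothetical stand-in for a trained model: approve if the score is high enough.
def toy_model(inputs):
    return inputs["score"] >= 0.5


model = AuditedModel(toy_model, model_version="demo-0.1")
model.predict({"score": 0.7})
model.predict({"score": 0.2})

for record in model.log:
    print(record["model_version"], record["inputs"], "->", record["output"])
```

With such a log in place, an investigator can at least reconstruct which model version saw which inputs and what it decided, even when the model's internal reasoning remains opaque.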

Any decision-making Machine Learning system that optimizes itself for an objective misaligned with an organization’s interests could have significant and permanent effects. Recognizing the limitations9 of Machine Learning and AI algorithms is the first step to managing them better.

Ultimately, we need to be sure we are not putting machines in charge of decisions that they do not have the intelligence to make.


  1. The authors use the terms Machine Learning systems, programs, and algorithms interchangeably throughout this article.
  2. Bengio, Yoshua (2013). Deep Learning of Representations: Looking Forward. In A. Dediu, C. Martín-Vide, R. Mitkov, and B. Truthe (eds.), Statistical Language and Speech Processing: First International Conference (pp. 1-137), Berlin Heidelberg: Springer.
  3. Mittelstadt, Brent Daniel; Allo, Patrick; Taddeo, Mariarosaria; Wachter, Sandra; and Floridi, Luciano (2016, in press). The Ethics of Algorithms: Mapping the Debate. Big Data & Society.
  4. Shan, Carl (2015). How data science can be used for social good. Retrieved from
  5. Peña Gangadharan, Seeta; Eubanks, Virginia; and Barocas, Solon (2014). Data and Discrimination: Collected Essays. New America: Open Technology Institute. Retrieved from
  6. Programming and Prejudice: Utah computer scientists discover how to find bias in algorithms (2015, August). UNews, University of Utah. Retrieved from
  7. Taylor, Jessica; Yudkowsky, Eliezer; LaVictoire, Patrick; and Critch, Andrew (2016). Alignment for advanced machine learning systems. Retrieved from
  8. Domingos, Pedro (2016, May). Why you need to understand machine learning. World Economic Forum. Retrieved from
  9. Luca, Michael; Kleinberg, Jon; and Mullainathan, Sendhil (2016, January-February). Algorithms Need Managers, Too. Harvard Business Review. Retrieved from

Colin Lewis is a Behavioral Economist and Data Scientist who provides research and advisory services in automation, robotics, and artificial intelligence. His work on robotics and automation has been featured by The Financial Times, Bloomberg, Harvard Business Review, and others.

Dagmar Monett is Professor of Computer Science at the Berlin School of Economics and Law, Germany. She received a Dr. rer. nat. in Computer Science from the Humboldt University of Berlin in 2005. Her main research and teaching interests include different areas in Artificial Intelligence and Software Engineering.