The Foundations of Algorithmic Bias

We might hope that algorithmic decision making would be free of biases. But increasingly, the public is starting to realize that machine learning systems can exhibit these same biases and more. In this post, we look at precisely how that happens.


Thus far we’ve considered only the ways that bias can infiltrate algorithms via datasets. But this isn’t the only way that ethically dubious behavior enters algorithmic decision-making. Another source of trouble can be the choice of objective: What do we choose to predict? And how do we act upon that information?

Consider, for example, the recommender systems that services like Facebook rely upon to suggest news items from around the world on social media. Abstractly, we might state that the goal of such a recommender system is to surface articles that keep users informed of important events. We might hope that surfaced articles would be truthful. And in an election year, we might hope that different candidates would have equal opportunities to get messages across.

In short, these are the responsibilities we normatively expect humans to take on when they make curatorial decisions, as at major television stations and newspapers. But this isn’t how the algorithms behind real-life recommender systems on the internet work today. Typically, they don’t know or care about truth and they don’t know about neutrality.

That’s not necessarily because internet giants dislike these virtues – it’s often simply because it’s hard. Where can we find examples of millions of articles scored according to journalistic quality or truth content as assessed by impartial fact-checks? Moreover, ensuring neutrality requires that we not only rank individual articles (and deliver the top ones) but that we rank sets of recommended articles according to their diversity, a considerably harder optimization problem.
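To make the difference between ranking individual articles and ranking sets concrete, here is a minimal sketch. The articles, scores, candidate labels, and the `diversity_weight` bonus are all hypothetical, invented for illustration; they are not how any real recommender scores content.

```python
from itertools import combinations

# Hypothetical articles: (id, predicted relevance score, candidate covered).
ARTICLES = [
    ("a1", 0.90, "X"),
    ("a2", 0.85, "X"),
    ("a3", 0.80, "X"),
    ("a4", 0.60, "Y"),
    ("a5", 0.55, "Z"),
]

def top_k_individual(articles, k):
    """Rank articles one at a time by score and keep the k best."""
    return sorted(articles, key=lambda a: a[1], reverse=True)[:k]

def set_value(subset, diversity_weight):
    """Score a whole set: total relevance plus a bonus per distinct candidate."""
    relevance = sum(a[1] for a in subset)
    distinct_candidates = len({a[2] for a in subset})
    return relevance + diversity_weight * distinct_candidates

def best_set(articles, k, diversity_weight):
    """Exhaustively search every size-k subset (feasible only for tiny inputs)."""
    return max(combinations(articles, k),
               key=lambda s: set_value(s, diversity_weight))

individual = top_k_individual(ARTICLES, 3)
diverse = best_set(ARTICLES, 3, diversity_weight=0.5)
print([a[0] for a in individual])  # all three cover candidate X
print([a[0] for a in diverse])     # one article per candidate
```

Sorting individual scores is cheap, but the set-level objective here is solved by brute force over every subset, and the number of subsets grows combinatorially with the size of the catalog. That is one way to see why the set-level problem is considerably harder.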

Moreover, solving hard problems can be extremely expensive. Google, Amazon, and Facebook have invested billions of dollars in providing machine learning services at scale. And these services typically optimize very simple goals. Solving a yet harder problem with potentially little prospect for additional remuneration cuts against the financial incentives of a large company.

So what do machine learning practitioners typically optimize instead? Clicks. The operating assumption is that people are generally more likely to click on better articles and less likely to click on worse articles. Further, it’s easy for sites like Facebook, Google, and Amazon to log every link that you click on. This passively collected click data can then be used as supervision for the machine learning algorithms trained to optimize search results. In the end people see more articles that they are likely to click on. The hope would be that this corresponds closely to what we really care about – that the articles are interesting, or of high quality. But it’s not hard to imagine how these goals might diverge. For example, sensational headlines might be more likely to get clicks even if they’re less likely to point to true stories.
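A toy sketch of that divergence follows. The headlines, click probabilities, and truth labels are made up for illustration, and real systems learn click probabilities from logs rather than reading them off a table.

```python
# Hypothetical articles: (headline, estimated click probability, is the story true?).
ARTICLES = [
    ("Shocking secret revealed!", 0.30, False),
    ("You won't believe this", 0.25, False),
    ("Budget committee publishes report", 0.05, True),
    ("Election results certified", 0.12, True),
    ("Celebrity cures everything", 0.28, False),
    ("Local school wins science prize", 0.10, True),
]

def rank_by_clicks(articles, k):
    """The surrogate objective: show the k articles most likely to be clicked."""
    return sorted(articles, key=lambda a: a[1], reverse=True)[:k]

def expected_clicks(feed):
    """Sum of click probabilities, i.e. the quantity the surrogate maximizes."""
    return sum(a[1] for a in feed)

def fraction_true(feed):
    """Share of true stories, a stand-in for what we actually care about."""
    return sum(a[2] for a in feed) / len(feed)

click_feed = rank_by_clicks(ARTICLES, 3)
true_feed = [a for a in ARTICLES if a[2]][:3]

print(expected_clicks(click_feed), fraction_true(click_feed))
print(expected_clicks(true_feed), fraction_true(true_feed))
```

In this invented data the click-optimized feed wins on the metric it was built for while containing no true stories at all. Nothing in the surrogate objective penalizes that outcome.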

This is common in machine learning. Sometimes the real problem is difficult to define, or we don’t have any solid data. So instead we optimize a surrogate problem, hoping that the solutions are similar enough. And indeed many services, like Google search, despite their shortcomings, turn up far more relevant results than purely random or chronological selection from the web at large.

But the success of these systems, and our consequent dependence on them, also makes their shortcomings more problematic. After all, no one would be worried about Facebook’s curation of the news if no one received their news from the site.

To see how things could go wrong, we might take a look at the current presidential election. On conventional media like radio and TV, broadcast licensees are required to give equal time to opposing presidential candidates if they request it. That is, even if one candidate might seem more entertaining, or procure higher ratings, we believe that it biases elections for one candidate to receive significantly more coverage than another.

While the adherence of conventional media outlets to this principle might be debatable, it seems clear that denizens of social media were treated to a disproportionate deluge of articles about Donald Trump. While these articles may have truly been more likely to elicit clicks, the overall curated content lacked the diversity we would expect from conventional election coverage.

Of course, Facebook’s news curation is a thorny issue. On one hand, Facebook has a role in curating the news, even if it doesn’t fully embrace its role as a news organization. On the other hand, Facebook also functions as a public square, a place where people go to speak out loud and be heard. In that context, we wouldn’t expect any enforcement of equal time, nor would we expect all messages to be given equal chance to be heard by all in earshot. But, as we all know, Facebook doesn’t simply pass on all information equally, so it isn’t quite a public square either.

It can be hard to anticipate the effects of optimizing these surrogate tasks. Rich Caruana, a researcher at Microsoft Research Redmond, presented a compelling case in which a machine learning model was trained to predict risk of death in pneumonia patients. The model ended up learning that patients who also had asthma as a comorbid condition had a better probability of survival.

You might wonder why the model reached such a counterintuitive conclusion. The model didn’t make an error. Asthma was indeed predictive of survival; this was a true association in the training data.

However, the relationship is not causal. The asthma patients were more likely to survive because they had been treated more aggressively. Thus there’s often an obvious mismatch between the problem we want to solve and the one on which we actually train our algorithms.

We train the model to predict risk of death, assuming nothing changes. But then we operate on the hypothesis that these predictions reflect causal relationships. Then, when we act based on this hypothesis to intervene in the world, we invalidate the basic assumptions of the predictive model.
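The pneumonia example can be reproduced with a small table of counts. The numbers below are hypothetical, chosen only to exhibit the pattern; they are not from Caruana's data. The sketch also assumes, for illustration, that asthma raises risk within each level of care.

```python
# Hypothetical counts, invented for illustration:
# (has_asthma, aggressive_care) -> (patients, deaths)
COHORT = {
    (True, True): (90, 5),
    (True, False): (10, 3),
    (False, True): (100, 3),
    (False, False): (900, 90),
}

def death_rate(has_asthma, aggressive_care=None):
    """Observed death rate for a group.

    With aggressive_care=None, both levels of care are pooled together,
    which is what a model sees if treatment is not among its features.
    """
    patients = deaths = 0
    for (asthma, care), (n, d) in COHORT.items():
        if asthma != has_asthma:
            continue
        if aggressive_care is None or care == aggressive_care:
            patients += n
            deaths += d
    return deaths / patients

# Pooled view: asthma looks protective.
print(death_rate(True), death_rate(False))
# Within each level of care: asthma patients fare worse.
print(death_rate(True, True), death_rate(False, True))
print(death_rate(True, False), death_rate(False, False))
```

In the pooled data, asthma patients die less often because most of them received aggressive care. A policy that read the pooled association as causal and sent asthma patients to standard care would move them into the group with the highest death rate in the table.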

As I articulated in a recent paper [7], it’s in precisely these situations, where real and optimized objectives disagree, that we suddenly become very interested in interpreting models, that is, figuring out how precisely they make decisions. Say, for example, that we want to classify tumors as malignant or benign, and that we have perfectly curated training data. If our algorithm achieves 100% accuracy, then it may not be essential to understand how precisely it makes its decisions. Absent transparency, this algorithm would still save lives. But when our real-world goals and the optimized machine learning objectives diverge, things change.

Take Facebook’s newsfeed as an example. Their real-world goal may be to present a personalized and useful stream of curated content. But likely, the machine learning goal is simply to maximize clicks and/or other superficial measures of engagement. It’s not hard to see how these goals might diverge. A story can grab lots of clicks by offering a sensational headline but point to fake news. In that case, the story might be clicky but not useful. Moreover, this sort of divergence may be inevitable. There are many situations where, for various reasons, it might be impossible to optimize real objectives directly. They may be too complex, or there might be no available annotations. In these situations it seems important to have some way of questioning models, either by introspecting them or analyzing their behavior.

In a recent conversation, Rich Caruana suggested a silver lining. These problems may be worse now precisely because machine learning has become so powerful. Take search engines, for example. When search engines were predicting total garbage, the salient question wasn’t whether we should be following click signals or a more meaningful objective. We simply wondered whether we could make systems that behave comprehensibly at all.

But now that the technology is maturing, the gap between real and surrogate objectives is more pronounced. Consider a spacecraft coming from another galaxy and aiming for Earth but pointed (incorrectly) at the Sun. The flaw in its trajectory might only become apparent as the spacecraft entered the solar system. But eventually, as the craft drew closer to the Sun, the difference in trajectory would become more pronounced. At some point it might even point in the exact opposite direction.


So far we’ve punted on a precise definition of bias. We’ve relied instead on some exemplar cases that seem to fall under a mainstream consensus of egregiously biased behavior.  And in some sense, we use machine learning precisely because we want to make individualized decisions. In the case of loan approval, for example, that necessarily means that the algorithm advantages some users and disadvantages others.

So what does it mean for an algorithm to be fair? One sense of fairness might be that the algorithm doesn’t take into account certain protected information, such as race or gender. Another sense of fairness might be that the algorithm is similarly accurate for different groups. Another notion of fairness might be that the algorithm is calibrated for all groups. In other words, it doesn’t overestimate or underestimate the risk for any group. Interestingly, any approach that hopes to guarantee this property might have to look at the protected information. So there are clearly some cases in which ensuring one notion of fairness might come at the expense of another.
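Two of these notions, similar accuracy across groups and calibration across groups, are easy to audit once predictions are logged. The sketch below uses invented records and a hypothetical `per_group_metrics` helper. Its calibration check is deliberately coarse: it compares the mean predicted risk with the observed outcome rate per group rather than checking every score level.

```python
from collections import defaultdict

# Hypothetical records: (group, predicted risk in [0, 1], actual outcome 0/1).
RECORDS = [
    ("A", 0.9, 1), ("A", 0.8, 1), ("A", 0.3, 0), ("A", 0.2, 0),
    ("B", 0.9, 0), ("B", 0.7, 1), ("B", 0.6, 0), ("B", 0.2, 0),
]

def per_group_metrics(records, threshold=0.5):
    """Return accuracy and a coarse calibration gap for each group."""
    grouped = defaultdict(list)
    for group, risk, outcome in records:
        grouped[group].append((risk, outcome))

    metrics = {}
    for group, rows in grouped.items():
        n = len(rows)
        # A prediction counts as correct if thresholding the risk matches the outcome.
        correct = sum((risk >= threshold) == bool(outcome) for risk, outcome in rows)
        mean_risk = sum(risk for risk, _ in rows) / n
        observed_rate = sum(outcome for _, outcome in rows) / n
        metrics[group] = {
            "accuracy": correct / n,
            # Positive gap: risk is overestimated for this group on average.
            "calibration_gap": mean_risk - observed_rate,
        }
    return metrics

metrics = per_group_metrics(RECORDS)
for group, values in sorted(metrics.items()):
    print(group, values)
```

Note that the audit itself needs the group label for every record. This is the tension described above: checking or enforcing calibration for a group requires looking at the protected attribute that the first notion of fairness asks us to ignore.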

In a recent paper, Professor Jon Kleinberg and colleagues gave an impossibility theorem for fairness in determining risk scores. They show that three intuitive notions of fairness are not reconcilable except in unrealistically constrained cases [8]. So it might not be enough simply to demand that algorithms be fair. We may need to think critically about each problem and determine which notion of fairness is most relevant.
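A small numeric instance shows the flavor of the result. Roughly, the three notions in [8] are calibration within groups, equal average scores among actual positives in each group, and equal average scores among actual negatives in each group. The two populations below are hypothetical, and the sketch illustrates the conflict on one example rather than proving the theorem.

```python
from fractions import Fraction

LOW, HIGH = Fraction(1, 5), Fraction(4, 5)

# Hypothetical populations: score -> (people assigned that score, actual positives).
GROUPS = {
    "A": {LOW: (50, 10), HIGH: (50, 40)},
    "B": {LOW: (80, 16), HIGH: (20, 16)},
}

def is_calibrated(bins):
    """Calibrated: among people given score s, exactly a fraction s are positive."""
    return all(Fraction(pos, n) == score for score, (n, pos) in bins.items())

def base_rate(bins):
    """Overall fraction of positives in the group."""
    total = sum(n for n, _ in bins.values())
    positives = sum(pos for _, pos in bins.values())
    return Fraction(positives, total)

def mean_score(bins, positive):
    """Average score among actual positives (positive=True) or actual negatives."""
    weights = {s: (pos if positive else n - pos) for s, (n, pos) in bins.items()}
    return sum(s * w for s, w in weights.items()) / sum(weights.values())

for name, bins in GROUPS.items():
    print(name, is_calibrated(bins), base_rate(bins),
          mean_score(bins, positive=True), mean_score(bins, positive=False))
```

Both groups are perfectly calibrated, yet because their base rates differ (1/2 versus 8/25), actual positives in group A receive a higher average score than actual positives in group B, and the same holds for actual negatives. Satisfying the first notion here rules out the other two.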


Many of the problems with bias in algorithms are similar to problems with bias in humans. Some articles suggest that we can detect our own biases and therefore correct for them, while for machine learning we cannot.  But this seems far-fetched. We have little idea how the brain works. And ample studies show that humans are flagrantly biased in college admissions, employment decisions, dating behavior, and more. Moreover, we typically detect biases in human behavior post-hoc by evaluating human behavior, not through an a priori examination of the processes by which we think.

The most salient difference between human and algorithmic bias may be that with human decisions, we expect bias. Take, for example, the well-documented racial biases among employers, who are less likely to call back workers with more typically black names than those with white names but identical resumes. We detect these biases because we suspect that they exist and have decided that they are undesirable, and therefore vigilantly test for their existence.

As algorithmic decision-making slowly moves from simple rule-based systems towards more complex, human-level decision making, it’s only reasonable to expect that these decisions are susceptible to bias. Perhaps, by treating this bias as a property of the decision itself and not focusing overly on the algorithm that made it, we can bring to bear the same tools and institutions that have helped to strengthen ethics and equality in the workplace, college admissions etc. over the past century.


Thanks to Tobin Chodos, Dave Schneider, Victoria Krakovna, Chet Lipton, and Zeynep Tufekci for constructive feedback in preparing this draft.


  1. Byrnes, Nanette, Why we Should Expect Algorithms to be Biased, 2016
  2. Naughton, John, Even Algorithms are Biased Against Black Men, 2016
  3. Tufekci, Zeynep, The Real Bias Built in at Facebook, New York Times, 2016
  4. Angwin, Julia et al., Machine Bias, 2016
  5. Bolukbasi, Tolga et al., Quantifying and Reducing Stereotypes in Word Embeddings, ICML Workshop on #Data4Good, 2016
  6. Deng, Jia et al., Imagenet: A large-scale hierarchical image database, CVPR, 2009
  7. Lipton, Zachary C., The Mythos of Model Interpretability, ICML Workshop on Human Interpretability of Machine Learning, 2016
  8. Kleinberg, Jon et al., Inherent Trade-Offs in the Fair Determination of Risk Scores

Original post. Reposted with permission.

Bio: Zachary C. Lipton is a PhD student in the Computer Science and Engineering department at the University of California, San Diego. He is interested in both theoretical foundations and applications of machine learning. In addition to his work at UCSD, he has worked with Microsoft Research Redmond, Microsoft Research Bangalore, and Amazon Core Machine Learning.