Weak and Strong Bias in Machine Learning

With the arrival of the GDPR there has been increased focus on non-discrimination in machine learning. This post explores different forms of model bias and suggests some practical steps to improve fairness in machine learning.



By Samuel Cheadle, Oxford.

Classifiers often assign different probabilities of success (e.g. the probability of being offered a loan) to different demographic groups, but does this always constitute bias or discrimination? Is there ever a situation in which this is justified? This post aims to tease apart different forms of bias and argues that a more nuanced approach to identifying unethical practice is needed.

The preliminary (non-legally binding) section of the recent GDPR states that data controllers are required to “implement appropriate technical and organizational measures” to prevent “discriminatory effects” on the basis of sensitive personal data, including racial or ethnic origin, political opinion, religious or philosophical belief, data concerning health or data concerning a natural person’s sex life or sexual orientation (Recital 71).

Discrimination can be defined as the unfair treatment of an individual, but demonstrating that an automated decision system is truly non-discriminatory is often difficult or impossible.


What are the first steps to prevent bias in a machine learning system? Of course we can simply prevent any sensitive characteristics from ever being explicitly available to our model at training time, but unfortunately this will never ensure unbiased and fair treatment of different demographic groups. So-called ‘proxy variables’ that correlate with sensitive characteristics are ubiquitous in training datasets and often result in what may be construed as discrimination. Take, for instance, a loan decision system based on home postcode. This geodemographic information can correlate surprisingly strongly with sensitive characteristics such as ethnicity.
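As a first diagnostic, it can help to measure how strongly a candidate feature is associated with a sensitive attribute before deciding how to handle it. The sketch below is a minimal illustration on synthetic data; the column names (`postcode`, `ethnicity`) and the chi-squared / Cramér's V check are assumptions chosen for illustration, not a prescribed method.

```python
# Minimal sketch: checking whether a candidate feature acts as a proxy for a
# sensitive attribute. The column names and data are hypothetical; in practice
# you would load your own training data.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
n = 5000
# Synthetic data in which postcode is correlated with ethnicity.
ethnicity = rng.choice(["group_a", "group_b"], size=n, p=[0.7, 0.3])
postcode = np.where(
    ethnicity == "group_a",
    rng.choice(["OX1", "OX2", "OX3"], size=n, p=[0.6, 0.3, 0.1]),
    rng.choice(["OX1", "OX2", "OX3"], size=n, p=[0.1, 0.3, 0.6]),
)
df = pd.DataFrame({"postcode": postcode, "ethnicity": ethnicity})

# Chi-squared test of association, plus Cramér's V as an effect size.
table = pd.crosstab(df["postcode"], df["ethnicity"])
chi2, p_value, dof, _ = chi2_contingency(table)
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(f"p-value: {p_value:.3g}, Cramér's V: {cramers_v:.2f}")
```

A strong association does not by itself prove discrimination, but it tells you which variables deserve the closer scrutiny discussed below.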

Does this mean that we should ‘throw out’ any variable that displays a significant correlation with sensitive characteristics (the maximal interpretation of the GDPR outlined here), or adjust for differential outcomes using a form of demographic parity, in order to comply with the GDPR? This proposal is likely to meet with “howls of protest” from the machine learning community, as it would significantly impair the performance (utility) of the model.
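For reference, demographic parity is usually formalised as equal selection rates across groups. The snippet below is a minimal sketch of how that gap might be measured; the `decisions` and `group` arrays are hypothetical stand-ins for real model outputs and an evaluation set.

```python
# Minimal sketch: measuring demographic parity for a binary decision
# (1 = loan offered). All values here are made up for illustration.
import numpy as np
import pandas as pd

decisions = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
group = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])

rates = pd.Series(decisions).groupby(group).mean()
print(rates)
print("demographic parity gap:", rates.max() - rates.min())
```

Forcing this gap to zero regardless of the underlying data is precisely the kind of blunt constraint that tends to erode model utility.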

Unfortunately understanding the underlying reasons for correlations between proxy variables and sensitive characteristics can be tricky (to put it mildly). Social scientists have been investigating the driving factors that connect things like social class and educational attainment for many decades, often with little emerging consensus.

I believe the prevention of bias in modern machine learning is a fundamental issue that affects us all, but an extreme interpretation of the GDPR guidelines would be self-defeating. The public is right to be concerned about issues of fairness and transparency in modern machine learning, but demanding non-discrimination at all costs is unlikely to be effective. This is because such a restriction can dramatically reduce the utility of these decision systems; this high cost will drive commercially focused AI and machine learning companies to side-step these much-needed checks on fairness.


In the short term we need non-discrimination requirements that are workable and practical, rather than cripplingly severe. Surveying the recent literature on discrimination in machine learning reveals multiple types of bias that are important to distinguish before taking practical steps towards fairness. Based on the distinctions below, it may be possible to eliminate “strong” bias before “weak” bias.

1) Legitimate proxy variables - Weak bias - Proxy variables can result in different groups (ethnicity, religion, etc.) being associated with different overall probabilities of “success”. Interpreting this result as discrimination implies that the correlation between proxy variables and sensitive characteristics is unwarranted or unjustified. This may or may not be true and is difficult to test. However, one thing that we are able to examine is the nature of the proxy variable: was a relationship between the proxy variable and the outcome of interest hypothesised a priori, before training the model (e.g. predicting the probability of loan repayment based on the number of past loan defaults)? If so, then we should be cautious about inferring the existence of bias, and may be justified in turning our attention to other “stronger” forms of bias.

2) Illegitimate proxy variables - Strong bias - If no relationship between the proxy variable and the outcome of interest was hypothesised a priori (e.g. hair colour predicting the probability of loan repayment) then we are right to be sceptical of the model’s decision. This type of non-hypothesised link is much more likely to reflect truly discriminatory decision making.

3) Complex combinations of legitimate proxy variables - Strong bias - Classifiers may assign probabilities of “success” based on complex combinations of many input variables (especially in deep learning). These same complex combinations of variables may also reveal a person’s ethnicity, political opinion, gender etc. Wherever possible we should examine whether these proxy variables have been combined and weighted in intuitive ways (consistent with a priori expectations). If not, this is further evidence of true discrimination (the first sketch after this list shows one way to probe this).

4) Sampling limitations generating an “uncertainty bias” - Strong bias - In some cases we simply have less data available, or less reliable data, for particular groups. If the model is also trained to be risk averse, this can lead to unfavourable treatment of minority groups; the model is never willing to “take a chance” on minority members because the associated risk is too high (the second sketch after this list gives a toy illustration). One simple solution may be to incentivise the collection of more, or better, data for minority groups.
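One pragmatic way to probe point 3 is to inspect which inputs a trained model actually relies on, and compare that against a priori expectations. The sketch below uses permutation importance from scikit-learn on synthetic data; the feature names and the gradient-boosting model are assumptions chosen purely for illustration.

```python
# Minimal sketch for point 3: checking whether a trained model leans on
# features in ways consistent with a priori expectations.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
n = 2000
X = np.column_stack([
    rng.normal(size=n),          # past_defaults (expected to matter)
    rng.normal(size=n),          # income        (expected to matter)
    rng.integers(0, 5, size=n),  # postcode_band (potential proxy)
])
# In this toy setup the outcome is driven mainly by the first two columns.
y = (0.8 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n) > 0).astype(int)

model = GradientBoostingClassifier().fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, imp in zip(["past_defaults", "income", "postcode_band"],
                     result.importances_mean):
    print(f"{name}: {imp:.3f}")
# A large importance for a variable with no hypothesised link to the outcome
# would be a flag worth investigating.
```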
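And a toy numerical illustration of point 4: two groups with identical observed repayment rates but very different sample sizes, evaluated by a risk-averse rule that approves only when a lower confidence bound clears a threshold. The sample sizes, the 0.77 threshold and the normal-approximation bound are all made up for illustration.

```python
# Minimal sketch for point 4: how a risk-averse rule plus unequal sample
# sizes can disadvantage a minority group even when observed "success"
# rates are identical.
import math

def lower_bound(successes, n, z=1.96):
    """Rough normal-approximation lower confidence bound on a success rate."""
    p = successes / n
    return p - z * math.sqrt(p * (1 - p) / n)

# Identical observed repayment rates (0.80), very different sample sizes.
groups = {"majority": (8000, 10_000), "minority": (80, 100)}

threshold = 0.77  # a risk-averse lender approves only above this bound
for name, (successes, n) in groups.items():
    lb = lower_bound(successes, n)
    decision = "approve" if lb > threshold else "reject"
    print(f"{name}: rate {successes / n:.2f}, lower bound {lb:.3f} -> {decision}")
```

The smaller group is rejected purely because its estimate is less certain, which is exactly the effect that collecting more or better data for minority groups would mitigate.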

A consistent theme running through all types of discrimination analysis is the need for access to data on sensitive characteristics at some stage of the model evaluation process. Given the GDPR restrictions on the collection and storage of sensitive personal characteristics such as race and religion, there needs to be some exploration of how best to achieve this, perhaps involving trusted third-party repositories for the storage of sensitive data.

I have tried to outline distinctions between some basic classes of model bias. There may be other ways of making these broad distinctions, but I believe that practical steps towards increasing fairness can be achieved by prioritising the prevention of certain forms of bias, referred to above as “strong bias”; it is in these cases that we have the strongest evidence for discrimination, so efforts to improve fairness in machine learning should start here.

Bio: Samuel Cheadle – Statistician / Data Scientist with a keen interest in improving transparency and explainability in modern machine learning. I currently work at Oxford University evaluating the fairness of decision making systems used to select undergraduate applicants.
