Machine Ethics and Artificial Moral Agents

This article is simply a stream of consciousness on questions and problems I have been thinking about and asking myself, and hopefully it will stimulate some discussion.

Image Credit: andruxevich/Shutterstock

There has been a lot of talk over the past months about AI being our best or worst invention ever. The chance of robots taking over, and the catastrophic sci-fi scenario that would follow, makes the ethical and purposeful design of machines and algorithms not simply important but necessary.

But the problems do not end here. Incorporating ethical principles into our technology development process should not just be a way to prevent the extinction of the human race but also a way to understand how to responsibly use the power that comes from that technology.

This article is not meant to be a guide to ethics for AI, nor does it set guidelines for building ethical technologies. It is simply a stream of consciousness on questions and problems I have been thinking about and asking myself, and hopefully it will stimulate some discussion.

Now, let’s go down the rabbit-hole...

Image Credit: phloxii/Shutterstock


I. Data and biases

The first problem everyone raises when speaking about ethics in AI is, of course, about data. Most of the data we produce (if we exclude the ones coming from observation of natural phenomena) are artificial creations of our minds and actions (e.g., stock prices, smartphone activity, etc.). As such, data inherit the same biases we have as humans.

First of all, what is a cognitive bias? The (maybe controversial) way I look at it is that a cognitive bias is a shortcut of our brain that translates into behaviors which require less energy and thought to implement. So, a bias is a good thing to me, at least in principle. The reason why it becomes a bad thing is that the external environment and our internal capacity to think do not proceed pari passu. Our brain gets trapped in heuristics and shortcuts which could have resulted in competitive advantages 100 years ago, but it is not plastic enough to adapt quickly to changes in the external environment (I am not talking about a single brain but rather about the species level).

In other words, the systematic deviation from a standard of rationality or good judgment (this is how bias is defined in psychology) is nothing more for me than a simple evolutionary lag of our brain.

Why this whole excursus? Well, because I think that most of the biases embedded in data come from our own cognitive biases (at least for data resulting from human rather than natural activities). There is, of course, another block of biases that stems from purely statistical reasons (the expected value of an estimate differs from the true underlying parameter). Kris Hammond of Narrative Science merged those two views and identified at least five different biases in AI. In his words:

  • Data-driven bias (bias that depends on the input data used);
  • Bias through interaction;
  • Similarity bias (it is simply the product of systems doing what they were designed to do);
  • Conflicting goals bias (systems designed for very specific business purposes end up having biases that are real but completely unforeseen);
  • Emergent bias (decisions made by systems aimed at personalization will end up creating bias “bubbles” around us).

But let’s go back to the problem. How would you solve the biased data issue then?

Simple solution: you can try to remove ex-ante any data that could bias your engine. It is a great solution: it will require some effort at the beginning, but it might be feasible.

However, let’s look at the problem from a different angle. I was educated as an economist, so allow me to start my argument with this statement: let’s assume we have the perfect dataset. It is not only omni-comprehensive but also clean, consistent and deep both longitudinally and temporally speaking.

Even in this case, we have no guarantee that AI won’t autonomously learn the same biases we did. In other words, removing biases by hand or by construction does not guarantee that those biases will not emerge again spontaneously.


This possibility also raises another (philosophical) question: we are building this argument on the assumption that biases are (mostly) bad. So let’s say the machines come up with a result we see as biased, and therefore we reset them and start the analysis again with new data. But the machines come up with a similarly ‘biased result’. Would we then be open to accepting that as true and revising what we consider to be biased?

This is basically a cultural and philosophical clash between two different species.

In other words, I believe that two of the reasons why embedding ethics into machine design is extremely hard are that i) we don’t unanimously know what ethics is, and ii) we should be open to admitting that our values or ethics might not be completely right and that what we consider to be biased is not the exception but rather the norm.

Developing a (general) AI is making us think about those problems, and it will change our value system (if it hasn’t already started to). And perhaps, who knows, we will end up learning something from machines’ ethics as well.

Image Credit: Notre Dame of Maryland University Online


II. Accountability and trust

Well, now you might think the previous one is a purely philosophical issue and that you probably shouldn’t care about it. But the other side of the matter is about how much you trust your algorithms. Let me give you a more practical way of looking at this problem.

Let’s assume you are a medical doctor and you use one of the many algorithms out there to help you diagnose a specific disease or to assist you in treating a patient. The computer gets it right 99.99% of the time: it never gets tired, it has analyzed billions of records, it sees patterns that a human eye can’t perceive. We all know this story, right? But what if, in the remaining 0.01% of cases, your instinct tells you the opposite of the machine’s result and you turn out to be right? What if you decide to follow the advice the machine spat out instead of your own and the patient dies? Who is liable in this case?

But even worse: let’s say that in that case you follow your gut feeling (we know it is not a gut feeling though, but simply your ability to recognize at a glance something you know to be the right disease or treatment) and you save the patient. The next time (with another patient), you are again in conflict with the machine’s results but, emboldened by the recent experience (because of a hot-hand fallacy or an overconfidence bias), you think you are right again and decide to disregard what the artificial engine tells you. Then the patient dies. Who is liable now?

The question is quite delicate indeed and the scenarios in my head are:

a) a scenario where the doctor is only human, with no machine assistance. The payoff here is that liability stays with him and things are quite clear: he gets it right 70% of the time, and sometimes he gets right something extremely hard (the lucky guy out of 10,000 patients);
b) a scenario where a machine decides and gets it right 99.99% of the time. The negative side is that one unfortunate patient out of 10,000 is going to die because of a machine error, and the liability is assigned to neither the machine nor the human;
c) a scenario where the doctor is assisted but has the final call on whether to follow the advice. The payoff here is completely randomized and not clear to me at all.
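
As a rough illustration of the arithmetic behind scenarios a) and b), the following Python sketch counts the expected number of wrong calls per 10,000 patients using the accuracy figures quoted above. The function name `expected_errors` is hypothetical, and treating every case as independent and every error as equally costly is a simplifying assumption made only for this example, not a claim about real clinical outcomes.

```python
from fractions import Fraction


def expected_errors(accuracy: Fraction, patients: int) -> Fraction:
    """Expected number of wrong calls among `patients` cases.

    Assumes every case is independent and every error weighs the same,
    which is a simplification made only for this illustration.
    """
    return (1 - accuracy) * patients


PATIENTS = 10_000

# Accuracy figures as stated in the scenarios; exact fractions avoid
# floating-point rounding noise.
doctor_alone = expected_errors(Fraction("0.70"), PATIENTS)
machine_alone = expected_errors(Fraction("0.9999"), PATIENTS)

print(f"a) doctor alone:  {doctor_alone} expected errors")
print(f"b) machine alone: {machine_alone} expected errors")
```

Scenario c) is left out on purpose: its payoff depends on when the doctor chooses to override the machine, which is exactly the part that remains unclear.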

As a former economist, I have been trained to be heartless and to reason in terms of expected values and large numbers (basically as a Utilitarian), so scenario b) looks like the only possible one to me because it saves the greatest number of people. But we all know it is not that simple (and of course it doesn’t feel right for the unlucky guy in our example): think, for instance, about the case of an autonomous vehicle that loses control and needs to decide whether to kill the driver or five random pedestrians (the famous Trolley Problem). Based on those principles I’d save the pedestrians, right? But what if all five are criminals and the driver is a pregnant woman? Does your judgement change in that case? And again, what if the vehicle could instantly use cameras and visual sensors to recognize the pedestrians’ faces, connect to a central database, and match them with health records, finding out that they all have some type of terminal disease? You see, the line is blurring...

The final doubt that remains is then not simply about liability (and the choice between pure outcomes and the ways to achieve them) but rather about trusting the algorithm (and I know that for someone who studied for 12 years to become a doctor it might not be that easy to give that up). In fact, algorithm aversion is becoming a real problem for algorithm-assisted tasks, and it seems that people want to have some degree of control over algorithms, even if an incredibly small one (Dietvorst et al., 2015; 2016).

But above all: are we allowed to deviate from the advice we get from accurate algorithms? And if so, in what circumstances and to what extent?


If an AI were to decide on the matter, it would probably also go for scenario b), but we as humans would like to find a compromise between those scenarios because we ‘ethically’ don’t feel any of them to be right. We can then rephrase this issue through the lens of the ‘alignment problem’, which means that the goals and behaviors of an AI need to be aligned with human values: an AI needs to think like a human in certain cases (but of course the question here is how do you discriminate? And what’s the advantage of having an AI then? Let’s therefore simply stick to the traditional human activities).

In this situation, the work done by the Future of Life Institute with the Asilomar Principles becomes extremely relevant.

The alignment problem, also known as the ‘King Midas problem’, arises from the idea that no matter how we tune our algorithms to achieve a specific objective, we are not able to specify and frame those objectives well enough to prevent the machines from pursuing undesirable ways to reach them. Of course, a theoretically viable solution would be to let the machine maximize our true objective without setting it ex-ante, therefore leaving the algorithm itself free to observe us and understand what we really want (as a species and not as individuals, which might also entail the possibility of it switching itself off if needed).

Sounds too good to be true? Well, maybe it is. Indeed, I totally agree with Nicholas Davis and Thomas Philbeck of the WEF, who wrote in the Global Risks Report 2017:

“There are complications: humans are irrational, inconsistent, weak-willed, computationally limited and heterogeneous, all of which conspire to make learning about human values from human behaviour a difficult (and perhaps not totally desirable) enterprise”.

What the previous section implicitly suggested is that not all AI applications are the same and that error rates apply differently to different industries. Under this assumption, it might be hard to draw a line and design an accountability framework that does not penalize applications with weak impact (e.g., a recommendation engine) and at the same time does not underestimate the impact of other applications (e.g., healthcare or AVs).

We might end up then designing multiple accountability frameworks to justify algorithmic decision-making and mitigate negative biases.

Certainly, the most straightforward way to understand who owns the liability for a certain AI tool is to think about the following threefold classification:

  • We should hold the AI system itself responsible for any misbehavior (does it make any sense?);
  • We should hold the designers of the AI responsible for the malfunctioning and bad outcome (but it might be hard because AI teams can count hundreds of people, and this preventative measure could discourage many from entering the field);
  • We should hold accountable the organization running the system (to me it sounds like the most reasonable of the three options, but I am not sure about its implications. And then which company in the AI value chain should be liable? The final provider? The company that built the system in the first place? The consulting business that recommended it?).

There is no easy answer, and much more work is required to tackle this issue, but I believe a good starting point has been provided by Sorelle Friedler and Nicholas Diakopoulos. They suggest considering accountability through the lens of five core principles:

  • Responsibility: a person should be identified to deal with unexpected outcomes, not in terms of legal responsibility but rather as a single point of contact;
  • Explainability: a decision process should be explainable not technically but rather in an accessible form to anyone;
  • Accuracy: garbage in, garbage out is likely to be the most common reason for the lack of accuracy in a model. The data and error sources then need to be identified, logged, and benchmarked;
  • Auditability: third parties should be able to probe and review the behavior of an algorithm;
  • Fairness: algorithms should be evaluated for discriminatory effects.
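
To give a feel for what evaluating an algorithm for discriminatory effects might look like in practice, here is a minimal Python sketch that compares the rate of favorable decisions across groups. The choice of metric (a ratio of selection rates), the function names `selection_rates` and `disparity_ratio`, and the sample data are all assumptions made for illustration; they are not taken from Friedler and Diakopoulos, and a real audit would likely need several complementary measures.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Share of favorable outcomes per group.

    `decisions` is an iterable of (group, favorable) pairs, where
    `favorable` is True when the algorithm's decision benefits the person.
    """
    totals = defaultdict(int)
    favorable_counts = defaultdict(int)
    for group, favorable in decisions:
        totals[group] += 1
        if favorable:
            favorable_counts[group] += 1
    return {group: favorable_counts[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome
    return min(rates.values()) / highest


# Made-up decisions for two hypothetical groups, for illustration only.
sample = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

rates = selection_rates(sample)
print(rates)
print(f"disparity ratio: {disparity_ratio(rates):.2f}")
```

A ratio well below 1.0 would only be a signal to look closer, not a verdict: whether a gap counts as discriminatory depends on context that a single number cannot capture.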

Image Credit: mcmurryjulie/Pixabay