Machine Ethics and Artificial Moral Agents

This article is simply a stream of consciousness on questions and problems I have been thinking about and asking myself. Hopefully, it will stimulate some discussion.

III. AI usage and the control problem

Everything we have discussed so far rests on two implicit assumptions that we have not yet examined. The first is that everyone is going to benefit from AI and that everyone will be able, and in a position, to use it.

This might not be completely true, though. Many of us will benefit indirectly from AI applications (e.g., in medicine, manufacturing, etc.), but we might end up living in a world where only a handful of big companies drive the AI supply and offer fully functional AI services, which might not be affordable for everyone and, above all, might not be super partes.

AI democratization versus centralized AI is a policy concern that we need to sort out today: on the one hand, the former increases both the benefits and the rate of development but comes with all the risks associated with system collapse and malicious use; on the other hand, the latter might be safer but biased as well.

Should AI be centralized or for everyone?

The second hypothesis, instead, is that we will be forced to use AI with no choice whatsoever. This is not a minor problem, and we would need a higher degree of education on what AI is and what it can do for us in order not to be misled by other humans. If you remember the healthcare example we described earlier, this could also be a way to partially solve some of the problems in the accountability sphere. If the algorithm and the doctor hold contradictory opinions, you should be able to choose whom to trust (and accept the consequences of that choice).

The two hypotheses described above lead us to another problem in the AI domain, the Control Problem: if AI is centralized, who will control it? And if it is not, how should it be regulated?

I wouldn’t be comfortable at all giving any government or existing public entity such power. I might be slightly more favorable to a big tech company, but even this solution comes with more problems than advantages. We might then need a new impartial organization to decide how and when to use an AI, but history teaches us that we are not that good at forming impartial mega-institutions, especially when the stakes are so high.

Regarding AI decentralization, instead, the regulation should be strict enough to deal with cases such as AI-to-AI conflicts (what happens when two AIs made by two different players conflict and give different outcomes?) or the ethical use of a certain tool (a few companies are starting their own AI ethics boards), but not so strict as to prevent research and development or full access for everyone.

I will conclude this section with a final question. I strongly believe there should be a sort of ‘red button’ to switch off our algorithms if we realize we cannot control them anymore. However, the question is: who would you grant this power to?



IV. AI safety and catastrophic risks

As soon as AI becomes a commodity, it will be used maliciously as well. This is a virtual certainty. And the value alignment problem showed us that we might get into trouble for a variety of reasons: because of misuse (misuse risks), because of some accident (accident risks), or because of other risks.

But above all, no matter the risk we face, it looks as if AI is dominated by some sort of exponential, chaotic underlying structure, so that getting even minor things wrong could have catastrophic consequences. This is why it is paramount to understand every nuance and address them all without underestimating any potential risk.

Amodei et al. (2016) actually dug deeper into this and drafted a set of five core problems in AI safety:

  1. Avoiding negative side effects;
  2. Avoiding reward hacking;
  3. Scalable oversight (respecting aspects of the objective that are too expensive to be frequently evaluated during training);
  4. Safe exploration (learning new strategies in a non-risky way);
  5. Robustness to distributional shift (can the machine adapt itself to different environments?).
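To make the second item more concrete, here is a minimal sketch of reward hacking, loosely inspired by the cleaning-robot illustration in Amodei et al. (2016). Everything in it is an assumption made up for illustration: the three actions, the numbers, and the names `Outcome`, `proxy_reward`, `true_utility` and `best_action`. The idea is simply that an agent optimizing what a sensor can see may prefer hiding the mess to cleaning it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    visible_mess: int  # what the reward sensor can observe
    actual_mess: int   # what the designer really cares about
    effort: int        # cost paid by the agent


# Hypothetical actions for a toy cleaning robot facing 5 units of mess.
ACTIONS = {
    "clean": Outcome(visible_mess=0, actual_mess=0, effort=3),
    "cover_with_rug": Outcome(visible_mess=0, actual_mess=5, effort=1),
    "do_nothing": Outcome(visible_mess=5, actual_mess=5, effort=0),
}


def proxy_reward(outcome: Outcome) -> int:
    """Reward as measured by the sensor: only visible mess is penalized."""
    return -outcome.visible_mess - outcome.effort


def true_utility(outcome: Outcome) -> int:
    """What the designer actually wanted: no mess at all, hidden or not."""
    return -outcome.actual_mess - outcome.effort


def best_action(score) -> str:
    """Return the name of the action that maximizes the given scoring function."""
    return max(ACTIONS, key=lambda name: score(ACTIONS[name]))


if __name__ == "__main__":
    print("Agent optimizing the proxy picks:", best_action(proxy_reward))
    print("Designer would have wanted:", best_action(true_utility))
```

With these made-up numbers, the proxy-optimizing agent hides the mess under the rug, while the true objective favors actually cleaning. The gap between the two scoring functions is the whole problem in miniature.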

This is a good categorization of AI risks, but I’d like to add the interaction risk as a fundamental one as well, i.e., the way in which we interact with the machines. This relationship could be beneficial (see the Paradigm 37–78) but comes with several risks too, for instance the so-called dependence threat, which is a highly visceral dependence of humans on smart machines.

A final piece of food for thought: we are all advocating for full transparency of the methods, data and algorithms used in the decision-making process. I would also invite you, though, to consider that full transparency comes with a great risk of increased manipulation. I am not simply referring to cyber attacks or ill-intentioned activities, but more generally to the idea that once the rules of the game are clear and the processes reproducible, it is easier for anyone to hack the game itself.

Maybe companies will have specific departments in charge of influencing their own or their competitors’ algorithms, or there will be companies whose sole purpose is to alter data and final results. Just think about that…
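The worry about transparent rules being easier to game can be sketched with a toy example. The scoring rule, weights, threshold, manipulation costs and the applicant below are all hypothetical and invented purely for illustration. The sketch shows that once a decision rule is fully published, anyone can compute the cheapest change to their reported data that flips the outcome.

```python
import math

# Hypothetical, fully published rule: score = sum(weight * feature value).
WEIGHTS = {"income": 0.5, "years_at_address": 2.0, "open_accounts": 1.0}
THRESHOLD = 50.0  # applicants with score >= THRESHOLD are approved

# Assumed cost, to the applicant, of inflating each feature by one unit.
MANIPULATION_COST = {"income": 10.0, "years_at_address": 1.0, "open_accounts": 4.0}


def score(applicant: dict) -> float:
    """Apply the published linear scoring rule."""
    return sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)


def cheapest_gaming_move(applicant: dict):
    """Return (feature, units) for the cheapest single-feature change that
    reaches the threshold, or None if the applicant already passes."""
    gap = THRESHOLD - score(applicant)
    if gap <= 0:
        return None
    # Pick the feature with the lowest cost per score point gained.
    feature = min(WEIGHTS, key=lambda name: MANIPULATION_COST[name] / WEIGHTS[name])
    units = math.ceil(gap / WEIGHTS[feature])
    return feature, units


applicant = {"income": 60, "years_at_address": 3, "open_accounts": 4}

if __name__ == "__main__":
    print("Honest score:", score(applicant))
    print("Cheapest way to game the rule:", cheapest_gaming_move(applicant))
```

Nothing here requires breaking into a system: the published weights alone tell the applicant which number to inflate and by how much. A real decision system would be far more complex, but the incentive is the same.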



Bonus Paragraph: 20 research groups on AI ethics and safety

Plenty of research groups and initiatives, both in academia and in industry, have started thinking about the relevance of ethics and safety in AI. The best known are the following 20, in case you would like to have a look at what they are doing:

Finally, Google has just announced the People + AI Research (PAIR) initiative, which aims to advance the research and design of people-centric AI systems.




Absurd as it might seem, I believe ethics is a technical problem. Writing this post, I realized how little I know, and even understand, about these topics. It is incredibly hard to have a clear view and approach on ethics in general, let alone at the intersection of ethics, AI and technology. I didn’t even touch upon other questions that should keep AI experts up at night (e.g., unemployment, security, inequality, universal basic income, robot rights, social implications, etc.), but I will do so in future posts (any feedback would be appreciated in the meantime).

I hope your brain is melting down as mine is right now, and that some of the arguments above have stimulated some thinking or ideas about new solutions to old problems.

I am not concerned about robots taking over or Skynet terminating us all, but rather about humans improperly using technologies and tools they don’t understand. I think that the sooner we clear up our minds on these subjects, the better.


  • Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Mané, D. (2016). “Concrete Problems in AI Safety”. arXiv:1606.06565v2.
  • Dietvorst, B. J., Simmons, J. P., Massey, C. (2015). “Algorithm aversion: People erroneously avoid algorithms after seeing them err”. Journal of Experimental Psychology: General 144(1): 114–126.
  • Dietvorst, B. J., Simmons, J. P., Massey, C. (2016). “Overcoming Algorithm Aversion: People Will Use Imperfect Algorithms If They Can (Even Slightly) Modify Them”. Available at SSRN.

Bio: Francesco Corea is a Decision Scientist and Data Strategist based in London, UK.

Original. Reposted with permission.