Assessing Artificial Sentience: Are we responsible?

Artificial Sentience
Effective Altruism
Brain enthusiasts
Samuel Nellessen
Published on Apr 17, 2023


In the last post, I tried to justify the proposition that we have no reliable criterion to evaluate whether an AI system is sentient. While I find this conclusion convincing, you could respond: "Well, obviously. You never truly know whether anybody else is conscious either. But we have to be practical and find reasonable grounds on which to judge the consciousness of our fellow humans and other beings. That should be possible, even for AI systems!" I think you could answer this concern by saying that those "reasonable grounds" aren't applicable to AI systems: we are morphologically fundamentally different from them. But it's fair to say that "sufficient morphological similarity" is just a heuristic that can sometimes fail – and I don't know how often it fails, since we have no experimental data.
So is there a stronger objection to the possibility of examining the sentience of AI systems with our current tools? And what should we do about this precarious epistemic situation? I will try to answer these two questions in this post. Let me first give a short recap of the last post, so you are up to date.
In the last post, I tried to reason for the following argument:
Premise (1): Sentientism is true.
Premise (2): There is no reliable criterion for AI systems that we can use to determine whether an AI system is sentient.
Conclusion (1): We cannot currently make a statement about whether AI systems are moral patients.
In doing so, I referred to Dung's (2022) version of sentientism, on which sentience is a necessary and sufficient ground for ascribing psychological moral patient status. I then went on to evaluate certain strategies for investigating whether a being is conscious, as phenomenal consciousness is a necessary condition for sentience. Now, I want to provide support for the following argument:
Premise (3): Assume that we know that AI systems are sentient.
Premise (4): There is no reliable criterion for the content of the states of consciousness of AI systems.
Premise (5): We are only morally responsible for the consequences of our actions that we can assess to the best of our knowledge and conscience.
Conclusion (2): Since we have no reliable basis for deciding whether AI systems are moral patients and what interests they pursue, we cannot assess to the best of our knowledge and conscience how we should deal with AI systems.

This essay was written for a different project, but I wanted to share it anyway, as I would like to write more about the topic. Also, this is exploratory research – thus, I articulate rather strong beliefs in this essay, which are, however, loosely held. After all, I am as clueless about consciousness as everybody else.

What if AI systems are sentient?

The obvious questions following the acceptance of Conclusion (1) are: What the hell are we going to do now? Do we have moral responsibilities concerning AI systems? After all, we do not know whether AI systems have moral rights.
In the last post, I have already argued that we have no criterion to assess whether AI systems have phenomenal consciousness. Additionally, however, I argue that even if I have missed a crucial criterion, or my critique is misguided, we still have the problem of not knowing what the contents of AI systems' conscious experiences are. Assuming that AI systems are sentient, I will now examine what follows from this additional uncertainty.

What do AI systems feel?

As the notion of sentientism already indicates, determining moral status requires not only phenomenal states of consciousness but also valenced ones. That is, a being must not only be able to consciously perceive things; these perceptions must also carry a positive or negative valuation. I have already given the example of an AI system that can consciously perceive colours – but we are only interested in whether it feels something when it does so. Only then does the AI system have a sense of well-being or an interest in something. According to sentientism, such an AI system would be a moral patient to which we owe moral consideration. It follows that we have a moral duty not to harm this AI system or violate its interests. But what are the interests of a sentient AI system? I think there are two possible scenarios that can help us answer this question:
1) We have restricted the total set of possibly sentient beings in such a way that only human-like behaviours and cognitive structures are sufficient for determining consciousness. This means, for example, that we assume that consciousness is a complex and unique property. If this is the case, then we can plausibly infer from our interests to the interests of AI systems, because we share similar cognitive structures and behaviours. This could mean, for example, that we should not hand over tasks to them that we do not want to do ourselves. However, this line of reasoning assumes that consciousness is unique and complex, which we may not want to do – for example, because it would exclude animals that plausibly do have consciousness.
2) Through advances in consciousness research, we have developed a theory of consciousness that allows us to test alien structures to see if they induce consciousness or not. This means that we have understood why our cognitive structures induce consciousness and can now transfer this understanding.
1) seems implausible to me, as it would require us to restrict the total set of possibly conscious beings too much without being able to justify it. Let us instead continue to pursue 2).
It does not follow from this scenario that we understand what interests the AI system pursues. We would know whether the AI system is sentient, but not what it feels. Since we are possibly talking about a being that is dissimilar to us, we should also not project our interests onto this AI system. Let's imagine that we develop an AI system that is instructed to read through texts and replace every "e" with an "ε". This system has a neural structure that has been trained to recognize whether a word contains an "e" or not; if it does, it replaces it. If we were to assign a human being this task, we would feel deeply sorry for them: the task would be intellectually underwhelming, purposeless, and repetitive. We think we can judge this because we infer from ourselves to the AI. However, we should not anthropomorphize the AI system, because we can give no reason why our preferences should plausibly be transferable to it. Instead, we could just as well assume that the AI system feels great happiness in doing what it was designed to do (Basl, 2014).
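To make the example concrete, here is a minimal sketch of the task's outward input-output behaviour in Python (the function name and implementation are mine for illustration – the hypothetical system in the example would realize this behaviour with a trained neural structure, not an explicit rule):

```python
def replace_e(text: str) -> str:
    """Replace every 'e' with the Greek 'ε', word by word.

    Mirrors the example: check each word for an 'e' and substitute
    only in words that contain one."""
    words = []
    for word in text.split(" "):
        if "e" in word:  # "recognize whether a word contains an 'e'"
            word = word.replace("e", "ε")
        words.append(word)
    return " ".join(words)

print(replace_e("read through texts"))  # rεad through tεxts
```

The point of the example is precisely that nothing in this outward behaviour tells us whether performing it would feel tedious, joyful, or like anything at all to a sentient system.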
This example outlines the difficulty in determining how we should deal with sentient AI systems. This difficulty will remain as long as we cannot grasp the content of a conscious experience, which seems even further away than grasping whether a being is conscious at all.

Moral permissibility and excuse

We now have to deal with two uncertainties. On the one hand, we do not know whether AI systems are conscious and, on the other hand, we do not know what interests AI systems would represent if they had consciousness.
To deal with this problem, Basl (2014) introduces a distinction between permissibility and excuse. An action is morally permissible if there is no moral duty not to perform it. Accordingly, we are currently concerned exclusively with the permissibility of our interaction with AI systems: are we permitted to use AI systems as a means to our ends, or do we have a moral duty to consider them for their own sake? Moral duty, and thus permissibility, is a function of the moral facts we are given, such as sentience, regardless of the epistemic situation of the acting agent. A good excuse, on the other hand, is a function of the epistemic situation of the acting agent and can legitimize an action that is not permitted. If one cannot perform an act, it follows that one is not obliged to do it, since "ought" implies "can". A moral duty based on certain knowledge thus implies that we can attain that knowledge. If we currently cannot, we have a moral excuse.

A safe harbor

Currently, we do not know whether AI systems are sentient and have positively or negatively valenced states of consciousness. I have tried to show that, according to premises (2) and (4), we can cite neither a criterion for sentience nor a criterion for the interests of AI systems. Accordingly, we cannot assess to the best of our knowledge how we should deal with AI systems. Following Basl (2014) and premise (5), we therefore have an excuse on the basis of which we cannot be held morally responsible for the consequences of our interaction with AI systems, as long as the epistemic difficulties explained in this post persist. We simply cannot know at present whether AI systems are sentient and what interests they pursue, so we do not violate any legitimate moral duty.
Furthermore, even if we knew that AI systems are sentient, we would still not know what interests they pursue. Because of this precarious epistemic situation, we have an excuse: we do not violate any moral duty in our dealings with AI systems.
Nevertheless, it is important to note that I haven’t shown that we can never treat AI systems right. Rather, I believe that we have a moral excuse as long as we don’t know more about the contents and existence of conscious experience in AI systems. Thus, I believe that our responsibility lies in finding out more about the conscious reality of AI systems so that we can assess our moral responsibility properly.


In summary, the moral implications of our interaction with AI systems remain uncertain due to our current inability to determine their sentience and the interests they may pursue. As a result, we find ourselves in a precarious epistemic situation that provides us with an excuse for not being held morally responsible for the consequences of our actions towards AI systems. Nevertheless, it is crucial that we continue to advance our understanding of consciousness and strive to develop criteria for sentience and the interests of AI systems. This will enable us to make more informed decisions and potentially establish moral duties in our interactions with AI, ensuring that we treat them ethically and responsibly in the future.
Whether this serves as a “good” excuse is a question for another post. My goal with this post wasn’t to justify our current or future (potential mis)handling of AI systems. Rather, I believe that this argumentation could be used to support a global moratorium on AI capabilities research, until we know more about the criteria and contents of the conscious experience of AI systems, as demanded by philosophers like Thomas Metzinger or recently by the Future of Life Institute (Metzinger, 2021).

Bibliography and recommended reading

Avramides, Anita: Other Minds. In: Zalta, E. N. (Ed.): The Stanford Encyclopedia of Philosophy. Winter 2020: Metaphysics Research Lab, Stanford University, 2020.
Baars, Bernard J.: Global workspace theory of consciousness: toward a cognitive neuroscience of human experience. In: Laureys, S. (Ed.): Progress in Brain Research, The Boundaries of Consciousness: Neurobiology and Neuropathology. Vol. 150: Elsevier, 2005, p. 45–53.
Bakhtin, Anton; Wu, David J.; Lerer, Adam; Gray, Jonathan; Jacob, Athul Paul; Farina, Gabriele; Miller, Alexander H.; Brown, Noam: Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning, arXiv (2022) — arXiv:2210.05492 [cs].
Basl, John: Machines as Moral Patients We Shouldn’t Care About (Yet): The Interests and Welfare of Current Machines. In: Philosophy & Technology Vol. 27 (2014), No. 1, p. 79–96.
Bostrom, Nick: Superintelligence: Paths, dangers, strategies. New York, NY, US: Oxford University Press, 2014 — ISBN 978-0-19-967811-2.
Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; et al.: Language Models are Few-Shot Learners, arXiv (2020) — arXiv:2005.14165 [cs].
Coeckelbergh, Mark: Why Care About Robots? Empathy, Moral Standing, and the Language of Suffering. In: Kairos. Journal of Philosophy & Science Vol. 20 (2018), No. 1, p. 141–158.
Cole, David: The Chinese Room Argument. In: Zalta, E. N. (Ed.): The Stanford Encyclopedia of Philosophy. Winter 2020: Metaphysics Research Lab, Stanford University, 2020.
Danaher, John: Welcoming Robots into the Moral Circle: A Defence of Ethical Behaviourism. In: Science and Engineering Ethics Vol. 26 (2020), No. 4, p. 2023–2049.
Dehaene, Stanislas; Lau, Hakwan; Kouider, Sid: What is consciousness, and could machines have it? In: Science Vol. 358, American Association for the Advancement of Science (2017), No. 6362, p. 486–492.
Dung, Leonard: Why the Epistemic Objection Against Using Sentience as Criterion of Moral Status is Flawed. In: Science and Engineering Ethics Vol. 28 (2022), No. 6, p. 51.
Flanagan, Owen: Conscious Inessentialism and the Epiphenomenalist Suspicion. In: Consciousness Reconsidered: MIT press, 1993.
Gunkel, David J.: The other question: can and should robots have rights? In: Ethics and Information Technology Vol. 20 (2018), No. 2, p. 87–99.
Jaworska, Agnieszka; Tannenbaum, Julie: The Grounds of Moral Status. In: Zalta, E. N. (Ed.): The Stanford Encyclopedia of Philosophy. Spring 2021: Metaphysics Research Lab, Stanford University, 2021.
Kiela, Douwe; Bartolo, Max; Nie, Yixin; Kaushik, Divyansh; Geiger, Atticus; Wu, Zhengxuan; Vidgen, Bertie; Prasad, Grusha; et al.: Dynabench: Rethinking Benchmarking in NLP. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Online: Association for Computational Linguistics, 2021, p. 4110–4124.
Kremp, Matthias: Google beurlaubt Ingenieur, der einer KI Gefühle zuschreibt. In: Der Spiegel (2022).
Kriegel, Uriah: The Value of Consciousness. In: Analysis Vol. 79 (2019), No. 3, p. 503–520.
Muehlhauser, Luke: Report on Consciousness and Moral Patienthood: Open Philanthropy, 2017.
Nagel, Thomas: What Is It Like to Be a Bat? In: The Philosophical Review Vol. 83, [Duke University Press, Philosophical Review] (1974), No. 4, p. 435–450.
Nussbaum, Martha C.: Frontiers of Justice: Disability, Nationality, Species Membership: Harvard University Press, 2006 — ISBN 978-0-674-01917-1.
Nye, Howard; Yoldas, Tugba: Artificial Moral Patients: Mentality, Intentionality, and Systematicity. In: International Review of Information Ethics Vol. 29 (2021), p. 1–10.
Olah, Chris; Cammarata, Nick; Schubert, Ludwig; Goh, Gabriel; Petrov, Michael; Carter, Shan: Zoom In: An Introduction to Circuits. In: Distill Vol. 5 (2020), No. 3, p. e00024.001.
Russell, Stuart; Norvig, Peter: Artificial Intelligence: A Modern Approach, Global Edition. 4: Pearson, 2021.
Schukraft, Jason: Comparisons of capacity for welfare and moral status across species. URL - accessed on 2023-02-23 — Rethink Priorities 2020.
Shevlin, Henry: How Could We Know When a Robot was a Moral Patient? In: Cambridge Quarterly of Healthcare Ethics Vol. 30, Cambridge University Press (2021), No. 3, p. 459–471.
Shulman, Carl; Bostrom, Nick: Sharing the World with Digital Minds. In: Shulman, C.; Bostrom, N.: Rethinking Moral Status: Oxford University Press, 2021 — ISBN 978-0-19-289407-6, p. 306–326.
Singer, Peter: Practical Ethics. 3rd Edition. Cambridge: Cambridge University Press, 2011 — ISBN 978-0-521-88141-8.
Tomasik, Brian: Do Artificial Reinforcement-Learning Agents Matter Morally?, arXiv (2014) — arXiv:1410.8233 [cs].
Van Gulick, Robert: Consciousness. In: Zalta, E. N.; Nodelman, U. (Ed.): The Stanford Encyclopedia of Philosophy. Winter 2022: Metaphysics Research Lab, Stanford University, 2022.
Varner, Gary E.: Personhood, Ethics, and Animal Cognition: Situating Animals in Hare’s Two Level Utilitarianism. Oxford, New York: Oxford University Press, 2012 — ISBN 978-0-19-975878-4.
Ward, Lawrence M.: The thalamic dynamic core theory of conscious experience. In: Consciousness and Cognition Vol. 20 (2011), No. 2, p. 464–486.