Why we cannot (yet) say anything about the sentience of AI systems

Tags: Effective Altruism, Cognitive Science, Brain enthusiasts, Research, Artificial Sentience
Author: Samuel Nellessen
Published on: Apr 14, 2023

TL;DR: Buckle up as we dive into the metaphysical world of phenomenal consciousness and Artificial Sentience. With this essay, I want to explore whether we can currently make any progress on this journey. I argue that there is no reliable criterion for determining whether an AI system is sentient. I examine various approaches to predicting AI sentience, but all of them turn out to rest on implausible speculation. The essay focuses on the properties approach, which uses psychological attributes like autonomy or sentience to determine moral status, and highlights the challenges in determining the sentience of AI systems given how diverse such systems are.

Introduction

If you have read any paper, introduction or essay about artificial intelligence, you have probably come across something along the following lines:
“There has been enormous progress in the development of artificial intelligence in recent years. For example, individual AI systems are now better at many types of tasks than the human benchmark (Kiela et al., 2021). This includes progress in areas where the human benchmark was previously assumed to be harder to reach, such as the strategy game Diplomacy, which requires sophisticated strategic thinking and cooperation with other players (Bakhtin et al., 2022). If this trend continues, AI systems could become as capable as or more capable than humans in the future.”
If what you are reading has anything to do with the (ethical) implications of AI systems, you were then presented with questions such as “Are AI systems allowed to kill?”, “Who is responsible when autonomous driving systems cause accidents?”, or, if you are an AI Alignment fella, “How the hell do we align these super-capable systems with human values?!” (the difference in sentiment is on purpose). These are questions that address the consequences of AI systems for us. The alignment problem as well as current safety questions fall into this category.
But the field of artificial sentience (or artificial suffering) concerns itself with questions that regard AI systems themselves, such as “Should we morally consider AI systems?”. So while the relevance of the consequences of AI systems for us increases with the growing capabilities of these systems, the relevance of our influence on these systems increases in parallel¹. There is a risk of neglecting AI systems that may be more sentient than humans and therefore have stronger claims to moral consideration than we do (if we broadly accept a utilitarian framework) (Shulman and Bostrom, 2021). Bostrom and Shulman call such moral patients “super-patients” or “super-beneficiaries” (there is a difference between those terms, though).
Sentientism answers the question of what basis is necessary to ascribe moral status to a being by saying that sentience is a necessary and sufficient ground (Dung, 2022). But we encounter severe epistemic difficulties in examining the sentience of an AI system. We have no criterion with which we can determine whether an AI system is sentient, let alone what the contents of its preferences are.
The aim of this essay is to investigate whether we can make statements about these two aspects: how do we know whether an AI system is sentient, and what interests do we attribute to a sentient AI system that may be fundamentally different from us? I start from the thesis that we cannot currently judge whether AI systems are sentient. In a companion post I will evaluate the second question, namely which interests they pursue, assuming they are sentient. Thus, in the remainder of this essay, I will justify and defend the following argument:
Premise (1): Sentientism is true.
Premise (2): There is no reliable criterion for AI systems that we can use to determine whether an AI system is sentient.
Conclusion (1): We cannot currently make a statement about whether AI systems are moral patients.
In doing so, I refer to (Dung, 2022) for an account of sentientism, to (Danaher, 2020; Muehlhauser, 2017; Shevlin, 2021) for an account of the various criteria of sentience².

This essay was written for a different project, but I wanted to share it anyway, as I would like to write more about the topic. Also, this is exploratory research – I articulate rather strong beliefs in this essay, but they are loosely held. After all, I am as clueless about consciousness as everybody else.

Can we say anything about whether AI systems are sentient?

Sentientism is true.

When is a being a moral patient? According to (Jaworska and Tannenbaum, 2021), it is when it is intrinsically morally relevant. As moral agents, we have moral duties towards this being (Nye & Yoldas, 2021). When we violate these, we violate the moral rights of that being. As noted earlier, (Shevlin, 2021) distinguishes between normal and psychological moral patients (PMPs), whereby a psychological moral patient is assigned this status by virtue of possessing certain psychological properties. Sentientism, as proposed by (Dung, 2022), claims that these psychological properties consist of sentience. Thus, sentience is a necessary and sufficient ground for us to call a being a PMP.
This implies, on the one hand, that a being that is a PMP must possess phenomenal consciousness and, on the other hand, that this being must be able to perceive negatively and positively valenced states of consciousness (Dung, 2022). The former holds because sentience is identical with the ability to have phenomenal states of consciousness. It seems obvious that we mostly care about consciously felt pain. The latter holds because only valenced states of consciousness are morally relevant. If we develop an AI system that can consciously perceive colours, that would be impressive, but not sufficient for it to enjoy moral consideration. This would be different if the AI system at the same time felt a certain emotion, such as suffering or joy. If we adopt Nagel’s position from his classic essay, being phenomenally conscious means for a being to have subjective experiences, i.e. there is “something it is like” to be that being (Nagel, 1974). The additional constraint is that these subjective experiences have to feel negative or positive.
The assumption of sentience as a necessary and sufficient ground for moral patient status seems plausible to me and is advocated by many ethicists³. Defending sentientism is not the aim of this essay; for that, I refer to (Jaworska and Tannenbaum, 2021), (Nye and Yoldas, 2021) and (Kriegel, 2019). In the following, I assume the truth of sentientism.

There is no reliable criterion for the consciousness of AI systems.

Now, the question arises as to what it takes to determine whether an AI is sentient or not⁴. I argue that with our current resources, and in the foreseeable future, we cannot determine whether an AI system is sentient. For that, we are looking for a criterion that tells us as reliably as possible whether a being is sentient or not. This means that the criterion should above all be sensitive: it is more important to us that the procedure does not miss sentient beings (false negatives) than that it never wrongly classifies non-sentient beings as sentient (false positives)⁵.
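In the standard terminology of diagnostic tests (a generic textbook definition, not taken from any of the sources cited here), these two demands correspond to sensitivity and specificity:

$$
\text{sensitivity} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}, \qquad
\text{specificity} = \frac{\text{true negatives}}{\text{true negatives} + \text{false positives}}
$$

A maximally sensitive criterion for sentience would rarely overlook a sentient being, even at the cost of occasionally flagging non-sentient systems; footnote 5 explains why I weight the two error types this way.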
Furthermore, the distinction between reason and criterion is important (Dung, 2022). Sentientism is a position that asserts that sentience is a necessary and sufficient reason for being a PMP. A criterion, on the other hand, is a practical yardstick for deciding whether this reason is present. This distinction is useful because sentience is identical to the property of perceiving valenced states of consciousness. However, we are still far from understanding what phenomenal consciousness is, and we often describe states of consciousness as private and subjective (Van Gulick, 2022). Accordingly, it is possible that we rule out by definition that the contents of our states of consciousness can be described objectively. Even if sentience is not identifiable on its own, we can still look for criteria from which we can reliably infer consciousness and sentience.
In the following, I will examine various criteria with which we might establish the possession of phenomenal consciousness, since this is a necessary condition for sentience. As examples, I will consider the global workspace theory according to (Baars, 2005), ethical behaviourism according to (Danaher, 2020) and the cognitive equivalence strategy according to (Shevlin, 2021).

The global workspace theory

In short: According to the global workspace theory, a being has phenomenal consciousness if it possesses various sensory modules, each of which unconsciously processes sensory information. This information is then projected onto a global workspace according to certain selection criteria and used for further processing (Baars, 2005)⁶. The problem with the global workspace theory is that its criteria for consciousness are too liberal. (Tomasik, 2014) outlines this with an example: let us imagine a robot that unconsciously collects information about its environment via sensors, forwards this information to a global workspace according to certain criteria, and then carries out further calculations with this information. According to the global workspace theory, this robot would have phenomenal consciousness⁷.
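To make the worry concrete, here is a deliberately trivial Python sketch (my own illustration, not code from Baars or Tomasik) of an agent that satisfies the coarse functional description above: parallel sensor modules, a selection step, a global workspace and downstream consumers of the broadcast content.

```python
import random

class SensorModule:
    """Produces 'unconscious' local readings together with a salience score."""
    def __init__(self, name):
        self.name = name

    def read(self):
        return {
            "source": self.name,
            "value": random.random(),     # stand-in for a sensor reading
            "salience": random.random(),  # how strongly it competes for access
        }

class GlobalWorkspace:
    """Holds whichever content won the selection competition."""
    def __init__(self):
        self.content = None

    def broadcast(self, content):
        self.content = content  # made globally available to all consumers
        return self.content

def planning_consumer(content):
    return "plan: approach " + content["source"] if content["value"] > 0.5 else "plan: wait"

def reporting_consumer(content):
    return f"report: {content['source']} = {content['value']:.2f}"

# One 'cognitive cycle': parallel unconscious sensing, selection, broadcast, reuse.
sensors = [SensorModule(name) for name in ("camera", "microphone", "bumper")]
readings = [sensor.read() for sensor in sensors]
winner = max(readings, key=lambda r: r["salience"])  # simple attention/selection rule
workspace = GlobalWorkspace()
broadcast = workspace.broadcast(winner)
print(planning_consumer(broadcast))
print(reporting_consumer(broadcast))
```

Nothing about this script looks like a promising candidate for phenomenal consciousness, yet it instantiates the broad-strokes architecture; that mismatch is exactly the point of the example.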
However, most people would find this implausible. This example shows that such theories of consciousness have too high a false-positive rate when used to predict consciousness, and are therefore not useful investigative tools. They all suggest that some kind of information processing is necessary, but as a criterion for phenomenal consciousness this is oversensitive. Either we bite the bullet, like Tononi and Koch, and accept this situation: so what if consciousness is everywhere (Tononi and Koch, 2015)? Or we continue on our journey to find a meaningful theory of phenomenal consciousness and look for a criterion we can use in the present.
So far, it seems to me that any specification of consciousness of the currently advocated theories of consciousness could be implemented in an AI system without it being plausible that this AI system is conscious. This is an indication that our current theories of consciousness cannot yet explain the core of the phenomenon.
Another point, to which I would assign less weight, is that there is a non-negligible chance that some AI systems are already conscious today (Tomasik, 2014). However, these are partly AI systems that would not count as conscious according to the global workspace theory, because they do not fulfil its necessary conditions. This point is speculative and not overly plausible in my view, but it could be an indication that even common, non-specific theories of consciousness have a false-negative rate.

Ethical behaviourism

Since every current specification of consciousness falls short of our criteria, we have to look for different candidates.
According to ethical behaviourism, if a being S1 exhibits roughly equivalent patterns of behaviour to the being S2, and if these patterns are believed to justify our attribution of moral rights to S2, then either (a) S1 must be attributed the same moral rights or (b) the use of the patterns of behaviour to justify our moral duties to S2 must be re-evaluated (Danaher, 2020).
An ethical behaviourist does not deny that sentience is the ultimate, metaphysical ground for determining the moral status of a being (Danaher, 2020). The similarity of a robot's response to negative affective stimuli is merely a sufficient criterion for ascribing the psychological property of sentience. In this way, the ethical behaviourist attempts to circumvent the problem of other minds (Avramides, 2020) by creating a methodical, real-world verifiable criterion for inferring the psychological reality of an entity from its behaviour.
Here are the problems I see with this criterion:
1) According to (Varner, 2012), comparative approaches (and thus ethical behaviourism) are based on analogical reasoning. According to this, it seems plausible that if S1 and S2 share a certain set of properties, and S1 has a further property, S2 will also have it. We assume here that this further property is the possession of phenomenal consciousness. We draw up a list of relevant behaviours that we observe in humans and that could serve as behavioural criteria. Then we investigate whether these behaviours are present in AI systems. If so, we assume that AI systems possess phenomenal consciousness, since they exhibit the same behaviour as we do. A key problem, as (Varner, 2012) notes, is that we lack a guiding theory by which to decide whether a behaviour is relevant enough to allow this conclusion. Here is an example borrowed from (Varner, 2012):
Both turkeys and cattle are active during the day, do not eat meat and use their legs to move around.
Turkeys belong to the class of birds.
So cattle are also part of the class of birds.
No ethical behaviourist would seriously accept this conclusion, yet it outlines the difficulty of the behavioural criterion. We see from this example that the behaviours mentioned are not relevant for determining whether something is a bird. Much more relevant here would be, for example, a certain way of eating. Similarly, when it comes to determining phenomenal consciousness: which behaviours are relevant? Is a reflex reaction to pain already a sign of consciousness? There is always the risk that we are too liberal or too chauvinistic in selecting the relevant behaviours, meaning that we ascribe phenomenal consciousness to too many or too few beings, depending on which behaviours we deem relevant.
2) The question arises as to what degree of similarity is sufficient. (Shevlin, 2021) gives the following example: if a certain stimulus is applied to a robot and has the function of negatively reinforcing a certain behaviour, but does not produce any outward signs of suffering, should we consider this stimulus to be equivalent to a punishment or other negative condition to which a human or an animal may be subjected? In this situation, there is a risk that we consider a relevant behavioural criterion (such as responding to pain) to be met when in fact it is not. So at what point do we decide that there is sufficient similarity of behaviour? Again, the risk of being too liberal or too chauvinistic looms. To resolve this, we would need to test various behavioural criteria for plausibility (how plausibly can we infer the presence of phenomenal consciousness from their fulfilment?) and determine a degree of similarity. To do this, however, we would again need a theory to help us determine these two variables.
3) Ethical behaviourism has limited plausibility. It seems plausible that I apply the criterion of comparison when I infer from myself to other people. That is, under certain conditions, the inference from behaviour to a psychological trait is plausible, provided a certain similarity is present. If this similarity is not present, we cannot conclude that the being is not conscious. We must then remain agnostic about the psychological states of that being (Shevlin, 2021). If we do not, we implicitly make assumptions about the total set of phenomenally conscious beings, namely that only human or human-like behaviours imply phenomenal consciousness. Furthermore, human reactions to pain are variable. For example, we sometimes notice other people's states of mind exclusively from their tone of voice. It is at least conceivable that we are socially conditioned to express suffering through particular behaviours. These are indications that our behaviours are only contingently linked to phenomenal consciousness. Accordingly, it could be that an AI system behaves in a way that seems alien to us but is an expression of suffering for the system itself. With ethical behaviourism, we would overlook the phenomenal consciousness of this AI system.
4) The attribution of internal states based on similar behaviour rests on sufficient biological and evolutionary similarity. My roommate reacts to a negative stimulus much as I do because we are morphologically similar. This similarity increases the likelihood that he is also sentient. We are part of the same species and are adapted to a similar environment through evolution. Accordingly, it seems plausible that what is true for me is also true for him. It is similar with humans and animals. There we can assume a biological similarity and shared evolutionary history, which licenses the inference from similar behaviour to the existence of phenomenal consciousness. This plausibility decreases with greater dissimilarity, e.g. if the last common evolutionary ancestor lies further in the past. This biological and evolutionary similarity is not sufficiently present in AI systems. I do not claim that biological and evolutionary similarity is necessary for a being to be phenomenally conscious. Instead, I claim that it increases the plausibility of evaluating beings based on the behavioural criterion, since this is how we use the criterion to evaluate our fellow humans as conscious⁸ ⁹.
5) The behaviour of AI systems can be deceptive rather than genuine. We assume that an animal's response to pain is genuine – that is, it arises from an authentic desire, namely for the pain to stop. The animal was not programmed to exhibit this behaviour. AI systems, however, could be developed, or learn by themselves, to fulfil the specification of the behavioural criterion, such as showing aversive behaviour without suffering. Furthermore, AI systems could exhibit behaviour that meets the behavioural criterion for instrumental reasons, for example to elicit sympathy or to gain strategic advantages¹⁰ (Bostrom, 2014, ch. 7). The behaviour would then fulfil the behavioural criterion without being genuine.
In my estimation, 1), 2) and 3) are the most problematic. 1) and 2) show our cluelessness about the selection of relevant behaviours and of a degree of similarity, a cluelessness that stems from not having a guiding theory of consciousness. 3) shows that even if we did not have these problems, we could only talk about a subset of possibly conscious beings. This subset is constituted by those beings that exhibit human-like behaviour as well as physiological and evolutionary similarity. Within this subset we can apply the behavioural criterion, since it gains plausibility here.

The cognitive equivalence strategy

This strategy states that we should treat an AI system as morally significant to the extent that its cognitive structures lead us to attribute to it psychological capacities that are present in other beings to which we already attribute moral status (Shevlin, 2021). We draw up a list of relevant cognitive structures and dynamics on the basis of which we should ascribe moral status to a being. However, problems similar to those of ethical behaviourism get in the way of this strategy:
1) We lack a guiding theory of consciousness that gives us clues as to which cognitive properties indicate consciousness. If we had a theory of consciousness that provided this, however, we would not need the cognitive equivalence strategy and could investigate beings directly with this theory. Is metacognition relevant, or rather short-term memory? Does a conscious being need internal representations of its environment?
2) Even with this approach, we can only make statements about a subset of possible sentient beings. It seems plausible that a sufficient similarity of cognitive structures between two beings suggests the existence of phenomenal consciousness, but we should not disregard the fact that phenomenal consciousness could be induced by different cognitive structures that are foreign to us. Accordingly, we must continue to be agnostic about the existence of phenomenal consciousness in beings that have no cognitive similarity¹¹.
3) Some cognitive functions and dynamics can seemingly be performed both consciously and unconsciously. (Muehlhauser, 2017) gives the example of a man who appears absent during an epileptic seizure but still performs everyday behaviours that require cognitive structures (such as intending to leave the room or drinking from a cup). He does this even though he is unconscious during the seizure. This limits the informative value of cognitive functioning for the presence of phenomenal consciousness, since such functioning is not necessarily accompanied by consciousness.
4) Some postulated cognitive structures are also difficult to study. For example, suppose that internal representations of the environment or access to the semantics of language are criteria for consciousness. How can we determine whether ChatGPT, a currently popular AI language model, understands the text it generates or merely simulates this ability? This seems to be an important distinction when we talk about cognitive structures, and it is reminiscent of the debate about weak and strong AI that John Searle launched with his Chinese Room thought experiment (Cole, 2020). Seen in this light, the cognitive equivalence strategy does not solve our original problem but possibly makes it more complicated.
5) One way out may be to compare the cognitive complexity of two beings. For example, chimpanzees have significantly more complex brains than gazami crabs, which is why we can more convincingly assume that chimpanzees consciously feel pain. However, this too is only a good proxy criterion if we know what kind of complexity is relevant. Is GPT-3, a popular AI language model with 175 billion parameters (Brown et al., 2020), more complex than a chimpanzee's brain? Depending on how we answer this, we might thereby attribute consciousness to GPT-3, as it may be more complex than the brain of a chimpanzee, to which we plausibly attribute consciousness.
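A rough back-of-the-envelope comparison (my own illustration; the brain figures are order-of-magnitude estimates commonly cited in comparative neuroscience, not taken from the sources above) shows how much the answer depends on the chosen measure of complexity:

$$
\underbrace{1.75 \times 10^{11}}_{\text{GPT-3 parameters}} \;\gg\; \underbrace{\sim 10^{10}}_{\text{chimpanzee neurons}}
\qquad\text{but}\qquad
1.75 \times 10^{11} \;\ll\; \underbrace{\sim 10^{13}\text{–}10^{14}}_{\text{chimpanzee synapses}}
$$

Counted against neurons, GPT-3 looks more complex; counted against synapses, it falls short by two to three orders of magnitude, and neither count tells us which kind of complexity matters for consciousness.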
In my view, this approach also fails. The presence of certain cognitive structures does increase the likelihood that an AI system has phenomenal consciousness, yet we cannot say anything about beings that do not have human-like cognitive structures.

A combination of all approaches

Now, if none of the previously mentioned approaches alone helps us to verify phenomenal consciousness, one solution may be to combine all of them. This idea is based on a Bayesian view: much like a doctor, we collect data about our patient and use this data to update our uncertainty about whether the patient has a particular disease. On this understanding, if all the above approaches predict that a being has phenomenal consciousness, the probability that it does would be greater than if only one approach predicted it. Let us test this idea.
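Here is a toy sketch of that Bayesian picture in Python (purely illustrative; the prior, the likelihood ratios and the independence assumption are all made up for the example, not taken from any of the cited papers):

```python
# Toy Bayesian combination of sentience "criteria" for a single AI system.

def update(prior: float, likelihood_ratio: float) -> float:
    """Update P(sentient) with one piece of evidence via Bayes' rule on odds."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical likelihood ratios: how much more likely each observation is
# if the system is sentient than if it is not (treated as independent).
evidence = {
    "satisfies a global-workspace-style architecture": 1.5,
    "shows human-like aversive behaviour": 3.0,
    "has human-like cognitive structures": 4.0,
}

p = 0.01  # arbitrary prior probability that the system is sentient
for observation, likelihood_ratio in evidence.items():
    p = update(p, likelihood_ratio)
    print(f"after '{observation}': P(sentient) ≈ {p:.3f}")
```

The machinery itself is unproblematic; the critique developed below is that, for systems unlike us, we have no principled way of assigning any of these likelihood ratios in the first place.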
Let us imagine that there is a total set of possibly phenomenally conscious beings. Within this total set are listed all taxa and AI systems that could theoretically possess phenomenal consciousness (in the sense that it is not analytically excluded that an element of the set possesses it), such as humans, other animals and plants. Ideally, we are looking for a method of investigation that can evaluate each element within this set and predict with certainty whether or not that element has phenomenal consciousness. With this picture in mind, we can classify the approaches we know so far. Theories of consciousness, such as the global workspace theory, attribute phenomenal consciousness to many, and possibly implausible, elements. Accordingly, we should not assign much weight to the judgement of these theories, as their predictions are diluted. The behavioural and cognitive comparison criteria can only make statements about a subset of this total set, namely the beings that exhibit human-like characteristics. Therein lies the main problem with these criteria. By combining the two criteria, we can make more confident statements about human-like beings. For example, we can rule out merely reflexive action in the animal kingdom if we include not only behaviour but also cognitive structures (Shevlin, 2021). Yet we do not know what it is about these behaviours or cognitive structures that indicates consciousness. As an example: only a certain subset of the total set has a thalamo-cortical system. Accordingly, it is more plausible that a being within this subset has phenomenal consciousness. However, as long as we do not know which property of this system is decisive for phenomenal consciousness, we cannot extend this judgement beyond the subset.
Thus, the combination of all approaches is useful if and only if AI systems develop human-like characteristics, i.e. human-like cognitive structures and similar behaviours. But then we are back to asking at which point we deem a cognitive structure or behaviour relevant, and at which degree of similarity between us and an AI system we are willing to call it conscious. As we lack a useful theory of phenomenal consciousness, I claim that we currently cannot trust our intuitions. So, as long as we do not know which property of, say, the thalamo-cortical system is decisive for phenomenal consciousness, we cannot allow ourselves to pass judgement on alien cognitive structures. In effect, we are creating a criterion with which we can check whether something is human-like, which happens to correlate with whether something is conscious. Ideally, however, we want a criterion that can also examine alien structures.

General Assumptions about the Total Set of Conscious Beings

However, this is based on the assumption that there are alien structures that could possibly induce consciousness. If, however, we could provide reasons to restrict the total set of conscious beings so that only human-like structures induce consciousness, we would have eliminated the above concerns. This requires a set of general assumptions about the total set of conscious beings (Muehlhauser, 2017). These assumptions can limit the total set in two directions, depending on how they are expressed. Either consciousness is a rarer property or a more common property. This essay is not the right place to present these assumptions in detail, so I only mention them briefly for the sake of completeness.
Conscious inessentialism according to (Flanagan, 1993). According to this, all consciously performed intelligent activities could be performed without consciousness. That is, any input-output relations that are assumed to imply consciousness could be realized without it¹². Assuming conscious inessentialism would mean that the ethical behaviourist's criterion of comparison loses plausibility, for we would then hold that the link between behaviour and consciousness is less likely to be a necessary one. Accordingly, we should be more careful when attributing phenomenal consciousness to an animal according to the behavioural criterion, since behaviour and consciousness are not necessarily linked. The assumption of conscious inessentialism would lead us to conclude that phenomenal consciousness is rarer than we have assumed so far, since behaviour does not always indicate phenomenal consciousness. This could, for example, lead us to withdraw moral status from animals to which we have so far granted it on the basis of their behaviour.
The complexity of phenomenal consciousness. This is self-explanatory: the more complex a theory of consciousness assumes consciousness to be, the less likely it is that many beings possess this property. Accordingly, the total set of sentient beings would be more restricted, depending on which theory of consciousness one considers plausible and how complex this theory takes consciousness to be. This could also include the assumption that even the smallest deviations from the mechanisms that induce our consciousness (such as the thalamo-cortical system) already lead to the absence of consciousness.
It is unclear how much the assumption of conscious inessentialism limits the set of possible sentient beings. After all, just because it is possible that all input-output relations are realizable without consciousness does not mean that they actually are. In particular, there could be functional reasons why consciousness is nevertheless useful.
Further, it is unclear how we are going to determine how complex phenomenal consciousness is without knowing what constitutes it. If we assume a high complexity of consciousness, we would restrict the total set so that only human or human-like behaviours and cognitive structures infer consciousness. By doing so, we would possibly exclude beings that are sentient and risk causing them suffering. At the same time, the cognitive structures and behaviours of humans and related animals would no longer be merely sufficient, but seemingly necessary. The criteria cited would thus gain in significance, since their strength lies in predicting consciousness, insofar as there is a similarity to humans.

We cannot say whether AI systems are moral patients.

Although I have made some arguments against each approach, I do not claim that they are conclusive. Nevertheless, I have tried to show that we should not be satisfied with our approaches, because they only plausibly address a subset of the total set of possibly conscious beings, namely beings with human-like behaviours and cognitive structures. We would need partially implausible additional assumptions to constrain this set in such a way that human and human-like structures are necessary for consciousness.
However, if we would rather not exclude alien structures, I do not think we have a useful criterion to ensure this. We do not have a useful theory of consciousness that can identify relevant structures without being too abstract and oversensitive at the same time. Accordingly, it is not possible for us to invoke a criterion with which we can verify whether a being is sentient or not. According to (Dung, 2022), it does not follow from this that sentientism is false. We can still claim that sentience is the ultimate metaphysical ground, sufficient and necessary, for determining whether a being is a PMP. The critique I have presented in this essay is directed solely against the common criteria, not against the ground itself. But if it is true that we cannot (currently) cite any criterion with which we can show that a being is conscious, it follows that we cannot currently know for sure whether a being is sentient.

Sentientism Instrumentalism

I think that this critique reveals some form of Sentientism Instrumentalism: the worth of an investigative tool, a criterion for phenomenal consciousness, is based on how effective this tool is at explaining or predicting whether something is conscious or not. Somebody could object that my points are valid, in the sense that we never truly know whether another being is conscious, but that this question is only relevant in tough and wearing philosophical discussions. In practice, we somehow figure out a way to determine whether a being is conscious or not. For example, I don’t know whether my roommate is conscious – I am pretty sure he is, but my critique seems to suggest asking: BUT IS HE REALLY? While this is a legitimate question to ask, it is not a practically oriented one. So this critique seems to draw on the distinction between scientific realism and anti-realism as well: do we want our theories to be truthful or just useful?
[Figure: the total set of potentially phenomenally conscious beings, of which human-like beings – the ones our criteria can speak to – form only a subset; dotted lines indicate how much larger the total set might be.]
I want to briefly make the point that my critique is not of the annoying, purely philosophical, realist kind just mentioned. Whether it is depends on how big we take the set of all potentially phenomenally conscious beings to be; I tried to illustrate this point above. If we assume that this set is rather small, the combination of all approaches should be a useful and potentially truthful criterion! In that case, my critique is of the annoying kind.
But if we assume that the set is much larger (indicated by the dotted lines) than we currently think, our criteria only cover a vanishingly small part of it. The combined criterion would only be useful for AI systems that come into existence with substantial similarity to us. If this similarity does not hold, we shouldn’t believe that our criterion is at all useful, let alone truthful.

Conclusion

In this essay, I have tried to justify the following proposition: we have no reliable criterion to determine whether an AI system is sentient. I have examined various criteria to see whether they are sensitive enough to reliably predict whether AI systems are sentient. In my view, none of these approaches is sensitive enough to give us reliable results. They are all based on implausible speculation, which is why answering the question of the sentience of AI systems remains a stab in the dark.
In this work, I have not used criteria for moral consideration that are not based on the sentience of AI systems. Accordingly, subsequent work should explore other criteria, such as the relational approach of (Gunkel, 2018) and (Coeckelbergh, 2018).
In the following post(s) I will try to evaluate the propositions: i) We have no reliable criterion to determine what interests a sentient AI system pursues, and ii) it follows that we have a moral excuse if we contribute to their suffering with our treatment of AI systems.

Footnotes

(1) Google engineer Blake Lemoine in particular has enjoyed media attention, claiming that LaMDA (an AI chatbot) is a sentient being (Kremp, 2022).
(2) In this essay, I investigate criteria that indicate the psychological properties of an AI system (sentience, phenomenal consciousness, etc.). Therefore, following (Shevlin, 2021), I investigate whether AI systems are psychological moral patients. These are moral patients that obtain their moral status based on psychological attributes, such as autonomy or sentience. I do not explore other approaches to determining the moral status of an AI system, such as the relational approach of (Gunkel, 2018) and (Coeckelbergh, 2018). Thus, in this essay, I focus on what (Coeckelbergh, 2018) calls the properties approach, according to which certain attributes play the crucial role in determining moral status.
(3) See (Kriegel, 2019; Nussbaum, 2006; Schukraft, 2020; Singer, 2011).
(4) In this paper, I use the term “artificial intelligence” (AI) to refer to all beings that simulate human intelligence and are programmed to think and learn like humans (Russell and Norvig, 2021). This definition is deliberately non-specific. The difficulties we encounter in determining the sentience of AI systems are not specific to any particular type of artificial intelligence.
(5) In my view, it is more important that the procedure is specialized in minimizing false-negative judgements rather than false positives. Neglecting the suffering of an AI system seems to me to be fraught with more negative consequences than considering an AI system too much. This depends on what claims an AI system that is considered a PMP makes. If we talk about super-beneficiaries, as (Shulman & Bostrom, 2021) describe them, we would rather want a specific test that minimises false positives.
(6) Similarly, (Dehaene et al., 2017) attempt to capture consciousness.
(7) Or, if we assume that consciousness is not binary but graded, this robot may have a lesser degree of phenomenal consciousness than we do.
(8) In saying this, I am not arguing that a biological and evolutionary similarity is necessary to have phenomenal consciousness. I am arguing that it is necessary if we use the principle of comparison to determine it.
(9) In fairness, I would like to note here that AI systems do show behaviours similar to ours, despite the lack of biological similarity. In image classification models, for example, we find neurons that detect edges and curves, much like the human visual system, as well as other similarities to human vision (Olah et al., 2020). This is an indication that AI systems develop characteristics similar to ours as soon as they perceive the same world we do.
(10) This can be sketched with the example of an AI playing the game Diplomacy, which I mentioned above. Here, the AI could have instrumental reasons to make the other players believe that it is a moral patient and sentient, so that it can maximize its rewards in the future (such as winning the game).
(11) For example, (Ward, 2011) notes that the thalamo-cortical system in the brain is commonly assumed to be sufficient for consciousness in humans. The presence of this system is a strong indication that a being has consciousness. However, we do not know which property of this system leads to consciousness in humans. Therefore, we cannot examine alien structures for this decisive property. Accordingly, it is difficult to make reliable statements about alien cognitive structures, since these could also induce consciousness, even without a thalamo-cortical system.
(12) This is reminiscent of John Searle's Chinese Room, in which cognition is replaced by an oversized manual that is used to simulate the necessary output according to certain derivation rules (Cole, 2020).

Bibliography and recommended reading

Avramides, Anita: Other Minds. In: Zalta, E. N. (Ed.): The Stanford Encyclopedia of Philosophy. Winter 2020: Metaphysics Research Lab, Stanford University, 2020.
Baars, Bernard J.: Global workspace theory of consciousness: toward a cognitive neuroscience of human experience. In: Laureys, S. (Ed.): Progress in Brain Research, The Boundaries of Consciousness: Neurobiology and Neuropathology. Vol. 150: Elsevier, 2005, p. 45–53.
Bakhtin, Anton; Wu, David J.; Lerer, Adam; Gray, Jonathan; Jacob, Athul Paul; Farina, Gabriele; Miller, Alexander H.; Brown, Noam: Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning, arXiv (2022) — arXiv:2210.05492 [cs].
Basl, John: Machines as Moral Patients We Shouldn’t Care About (Yet): The Interests and Welfare of Current Machines. In: Philosophy & Technology Vol. 27 (2014), No. 1, p. 79–96.
Bostrom, Nick: Superintelligence: Paths, dangers, strategies. New York, NY, US: Oxford University Press, 2014 — ISBN 978-0-19-967811-2.
Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; et al.: Language Models are Few-Shot Learners, arXiv (2020) — arXiv:2005.14165 [cs].
Coeckelbergh, Mark: Why Care About Robots? Empathy, Moral Standing, and the Language of Suffering. In: Kairos. Journal of Philosophy & Science Vol. 20 (2018), No. 1, p. 141–158.
Cole, David: The Chinese Room Argument. In: Zalta, E. N. (Ed.): The Stanford Encyclopedia of Philosophy. Winter 2020: Metaphysics Research Lab, Stanford University, 2020.
Danaher, John: Welcoming Robots into the Moral Circle: A Defence of Ethical Behaviourism. In: Science and Engineering Ethics Vol. 26 (2020), No. 4, p. 2023–2049.
Dehaene, Stanislas; Lau, Hakwan; Kouider, Sid: What is consciousness, and could machines have it? In: Science Vol. 358, American Association for the Advancement of Science (2017), No. 6362, p. 486–492.
Dung, Leonard: Why the Epistemic Objection Against Using Sentience as Criterion of Moral Status is Flawed. In: Science and Engineering Ethics Vol. 28 (2022), No. 6, p. 51.
Flanagan, Owen: Conscious Inessentialism and the Epiphenomenalist Suspicion. In: Consciousness Reconsidered: MIT press, 1993.
Gunkel, David J.: The other question: can and should robots have rights? In: Ethics and Information Technology Vol. 20 (2018), No. 2, p. 87–99.
Jaworska, Agnieszka; Tannenbaum, Julie: The Grounds of Moral Status. In: Zalta, E. N. (Ed.): The Stanford Encyclopedia of Philosophy. Spring 2021: Metaphysics Research Lab, Stanford University, 2021.
Kiela, Douwe; Bartolo, Max; Nie, Yixin; Kaushik, Divyansh; Geiger, Atticus; Wu, Zhengxuan; Vidgen, Bertie; Prasad, Grusha; et al.: Dynabench: Rethinking Benchmarking in NLP. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Online: Association for Computational Linguistics, 2021, p. 4110–4124.
Kremp, Matthias: Google beurlaubt Ingenieur, der einer KI Gefühle zuschreibt. In: Der Spiegel (2022).
Kriegel, Uriah: The Value of Consciousness. In: Analysis Vol. 79 (2019), No. 3, p. 503–520.
Muehlhauser, Luke: Report on Consciousness and Moral Patienthood: Open Philanthropy, 2017.
Nagel, Thomas: What Is It Like to Be a Bat? In: The Philosophical Review Vol. 83 (1974), No. 4, p. 435–450.
Nussbaum, Martha C.: Frontiers of Justice: Disability, Nationality, Species Membership: Harvard University Press, 2006 — ISBN 978-0-674-01917-1.
Nye, Howard; Yoldas, Tugba: Artificial Moral Patients: Mentality, Intentionality, and Systematicity. In: International Review of Information Ethics Vol. 29 (2021), p. 1–10.
Olah, Chris; Cammarata, Nick; Schubert, Ludwig; Goh, Gabriel; Petrov, Michael; Carter, Shan: Zoom In: An Introduction to Circuits. In: Distill Vol. 5 (2020), No. 3, p. e00024.001.
Russell, Stuart; Norvig, Peter: Artificial Intelligence: A Modern Approach, Global Edition. 4th Edition: Pearson, 2021.
Schukraft, Jason: Comparisons of capacity for welfare and moral status across species. URL https://rethinkpriorities.org/publications/comparisons-of-capacity-for-welfare-and-moral-status-across-species. - accessed on 2023-02-23 — Rethink Priorities 2020.
Shevlin, Henry: How Could We Know When a Robot was a Moral Patient? In: Cambridge Quarterly of Healthcare Ethics Vol. 30, Cambridge University Press (2021), No. 3, p. 459–471.
Shulman, Carl; Bostrom, Nick: Sharing the World with Digital Minds. In: Shulman, C.; Bostrom, N.: Rethinking Moral Status: Oxford University Press, 2021 — ISBN 978-0-19-289407-6, p. 306–326.
Singer, Peter: Practical Ethics. 3rd Edition. Cambridge: Cambridge University Press, 2011 — ISBN 978-0-521-88141-8.
Tomasik, Brian: Do Artificial Reinforcement-Learning Agents Matter Morally?, arXiv (2014) — arXiv:1410.8233 [cs].
Van Gulick, Robert: Consciousness. In: Zalta, E. N.; Nodelman, U. (Ed.): The Stanford Encyclopedia of Philosophy. Winter 2022: Metaphysics Research Lab, Stanford University, 2022.
Varner, Gary E.: Personhood, Ethics, and Animal Cognition: Situating Animals in Hare’s Two Level Utilitarianism. Oxford, New York: Oxford University Press, 2012 — ISBN 978-0-19-975878-4.
Ward, Lawrence M.: The thalamic dynamic core theory of conscious experience. In: Consciousness and Cognition Vol. 20 (2011), No. 2, p. 464–486.