Sentience is all the rage these days. With large language models (LLMs) based on deep learning neural networks, the question-answering behavior of these systems takes on a curious approximation of talking with a smart person. Recently a member of Google’s AI team was fired after declaring one of their systems sentient. His offense? Violating public disclosure rules. I and many others who have a firm understanding of how these systems work (by predicting the next word from previous productions crossed with the question’s token stream) are quick to dismiss the claims of sentience. But what does sentience really amount to, and how can we determine whether a machine becomes sentient?
Note that there are those who differentiate sentience (the ability to have feelings) from sapience (the ability to have thoughts) and consciousness (some private, subjective, phenomenal sense of self). I am willing to blend them together a bit, since the topic here isn’t narrowly trying to address the ethics of animal treatment, for example, where the distinction can be useful.
First we have the “imitation game” Turing-test-style approach to the question of how we might ever determine whether a machine has become sentient. If a remote machine can fool a human into believing it is a person, it must be as intelligent as a person and therefore sentient, as we presume of people. But this is a limited goal line. If the interaction covers only a limited domain, like solving your cable internet installation problems, we don’t think of that as a sentient machine. Even across the larger domain of open-ended question answering, if the human never hits upon the kind of revealing error that a machine might make but a human would not, we remain unconvinced that the target is sentient.
Historically, of course, we have never had to worry about this issue. Human beings are sentient, and other animals show gradations of intelligence, but not the way human beings are intelligent. Even if we try to interact with a person and they don’t come across as particularly responsive, we still assume they are sentient. Maybe they don’t speak the language or are learning impaired. Maybe they are just a sullen teenager.
We can come back to the “hard problem of consciousness,” but with the additional caveat that we don’t understand what the mechanisms of consciousness actually consist of. A deep learning neural network (DLNN) certainly sounds promising since it has “neural” in its name, but even that should be considered suspect. The model of neurons that is actually used is very far removed from the looped feedback structures, rhythmic patterns of firing, and complex neurotransmitter modulations that we see in biological systems. Indeed, there were arguments in the machine learning community dating to the 1960s that artificial neural networks were ridiculously hard ways to solve simple problems. Solving the scalability of training for these networks is perhaps the critical achievement that underlies the success of LLMs and certain image- and video-processing tasks.
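To make that gap concrete, here is a minimal sketch, in Python with NumPy, of the “neuron” that deep learning actually uses: a weighted sum of inputs pushed through a fixed nonlinearity. The function name and the numbers are illustrative only; the point is what is absent, namely spiking, rhythm, feedback loops, and neurotransmitter dynamics.

```python
import numpy as np

def artificial_neuron(inputs, weights, bias):
    """The 'neuron' of deep learning: a weighted sum of inputs squashed by a
    fixed nonlinearity (here a ReLU). No spikes, no timing, no feedback loop,
    no neurotransmitter modulation -- just arithmetic."""
    activation = np.dot(weights, inputs) + bias
    return max(0.0, activation)  # ReLU nonlinearity

# Illustrative values only: three inputs with hand-picked weights.
x = np.array([0.2, -1.0, 0.5])
w = np.array([0.7, 0.1, -0.3])
print(artificial_neuron(x, w, bias=0.05))
```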
I’ve previously suggested, drawing on ideas from Hacker and others, that the very existence of a problem of consciousness may be a semantic error. But even if the terminology and numinous ruminations are misplaced, we still need a way forward to understanding what sentience and consciousness really amount to. A standard way of conceptualizing these ideas is that our brain and cognition consist of modules, built on our neural systems, and that these modules monitor or perceive one another. This seems an obvious materialist approach to the question of sentience. Animals demonstrate planning and memory, so they clearly build mental models that allow for effective survival. The range and scale of these models vary with brain size and other factors involving fitness in their ecosystems. There doesn’t appear to be anything different about human brain structure or capabilities except that we are more self-aware, more communicative, and rise to the level of calling ourselves sentient. Whether other creatures deserve that appellation is a matter of debate.
What do these models look like?
The standard approach is to simply consider the brain as computing a model of its contents and its states of attention to internal and external stimuli. This is what awareness amounts to, and what we can label consciousness or sentience. There are some testable ideas built into models like attention schema theory, which is always helpful, and there are really no competing ways of addressing the problem of consciousness in a methodologically materialist framework. Well, there are some exotic candidates like Penrose and Hameroff’s Orch-OR quantum ideas, but they don’t really address how attention and sentience come about. They are instead just asking whether quantum phenomena might be a substrate for whatever is going on in our heads (and some recent experiments suggest perhaps not).
There are attention mechanisms in some LLMs. In fact, the mechanism of attention in neural networks was designed to solve a basic but somewhat abstract problem with these networks. When a stimulus is presented to a classic ANN, the history of previous presentations is apparent only in the training effects on the weights. There is no state that reflects the influence of, for instance, previous words on the current word. There are several ways to remedy this. For instance, the network can be presented with groups of words together at the same time, rolling that group forward from presentation to presentation. Or there can be an attention mechanism that maintains a state from the last presentation within the network itself.
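As a rough sketch of that second remedy, here is the core of scaled dot-product attention, the mechanism used in transformer-style LLMs, written in Python with NumPy. The shapes and values are illustrative; in a real model the query, key, and value vectors come from learned projections, and many such layers are stacked.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each position forms a query that is scored against the keys of every
    position in the context; the scores become softmax weights over the values,
    so earlier tokens influence the current one as explicit state rather than
    only through weights fixed during training."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the context
    return weights @ V                                   # weighted mixture of values

# Illustrative only: a "context" of 4 tokens, each embedded in 8 dimensions.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
output = scaled_dot_product_attention(tokens, tokens, tokens)
print(output.shape)  # (4, 8): one attended representation per token
```

The point for the argument here is that the attention weights act as a per-presentation state: each token’s representation is recomputed as a mixture of the other tokens currently in view, rather than that influence being buried in weights learned long ago.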
Thus we see the beginnings of something similar to the role of attention and awareness in these recent brain models, and the modularity of DLNNs means that this schema can be expanded. Perhaps the missing piece, however, is that LLMs lack something that may never be inferred from predicting strings of text: the physicality of human evolution. The metaphors we use are often physical (“the top of my head”). Our sense of separation from others, and the environmental signals we receive, guide how we feel awareness. The mental machinery built from play and physical interaction is the means for understanding folk psychology and naive physics.
Ultimately, these schematizations of body and awareness might be the missing and essential ingredient for a general intelligence to begin to approach what we expect from something that is sentient. Linguistic prediction, even with attention mechanisms, may never be capable of modeling the physical nature of the events described in the text.