Sentience is Physical

Sentience is all the rage these days. With large language models (LLMs) based on deep learning neural networks, question-answering behavior of these systems takes on curious approximations to talking with a smart person. Recently a member of Google’s AI team was fired after declaring one of their systems sentient. His offense? Violating public disclosure rules. I and many others who have a firm understanding of how these systems work—by predicting next words from previous productions crossed with the question token stream—are quick to dismiss the claims of sentience. But what does sentience really amount to and how can we determine if a machine becomes sentient?

Note that there are those who differentiate sentience (able to have feelings) from sapience (able to have thoughts) and consciousness (some private, subjective phenomenal sense of self). I am willing to blend them together a bit since the topic here isn’t narrowly trying to address the ethics of animal treatment, for example, where the distinction can be useful.

First we have the “imitation game” Turing test-style approach to the question of how we might ever determine if a machine becomes sentient. If a remote machine can fool a human into believing it is a person, it must be as intelligent as a person and therefore sentient, as we presume people are. But this is a limited goal line. If the interaction is only over a limited domain like solving your cable internet installation problems, we don’t think of that as a sentient machine. Even against a larger domain of open-ended question answering, if the human doesn’t hit upon a revealing kind of error that a machine might make that a human would not, we remain unconvinced that the target is sentient.…

Wordle and the Hard Problem of Philosophy

I occasionally do Wordles at the New York Times. If you are not familiar, the game is very simple. You have six chances to guess a five-letter word. When you make a guess, letters that are in the correct position turn green. Letters that are in the word but in the wrong position turn yellow. The mental process for solving them is best optimized by choosing a word initially that has high-frequency English letters, like “notes,” and then proceeding from there. At some point in the guessing process, one is confronted with anchoring known letters and trying to remember words that might fit the sequence. There is a handy virtual keyboard displayed below the word matrix that shows you the letters in black, yellow, green, and gray that you have tried, that are required, that are fit to position, and that remain untested, respectively. After a bit, you start to apply little algorithms and exclusionary rules to the process: What if I anchor an S at the beginning? There are no five-letter words that end in “yi” in English, etc. There is a feeling of working through these mental strategies and even a feeling of green and yellow as signposts along the way.

I decided this morning to write the simplest one-line Wordle helper I could and solved the puzzle in two guesses:

Sorry for the spoiler if you haven’t gotten to it yet! Here’s what I needed to do the job: a five-letter word list for English and a word frequency list for English. I could have derived the first from the second but found the first first, here. The second required that I log into Kaggle to get a good searchable CSV list.…
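The filtering idea behind a helper like this can be sketched in a few lines of Python. This is not the one-liner from the post, just a minimal illustration under stated assumptions: the word list and frequency table below are tiny made-up stand-ins for the real files, and the names `feedback`, `candidates`, and `rank` are hypothetical.

```python
from collections import Counter

def feedback(guess: str, answer: str) -> str:
    """Return Wordle-style feedback: 'g' green, 'y' yellow, '-' gray."""
    result = ["-"] * len(guess)
    remaining = Counter()
    # First pass: mark greens and count the answer letters left unmatched.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            remaining[a] += 1
    # Second pass: mark yellows while unmatched copies of the letter remain.
    for i, g in enumerate(guess):
        if result[i] == "-" and remaining[g] > 0:
            result[i] = "y"
            remaining[g] -= 1
    return "".join(result)

def candidates(words, history):
    """Keep words that would have produced every feedback pattern seen so far."""
    return [w for w in words if all(feedback(g, w) == fb for g, fb in history)]

def rank(words, freq):
    """Order candidates by frequency, most common first."""
    return sorted(words, key=lambda w: freq.get(w, 0), reverse=True)

# Toy data invented for illustration; a real run would load both lists from files.
WORDS = ["notes", "stone", "onset", "tones", "steno", "those"]
FREQ = {"notes": 50, "stone": 80, "onset": 20, "tones": 30, "steno": 1, "those": 900}

# Pretend the hidden word is "stone" and the first guess was "notes".
history = [("notes", feedback("notes", "stone"))]
print(rank(candidates(WORDS, history), FREQ))
```

Each guess shrinks the candidate list to the words consistent with the colors seen so far, and the frequency table simply pushes common words to the front, since puzzle answers tend to be ordinary vocabulary.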

We Are Weak Chaos

Recent work in deep learning networks has been largely driven by the capacity of modern computing systems to compute gradient descent over very large networks. We use gaming cards with GPUs that are great for parallel processing to perform the matrix multiplications and summations that are the primitive operations central to artificial neural network formalisms. Conceptually, another primary advance is the pre-training of networks as autocorrelators that helps with smoothing out later “fine tuning” training programs over other data. There are some additional contributions that are notable in impact and that reintroduce the rather old idea of recurrent neural networks, networks with outputs attached back to inputs that create resonant kinds of running states within the network. The original motivation of such architectures was to emulate the vast interconnectivity of real neural systems and to capture a more temporal appreciation of data where past states affect ongoing processing, rather than a pure feed-through architecture. Neural networks are already nonlinear systems, so adding recurrence just ups the complexity of trying to figure out how to train them. Treating them as black boxes and using evolutionary algorithms was fashionable for me in the 90s, though the computing capabilities just weren’t up for anything other than small systems, as I found out when chastised for overusing a Cray at Los Alamos.

But does any of this have anything to do with real brain systems? Perhaps. Here’s Toker et al., “Consciousness is supported by near-critical slow cortical electrodynamics,” in Proceedings of the National Academy of Sciences (with the unenviable acronym PNAS). The researchers and clinicians studied the electrical activity of macaque and human brains in a wide variety of states: epileptics undergoing seizures, macaque monkeys sleeping, people on LSD, those under the effects of anesthesia, and people with disorders of consciousness.…

Triangulation Machinery, Poetry, and Politics

I was reading Muriel Rukeyser’s poetry and marveling at some of the lucid yet novel constructions she employs. I was trying to avoid the grueling work of comparing and contrasting Biden’s speech on the anniversary of January 6th, 2021 with the responses from various Republican defenders of Trump. Both pulled into focus the effect of semantic and pragmatic framing as part of the poetic and political processes, respectively. Sorry, Muriel, I just compared your work to the slow boil of democracy.

Reaching in interlaced gods, animals, and men.
There is no background. The figures hold their peace
In a web of movement. There is no frustration,
Every gesture is taken, everything yields connections.

There is a theory about how language works that I’ve discussed here before. In this theory, from Donald Davidson primarily, the meanings of words and phrases are tied directly to a shared interrogation of what each person is trying to convey. Imagine a child observing a dog while a parent says “dog” and uses the word fairly consistently across several different breeds that are presented to the child. The child may overuse the word, calling a cat a dog at some point, whereupon the parent corrects the child with “cat” and the child proceeds along through this interrogatory process, triangulating in on the meaning of dog versus cat. Triangulation is Davidson’s term, reflecting three parties: two people and the thing or idea they are discussing. In the case of human children, we also know that there are some innate preferences the child will apply during the triangulation process, like preferring “whole object” semantics to atomized ones, and assuming different words mean different things even when applied to the same object: so “canine” and “dog” must refer to the same object in slightly different ways since they are differing words, and indeed they do: dog IS-A canine but not vice-versa.…

Flooding the Mystery Zone with Cynicism

I just finished planting one of my two urban garden plots here in Southern New Mexico. The circles had been left unattended and later covered with weed-control fabric that I topped with rock a few years ago when I visited from our Arizona home and discovered a vexing and disturbing collection of items buried in the soil. There was a child’s ball, a partially melted white candle, some marbles, a variety of small bones and strange animal remains, indeterminate masses of red and brown, unusual feces, and large pork chop bones. A shrine to strange, ancient deities? The remains of an ancient civilization? Our security camera coverage and the gates and fencing ruled out human activity. So we were left with wild animals, specifically gray foxes with long bushy tails that appear integrated into our little downtown community. We see them on the cameras early in the morning hours, typically, and they do some rather odd things, so the notion that they were collecting interesting items and burying them did not seem unreasonable. We also observed one fox flipping a piece of torn paper plate in the air in front of an unimpressed cat crouching nearby. Foxes will sometimes do similar jumping behavior as a method for mesmerizing their prey, but why bury a melted candle? Perhaps it smelled just enough like food that the fox thought it might come in handy during lean times later. And the child’s toy ball? Plastic odors might also resemble food. Maybe.

The New Mexico foxes, skunks, raccoons, and, I’m informed, some formerly pet coatimundi that wander in the area (but we’ve never seen), as well as the javelina, coyotes, deer, bobcats, and foxes around our Arizona forest home, are certainly influential in my Tusker Long project that tries to tackle an alien world where the worker slave animals have broken from their chains of servitude and simplicity to dominate society and come to grips with their own limits, prejudices, and historical animosities (perfectly wrong word, that).…

Distributed Contexts in the Language Game

The meaning of words and phrases can be a bit hard to pin down. Indeed, the meaning of meaning itself is problematical. I can point to a dictionary and say, well, there is where we keep the meanings of things, but that is just a record of the way in which we use the language. I’m personally fond of a kind of philosophical perspective on this matter of meaning that relies on a form of holism. That is, words and their meanings are defined by our usages of them, our historical interactions with them in different contexts, and subtle distinctive cues that illuminate how words differ and compare. Often, but not always, the words are tied to things in the world, as well, and therefore have a fastness that resists distortions and distinctions.

This is, of course, a critical area of inquiry when trying to create intelligent machines that deal with language. How do we imbue the system with meaning, represent it within the machine, and apply it to novel problems that show intelligent behavior? In approaching the problem, we must therefore be achieving some semblance of intelligence in a fairly rigorous way since we are simulating it with logical steps.

The history of philosophical and linguistic interest in these topics is fascinating, ranging from Wittgenstein’s notion of a language game that builds up rules of use to Firth’s expansion to formalization of collocation of words as critical to meaning. In artificial intelligence, this concept of collocation has been expanded further to include interchangeability of contexts. Thus, boat and ship occur in more similar contexts than boat and bank.
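The boat/ship/bank contrast can be made concrete with a small sketch: count the words that appear near each target word, then compare the resulting count vectors. Everything here is an assumption for illustration, including the six-sentence corpus, the two-word window, and the function names; real systems work from very large text collections.

```python
import math
from collections import Counter

# Tiny made-up corpus; a real system would use a large text collection.
CORPUS = [
    "the boat sailed across the harbor",
    "the ship sailed across the harbor",
    "the boat docked at the pier",
    "the ship docked at the pier",
    "the bank approved the loan",
    "the bank raised the interest rate",
]

def context_vector(target, sentences, window=2):
    """Count the words that appear within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            # Every token in the window except the target occurrence itself.
            counts.update(tokens[j] for j in range(lo, hi) if j != i)
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

boat = context_vector("boat", CORPUS)
ship = context_vector("ship", CORPUS)
bank = context_vector("bank", CORPUS)

print("boat vs ship:", cosine(boat, ship))
print("boat vs bank:", cosine(boat, bank))
```

On this toy corpus the boat/ship score comes out higher than the boat/bank score, because boat and ship share neighbors like "sailed" and "docked" while bank shares only "the".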

A general approach to acquiring these contexts is based on the idea of dimensionality reduction in various forms.…

Intelligent Borrowing

There has been a continuous bleed of biological, philosophical, linguistic, and psychological concepts into computer science since the 1950s. Artificial neural networks were inspired by real ones. Simulated evolution was designed around metaphorical patterns of natural evolution. Philosophical, linguistic, and psychological ideas transferred as knowledge representation and grammars, both natural and formal.

Since computer science is a uniquely synthetic kind of science and not quite a natural one, borrowing and applying metaphors seems to be part of the normal mode of advancement in this field. There is a purely mathematical component to the field in the fundamental questions around classes of algorithms and what is computable, but there are also highly synthetic issues that arise from architectures that are contingent on physical realizations. Finally, the application to simulating intelligent behavior relies largely on three separate modes of operation:

  1. Hypothesize about how intelligent beings perform such tasks
  2. Import metaphors based on those hypotheses
  3. Given initial success, use considerations of statistical features and their mappings to improve on the imported metaphors (and, rarely, improve with additional biological insights)

So, for instance, we import a simplified model of neural networks as connected sets of weights representing some kind of variable activation or inhibition potentials combined with sudden synaptic firing. Abstractly we already have an interesting kind of transfer function that takes a set of input variables and has a nonlinear mapping to the output variables. It’s interesting because being nonlinear means it can potentially compute very difficult relationships between the input and output.

But we see limitations, immediately, and these are observed in the history of the field. For instance, if you just have a single layer of these simulated neurons, the system isn’t fundamentally complex enough to compute any complex functions, so we add a few layers and then more and more.…
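The classic illustration of the single-layer limitation is exclusive-or (XOR). The sketch below assumes a simple threshold unit as the simulated neuron and uses hand-picked weights rather than trained ones; the names and the weight grid are hypothetical. The grid search is only a demonstration, not a proof, though the underlying fact (XOR is not linearly separable) is well established.

```python
from itertools import product

XOR_CASES = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def step(x: float) -> int:
    """Threshold activation: the unit fires (1) once its input is positive."""
    return 1 if x > 0 else 0

def neuron(inputs, weights, bias):
    """One simulated neuron: weighted sum of inputs followed by a threshold."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def single_unit_solves_xor(grid):
    """Try every (w1, w2, bias) combination on the grid with a single unit."""
    for w1, w2, bias in product(grid, repeat=3):
        if all(neuron([a, b], [w1, w2], bias) == out
               for (a, b), out in XOR_CASES.items()):
            return True
    return False

def xor_two_layer(a, b):
    """Two hidden units plus one output unit, with hand-picked weights."""
    h_or = neuron([a, b], [1, 1], -0.5)           # fires if a OR b
    h_and = neuron([a, b], [1, 1], -1.5)          # fires if a AND b
    return neuron([h_or, h_and], [1, -1], -0.5)   # OR but not AND

GRID = [x / 2 for x in range(-8, 9)]  # -4.0 to 4.0 in steps of 0.5
print("single unit can do XOR:", single_unit_solves_xor(GRID))
print("two layers:", {case: xor_two_layer(*case) for case in XOR_CASES})
```

No setting of one unit's weights reproduces the XOR table, while adding a hidden layer of just two units does, which is the motivation for stacking layers.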

The Twin Earth Dissonance Conspiracy

I came of age with some of the mid-to-late 20th century literature that took conspiracies as truss work for calculated paranoia, from Pynchon’s Gravity’s Rainbow to Philip K. Dick’s identity shuffling, and on to the obscurely psychedelic Illuminati books by Robert Shea and Robert Anton Wilson. They were undoubtedly influenced by the dirty tricks and mind control fantasies and realities of the Cold War, from thallium and LSD poisoning plots against Fidel Castro to the Manchurian Candidate and John Birchers; from Dr. Strangelove to ratfucking in the Nixon-era Republican Party.

The fiction paralleled and mimicked those realities but it was also infused with a kind of magical realism where the ideas permeated through the characters in a nexus of paranoia and fantasy. The reader was admitted to eccentric ways of structuring the history of the world and the motives of unseen forces acting through organizations, governments, and powerful people.

While endlessly fun, the fictional forms were also an inoculation: no mundane conspiracy could possibly capture that pulse of inside knowledge of a mystic firmament of lies and outlandish goals canopied above our earth-chained heads.

But here I am again, though much less amused and more fearful.

I think I read ten different reporting and opinion pieces today on the topic of Marjorie Taylor Greene, the shock-curiosity of the day who amplified QAnon, Jewish space lasers, political assassination fantasies, and likely a range of yet-to-be-discovered subjects of scorn and ridicule. Most analysts agree that such fantastical and angry ideas are methods for manipulating gullible people. They are tools for the acquisition of power over others.

The whole project feels like an alternative reality so late in America’s evolution, like we’ve transitioned to a Counter-Earth or Bizarro Htrae or Nabokov’s Antiterra.…

Type 2 Modular Cognitive Responsibility for a New Year

I’m rebooting a startup that I had set aside a year ago. I’ve had some recent research and development advances that make it again seem worth pursuing. Specifically, the improved approach uses a deep learning decision-making filter of sorts to select among natural language generators based on characteristics of the interlocutor’s queries. The channeling to the best generator uses word and phrase cues, while the generators themselves are a novel deep learning framework that integrates ontologies about specific domain areas or motives of the chatbot. Some of the response systems involve more training than others. They are deeper and have subtle goals in responding to the query. Others are less nuanced and just engage in non-performative casual speech.

In social and cognitive psychology there is some recent research that bears a resemblance to this and also is related to contemporary politics and society. Well, cognitive modularity at the simplest is one area of similarity. But within the scope of that is the Type 1/Type 2 distinction, or “fast” versus “slow” thinking. In this “dual process” framework decision-making may be guided by intuitive Type 1 thinking that relates to more primitive, older evolutionary modules of the mind. Type 1 evolved to help solve survival dilemmas that require quick resolution. But inferential reasoning developed more slowly and apparently fairly late for us, with the impact of modern education strengthening the ability of these Type 2 decision processes to override the intuitive Type 1 decisions.

These insights have been applied in remarkably interesting ways in trying to understand political ideologies, moral choices, and even religious identity. For instance, there is some evidence that conservative political leanings correlate more with Type 1 processes.…

One Shot, Few Shot, Radical Shot

Exunoplura is back up after a sad excursion through the challenges of hosting providers. To be blunt, they mostly suck. Between systems that just don’t work right (SSL certificate provisioning in this case) and bad to counterproductive support experiences, it’s enough to make one want to host it oneself. But hosting is mostly, as they say of war, long boring periods punctuated by moments of terror as things go frustratingly sideways. But we are back up again after two hosting provider side-trips!

Honestly, I’d like to see an AI agent effectively navigate through these technological challenges. Where even human performance is fleeting and imperfect, the notion that an AI could learn how to deal with the uncertain corners of the process strikes me as currently unthinkable. But there are some interesting recent developments worth noting and discussing in the journey towards what is named “general AI” or a framework that is as flexible as people can be, rather than narrowly tied to a specific task like visually inspecting welds or answering a few questions about weather, music, and so forth.

First, there is the work by the OpenAI folks on massive language models being tested against one-shot or few-shot learning problems. In each of these learning problems, the number of presentations of the training data cases is limited, rather than presenting huge numbers of exemplars and “fine tuning” the response of the model. What is a language model? Well, it varies across different approaches, but it is typically a weighted context of words of varying length, with the weights reflecting the probabilities of those words in those contexts over a massive collection of text corpora. For the OpenAI model, GPT-3, the total number of parameters (learned weights rather than explicit word/context counts) is an astonishing 175 billion, with 45 TB of text used to train the model.…
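The "weighted context" description can be illustrated with the simplest possible language model: a table of how often each word follows a one-word context. This is only a count-based sketch with a made-up sentence and hypothetical function names. GPT-3 itself is a neural network whose parameters are learned weights, so the analogy is loose.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count how often each word follows each one-word context."""
    tokens = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, context):
    """Turn the raw counts for one context into probabilities."""
    following = counts.get(context, Counter())
    total = sum(following.values())
    return {w: c / total for w, c in following.items()} if total else {}

def parameter_count(counts):
    """Number of stored (context, word) entries: this toy model's 'parameters'."""
    return sum(len(following) for following in counts.values())

# Made-up training text for illustration only.
TEXT = "the cat sat on the mat and the cat slept"
model = train_bigrams(TEXT)
probs = next_word_probs(model, "the")

print(probs)
print(parameter_count(model))
```

Even this ten-word text produces several context entries; longer contexts and a large corpus make the table grow very quickly, which gives some feel for why model sizes are quoted in the billions.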