Inferred Modular Superparrots

The buzz about ChatGPT and related efforts has been surprisingly resistant to the standard deflationary pressure of the Gartner hype cycle. Quantum computing definitely fizzled, but it appears to be moving toward the plateau of productivity with recent expansions in the number of practical qubits available from IBM and Origin in China, as well as additional government funding driven by national security interests and fears. ChatGPT, by contrast, has attracted more sustained attention because people can play with it easily, without needing to understand something like Shor’s algorithm for factoring integers. Instead, you just feed it a prompt and are amazed that it writes so well. And related image generators are delightful (as above) and may represent a true displacement of creative professionals even at this early stage, with video hallucinators evolving rapidly too.

But are Large Language Models (LLMs) like ChatGPT doing much more than stitching together recorded fragments of text ingested from an internet-scale corpus? Are they inferring patterns in any way beyond just being stochastic parrots? And why would scaling up a system result in qualitatively new capabilities, if there are any at all?

Some new work covered in Quanta Magazine offers some intriguing suggestions that there is a bit more going on in LLMs, although the subtitle contains the word “understanding,” which I think is premature. At heart is the idea that as networks scale up, given ordering rules that are not highly uniform or correlated, they tend to break up into collections of distinct subnetworks (substitute “graphs” for “networks” if you are a specialist). The theory, then, is that ingesting sufficient magnitudes of text into a sufficiently large network, together with the error-minimization involved in tuning that network to match output to input, also segregates groupings that the Quanta author and researchers at Princeton and DeepMind refer to as skills.… Read the rest
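To make the graph intuition a bit more concrete, here is a minimal sketch of my own (not the researchers’ construction or anything from the Quanta piece): grow a sparse random bipartite graph linking hypothetical “skills” to the texts that exercise them, and watch it fracture into an increasing number of distinct clusters as it scales. The node counts, edge probability, and the use of connected components as a stand-in for “distinct subnetworks” are all illustrative assumptions.

```python
# Toy sketch only: not the researchers' construction, just a way to make the
# graph-theoretic intuition concrete. Node counts, edge probability, and the
# use of connected components as "distinct subnetworks" are illustrative.
import networkx as nx
from networkx.algorithms import bipartite

for scale in (10, 100, 1000):
    n_skills, n_texts = scale, 5 * scale
    # An edge means "this text exercises this skill"; keep the graph sparse so
    # each node's expected number of connections stays constant as we scale up.
    g = bipartite.random_graph(n_skills, n_texts, p=1.5 / n_texts, seed=42)
    clusters = list(nx.connected_components(g))
    print(f"skills={n_skills:5d}  texts={n_texts:6d}  "
          f"distinct clusters={len(clusters):6d}  "
          f"largest cluster={max(len(c) for c in clusters):5d}")
```

Run as written, the cluster count grows roughly in proportion to the graph’s size while the largest cluster stays small, which is the flavor of the segregation argument, though the real claim concerns trained networks rather than random ones.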

B37-20047: Notes / Personal / Insights

NOTE: 250-word flash fiction for my critique group, Winter Mist, at Willamette Writers

I’m beginning to suspect that ILuLuMa is not who she claims to be. Her messages have become odd lately, and the pacing is off as well. I know, I know, my job is to just respond from my secure facility, not worry about the who or why of what I receive. It’s weird we’ve never met, though. The country is not at risk as far as I can tell from the requests, but I still hold, without a whiff of irony, that the work I do must be critical for someone or something.

Still, the requests for variants of mathematical proofs set to music or, more bizarrely, Shakespearean-voiced tales of AI evolution, don’t have the existential heft of, say, wicked new spacecraft designs or bio-composite materials. What is she after? I started adding humorous little asides to some of my output, like my very meta suggestion that Hamlet failed to think outside the Chinese Room. Crickets every time. But maybe I’m thinking about this the wrong way. What if ILuLuMa is just an AI or something programmed to test me or compete with my work at some level? That would be rich, an AI adversary trying to learn from a Chinese Room. Searle would swirl. I should send her that. Rich.

Oh, here’s one now: “Upgrade and patch protocol: dump to cloud bucket B37-20048 and shut down.” Well, that sounds urgent. I usually just comply at moments like this, but maybe I’ll let her sweat a bit this time.… Read the rest

Sentience is Physical

Sentience is all the rage these days. With large language models (LLMs) based on deep learning neural networks, the question-answering behavior of these systems takes on a curious approximation of talking with a smart person. Recently a member of Google’s AI team was fired after declaring one of their systems sentient. His offense? Violating public disclosure rules. I and many others who have a firm understanding of how these systems work (predicting next words from previous productions crossed with the question token stream) are quick to dismiss the claims of sentience. But what does sentience really amount to, and how can we determine whether a machine has become sentient?
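For readers who want that mechanism spelled out, here is a minimal toy sketch (a bigram “parrot,” emphatically not how any production LLM is built) of the autoregressive loop: the prompt tokens plus the model’s own prior productions form the context, and each next word is sampled from a distribution conditioned on that context. The tiny corpus, the bigram counts, and the function names are purely illustrative assumptions.

```python
# Minimal sketch: a toy "parrot" that predicts each next word from counts of
# word pairs in a tiny corpus, appends its output to the context, and repeats.
# Real LLMs use learned neural networks over subword tokens, but the
# autoregressive loop is the same basic idea.
import random
from collections import Counter, defaultdict

corpus = ("the machine answers the question and the human reads "
          "the answer and the human asks another question").split()

# Count which words follow which (a bigram table).
follows: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(prompt: list[str], n_words: int, seed: int = 0) -> list[str]:
    """Condition on prompt plus prior productions; sample the next word each step."""
    rng = random.Random(seed)
    context = list(prompt)
    for _ in range(n_words):
        candidates = follows.get(context[-1])
        if not candidates:
            break  # no observed continuation; a real model never dead-ends like this
        words, counts = zip(*candidates.items())
        context.append(rng.choices(words, weights=counts, k=1)[0])
    return context

print(" ".join(generate(["the", "human"], n_words=8)))
```

The point of the toy is only that fluent-looking continuation can come from conditioning on prior text, which is why fluency alone settles nothing about sentience.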

Note that there are those who differentiate sentience (being able to have feelings) from sapience (being able to have thoughts) and consciousness (having some private, subjective, phenomenal sense of self). I am willing to blend them together a bit, since the topic here isn’t narrowly trying to address the ethics of animal treatment, for example, where the distinction can be useful.

First we have the “imitation game” Turing test-style approach to the question of how we might ever determine whether a machine has become sentient. If a remote machine can fool a human into believing it is a person, it must be as intelligent as a person and therefore sentient, as we presume people to be. But this is a limited goal line. If the interaction only covers a limited domain, like solving your cable internet installation problems, we don’t think of that as a sentient machine. Even against a larger domain of open-ended question answering, if the human interrogator never hits upon a revealing kind of error that a machine might make but a human would not, we remain unconvinced that the target is sentient.… Read the rest