Uncertainty, Murder, and Emergent Free Will

I’ll jump directly into my main argument, stating only the basic premise: if determinism holds, none of our actions could have been otherwise, and there is no “libertarian” free will.

Let’s construct a robot (R) that has a decision-making apparatus (DM), some sensors (S) for collecting impressions about our world, and a memory (M) of all those impressions and of DM’s past decisions. DM is pretty much an IF-THEN arrangement but has one unique feature: subroutines that generate new IF-THENs by taking existing rules and randomly recombining them, with variation. This might be done by simply snipping rules apart at their logical operations (“blue AND wings AND small => bluejay at 75%” can be pulled apart into “blue AND wings” and “wings AND small,” and those fragments combined with fragments of other such rules). This generative subroutine (GS) then scores the novel IF-THENs by comparing them to the recorded history contained in M as well as to current sensory impressions, and keeps the new rule that scores best, or the top few if they score closely. The scoring methodology might combine coverage and fidelity to the impressions and/or the recalled action/impression pairs.
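A minimal sketch of what GS might look like in code may help make this concrete. Everything here is illustrative rather than specified by the argument: the rule shape (a conjunction of feature tokens plus a label and confidence), the snip/recombine helpers, and the coverage-times-fidelity scoring are all my assumptions.

```python
import random
from itertools import combinations

# Illustrative rule shape: (frozenset of conditions, label, confidence),
# e.g. (frozenset({"blue", "wings", "small"}), "bluejay", 0.75).

def snip(rule):
    """Pull a conjunctive rule apart at its AND operations,
    yielding every strictly smaller conjunction as a fragment."""
    conditions, label, conf = rule
    for size in range(1, len(conditions)):
        for subset in combinations(sorted(conditions), size):
            yield (frozenset(subset), label, conf)

def recombine(rules, n_variants=10):
    """Randomly splice fragments of existing rules into novel IF-THENs.
    Taking the first fragment's label and averaging confidences is a
    crude placeholder, not a considered design."""
    fragments = [f for r in rules for f in snip(r)]
    variants = []
    for _ in range(n_variants):
        a, b = random.sample(fragments, 2)
        variants.append((a[0] | b[0], a[1], (a[2] + b[2]) / 2))
    return variants

def score(rule, memory):
    """Score a rule against the impressions recorded in M: coverage
    (how much of memory its conditions match) times fidelity (how often
    the matched impressions carried the rule's label)."""
    conditions, label, _ = rule
    matches = [m for m in memory if conditions <= m["features"]]
    if not matches:
        return 0.0
    coverage = len(matches) / len(memory)
    fidelity = sum(m["label"] == label for m in matches) / len(matches)
    return coverage * fidelity

def generative_subroutine(rules, memory, keep=3):
    """GS: generate variant rules, score them against M, keep the best few."""
    candidates = recombine(rules)
    return sorted(candidates, key=lambda r: score(r, memory), reverse=True)[:keep]
```

Here memory would be a list of recorded impressions such as `{"features": {"blue", "wings", "small"}, "label": "bluejay"}`; current sensory impressions could simply be appended to that list before scoring.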

Now this is all quite deterministic. I mentioned randomness, but we can produce pseudo-random number generators that are good enough, or even rely on a small electronic circuit that amplifies thermodynamic noise to get something “truly” random. But really we could just substitute an algorithm that checks every possible reorganization and scores them all, shelving the randomness component and alleviating any concern that we are smuggling randomness into our later construct of free agency.
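If we wanted to shelve the randomness component entirely, the random sampler in the sketch above could be replaced by exhaustive enumeration, along these lines (again hypothetical, building on the illustrative snip() from the previous sketch):

```python
from itertools import combinations

def exhaustive_recombine(rules):
    """Deterministic substitute for recombine(): enumerate every
    pairwise combination of fragments so GS can score them all,
    with no random sampling anywhere in the loop."""
    fragments = [f for r in rules for f in snip(r)]
    return [
        (a[0] | b[0], a[1], (a[2] + b[2]) / 2)  # same crude merge as before
        for a, b in combinations(fragments, 2)
    ]
```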

Now let’s add a rule to DM that when R perceives it has been treated unfairly it might murder the human being who treated it that way…

Inferred Modular Superparrots

The buzz about ChatGPT and related efforts has been surprisingly resistant to the standard deflationary pressure of the Gartner hype cycle. Quantum computing definitely fizzled, but now appears to be moving toward the plateau of productivity with recent expansions in the number of practical qubits available from IBM and Origin in China, as well as additional government funding driven by national security interests and fears. But ChatGPT attracted more sustained attention because people can play with it easily, without needing to understand something like Shor’s algorithm for factoring integers. Instead, you just feed it a prompt and are amazed that it writes so well. And the related image generators are delightful and may represent a true displacement of creative professionals even at this early stage, with video hallucinators evolving rapidly too.

But are Large Language Models (LLMs) like ChatGPT doing much more than stitching together recorded fragments of text ingested from an internet-scale corpus? Are they inferring patterns in any way that goes beyond being stochastic parrots? And why would scaling up a system result in qualitatively new capabilities, if there are any at all?

Some new work covered in Quanta Magazine makes some intriguing suggestions that there is a bit more going on in LLMs, although the subtitle contains the word “understanding,” which I think is premature. At its heart is the idea that as networks scale up under ordering rules that are not highly uniform or correlated, they tend to break up into collections of distinct subnetworks (substitute “graphs” for “networks” if you are a specialist). The theory, then, is that ingesting a sufficient magnitude of text into a sufficiently large network, together with the error minimization involved in tuning that network to match output to input, also segregates groupings that the Quanta author and researchers at Princeton and DeepMind refer to as skills…
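The graph-theoretic intuition can be seen in a toy example. This is my illustration, not the researchers’ model: the node counts, the non-uniform attachment rule, and the use of networkx community detection are all invented for the sketch.

```python
import random
import networkx as nx
from networkx.algorithms import community

# Toy version of the claim: link "text" nodes to the handful of "skill"
# nodes each text exercises, using a non-uniform (locally biased) rule,
# and distinct skill groupings emerge as the graph scales.
random.seed(0)
skills = [f"skill_{i}" for i in range(30)]
texts = [f"text_{j}" for j in range(300)]

G = nx.Graph()
for t in texts:
    # Each text draws 1-3 skills biased toward a random "topic"
    # neighborhood rather than uniformly across all skills.
    center = random.randrange(len(skills))
    for _ in range(random.randint(1, 3)):
        s = skills[(center + random.randint(-2, 2)) % len(skills)]
        G.add_edge(t, s)

# Community detection recovers distinct clusters of co-occurring skills.
groups = community.greedy_modularity_communities(G)
print(f"{len(groups)} skill groupings in a graph of {G.number_of_nodes()} nodes")
```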