Searching for Emergence

I have a longstanding interest in the concept of emergence as a way of explaining a wide range of human ideas and the natural world. We have this incredible algorithm of evolutionary change that creates novel life forms. We have, according to mainstream materialist accounts in philosophy of mind, a consciousness that may have a unique ontology (an inventory of what really exists) of subjective experiencers, qualia, and intentionality, yet that is also somehow emergent from the meat of the brain (or supervenes on it, or is an epiphenomenon of it, etc.). That emergence may be weak or strong depending on the account: strong emergence means something like a genuinely new thing being added to the ontology, while weak emergence means something like we simply don't yet know enough to reduce the concept to its underlying causal components. If we did, it would not really be anything new in this grammar of ontological necessity.

There is also the problem of computational irreducibility (CI) that has been championed by Wolfram. In CI, there are classes of computations whose outcomes cannot be predicted by any simpler algorithm. This seems to open the door to a strong concept of emergence: we have to run the machine to get the outcome; there is no possibility (even in theory!) of reducing the outcome to any lesser approximation. I've brought this up as a defeater of the Simulation Hypothesis, suggesting that the complexity of a simulation cannot be reduced below that of the universe as we see it (assuming perfect coherence in the limit).
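To make the point concrete, here is a minimal sketch of Rule 30, Wolfram's standard exhibit of irreducibility (the Python rendering is mine, not anything from his work beyond the rule itself). The update rule is one line, but so far as anyone knows there is no closed-form shortcut to the state at step n; you simply have to run all n steps.

```python
# A minimal sketch of computational irreducibility, using Wolfram's
# Rule 30 cellular automaton. The rule is trivial to state, yet the
# pattern it produces appears to have no predictive shortcut.

def rule30_step(cells: list[int]) -> list[int]:
    """Apply Rule 30 to one row of cells (boundaries held at 0)."""
    padded = [0] + cells + [0]
    # Rule 30: new cell = left XOR (center OR right)
    return [padded[i - 1] ^ (padded[i] | padded[i + 1])
            for i in range(1, len(padded) - 1)]

row = [0] * 31 + [1] + [0] * 31  # start from a single live cell
for _ in range(20):
    print("".join("#" if c else " " for c in row))
    row = rule30_step(row)
```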

There is also a dual to this idea in algorithmic information theory (AIT) that is worth exploring. In AIT, it is uncomputable to find the shortest Turing machine that generates a given symbol sequence.…
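One concrete consequence of that uncomputability: any real compressor can only ever give an upper bound on a sequence's algorithmic complexity, never the true shortest description. A toy sketch of the idea, with zlib as my arbitrary stand-in compressor:

```python
# Kolmogorov complexity -- the length of the shortest program that
# generates a string -- is uncomputable, but any compressor yields a
# computable upper bound on it.
import random
import zlib

regular = ("01" * 5000).encode()                            # highly patterned
noisy = bytes(random.getrandbits(8) for _ in range(10000))  # incompressible-ish

for name, seq in (("regular", regular), ("random", noisy)):
    bound = len(zlib.compress(seq, 9))
    print(f"{name}: {len(seq)} bytes -> upper bound {bound} bytes")
```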

Uncertainty, Murder, and Emergent Free Will

I'll jump directly into my main argument without stating more than the basic premise: if determinism holds, none of our actions could be otherwise, and there is no "libertarian" free will.

Let's construct a robot (R) that has a decision-making apparatus (DM), some sensors (S) for collecting impressions of our world, and a memory (M) of all those impressions and of DM's past decisions. DM is pretty much an IF-THEN arrangement, but with a unique feature: it has subroutines that generate new IF-THENs by taking existing rules and randomly recombining them with variation. This might be done by simply snipping rules apart at their logical operators (blue AND wings AND small => bluejay at 75% can be pulled apart into "blue AND wings" and "wings AND small", and those fragments combined with pieces of other such rules). This generative subroutine (GS) then scores the novel IF-THENs against the recorded history in M as well as current sensory impressions, keeping the new rule that scores best, or the top few if they score closely. The scoring methodology might combine coverage and fidelity with respect to current impressions and/or recalled action-impression pairs, as in the sketch below.
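Here is a toy sketch of GS. The representation (conditions as feature sets, a coverage-times-fidelity score) and every name in it (`Rule`, `recombine`, `score`, `generative_subroutine`) are my hypothetical choices for illustration, not a specification:

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    conditions: frozenset  # e.g. {"blue", "wings", "small"}
    conclusion: str        # e.g. "bluejay"

def recombine(a: Rule, b: Rule) -> Rule:
    """Snip two rules apart at their condition sets and splice the pieces."""
    part_a = random.sample(sorted(a.conditions), max(1, len(a.conditions) // 2))
    part_b = random.sample(sorted(b.conditions), max(1, len(b.conditions) // 2))
    return Rule(frozenset(part_a) | frozenset(part_b),
                random.choice([a.conclusion, b.conclusion]))

def score(rule: Rule, memory: list) -> float:
    """Coverage (how often the rule fires on M) times fidelity (how often it's right)."""
    fires = [(feats, label) for feats, label in memory if rule.conditions <= feats]
    if not fires:
        return 0.0
    coverage = len(fires) / len(memory)
    fidelity = sum(label == rule.conclusion for _, label in fires) / len(fires)
    return coverage * fidelity

def generative_subroutine(pool: list, memory: list, trials: int = 100) -> Rule:
    """GS: breed candidate rules from the pool and keep the best scorer."""
    candidates = [recombine(*random.sample(pool, 2)) for _ in range(trials)]
    return max(candidates, key=lambda r: score(r, memory))

# Tiny usage example against a two-entry memory M.
memory = [({"blue", "wings", "small"}, "bluejay"),
          ({"red", "wings", "small"}, "cardinal")]
pool = [Rule(frozenset({"blue", "wings", "small"}), "bluejay"),
        Rule(frozenset({"red", "wings", "small"}), "cardinal")]
print(generative_subroutine(pool, memory))
```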

Now, this is all quite deterministic. I mentioned randomness, but we can produce pseudo-random number generators that are good enough, or even rely on a small electronic circuit that amplifies thermodynamic noise to get something "truly" random. But really we could just substitute an algorithm that enumerates every possible reorganization and scores them all, shelving the randomness component entirely and alleviating any concern that we are smuggling randomness into our later construct of free agency.
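Continuing the sketch above (and reusing its `Rule` and `score` definitions), a hypothetical exhaustive variant in which nothing random remains:

```python
from itertools import combinations

def all_nonempty_subsets(items):
    """Every non-empty subset of a condition set, in deterministic order."""
    items = sorted(items)
    for k in range(1, len(items) + 1):
        yield from (set(c) for c in combinations(items, k))

def exhaustive_subroutine(pool, memory):
    """Deterministic GS: score every splice of every pair of rules."""
    best, best_score = None, -1.0
    for a, b in combinations(pool, 2):
        for part_a in all_nonempty_subsets(a.conditions):
            for part_b in all_nonempty_subsets(b.conditions):
                for conclusion in (a.conclusion, b.conclusion):
                    cand = Rule(frozenset(part_a | part_b), conclusion)
                    if (s := score(cand, memory)) > best_score:
                        best, best_score = cand, s
    return best
```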

Now let's add a rule to DM: when R perceives it has been treated unfairly, it might murder the human being who treated it that way.…

Inferred Modular Superparrots

The buzz about ChatGPT and related efforts has been surprisingly resistant to the standard deflationary pressure of the Gartner hype cycle. Quantum computing definitely fizzled but appears to be moving toward the plateau of productivity, with recent expansions in the number of practical qubits available from IBM and Origin Quantum in China, as well as additional government funding driven by national-security interests and fears. But ChatGPT has attracted more sustained attention because people can play with it easily, without needing to understand something like Shor's algorithm for factoring integers. Instead, you just feed it a prompt and are amazed that it writes so well. And related image generators are delightful and may represent a true displacement of creative professionals even at this early stage, with video hallucinators evolving rapidly too.

But are Large Language Models (LLMs) like ChatGPT doing much more than stitching together recorded fragments of text ingested from an internet-scale corpus? Are they inferring patterns in any way beyond being stochastic parrots? And why would scaling up a system result in qualitatively new capabilities, if there are any at all?

Some new work covered in Quanta Magazine offers intriguing suggestions that there is a bit more going on in LLMs, although the subtitle contains the word "understanding," which I think is premature. At heart is the idea that as networks scale up, given ordering rules that are not highly uniform or correlated, they tend to break up into collections of distinct subnetworks (substitute "graphs" for "networks" if you are a specialist). The theory, then, is that ingesting a sufficient magnitude of text into a sufficiently large network, together with the error minimization involved in tuning that network to match output to input, also segregates groupings that the Quanta author and researchers at Princeton and DeepMind refer to as skills.…
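The underlying graph claim can be illustrated with a toy model: a random bipartite graph linking "texts" to the "skills" they exercise. This is my own illustration of the general percolation-style phenomenon, not the Princeton/DeepMind model itself, and it assumes the networkx library is available:

```python
import random
import networkx as nx  # assumption: networkx is installed

def skill_graph(n_texts: int, n_skills: int, k: int = 2, seed: int = 0) -> nx.Graph:
    """Random bipartite graph: each 'text' node links to k random 'skill' nodes."""
    rng = random.Random(seed)
    G = nx.Graph()
    G.add_nodes_from(("skill", s) for s in range(n_skills))
    for t in range(n_texts):
        for s in rng.sample(range(n_skills), k):
            G.add_edge(("text", t), ("skill", s))
    return G

# As the corpus grows, isolated fragments knit into larger but still
# distinct subnetworks -- a percolation-style transition with scale.
for n_texts in (10, 50, 250, 1250):
    G = skill_graph(n_texts, n_skills=500)
    sizes = sorted((len(c) for c in nx.connected_components(G)), reverse=True)
    print(f"{n_texts:5d} texts -> {len(sizes):3d} subnetworks, largest has {sizes[0]} nodes")
```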