The Twin Earth Dissonance Conspiracy

I came of age with some of the mid-to-late 20th century literature that took conspiracies as truss work for calculated paranoia, from Pynchon’s Gravity’s Rainbow to Philip K. Dick’s identity shuffling, and on to the obscurely psychedelic Illuminati books by Robert Shea and Robert Anton Wilson. They were undoubtedly influenced by the dirty tricks and mind control fantasies and realities of the Cold War, from thallium and LSD poisoning plots against Fidel Castro to the Manchurian Candidate and John Birchers; from Dr. Strangelove to ratfucking in the Nixon-era Republican Party.

The fiction paralleled and mimicked those realities, but it was also infused with a kind of magical realism in which the ideas permeated the characters in a nexus of paranoia and fantasy. The reader was admitted to eccentric ways of structuring the history of the world and the motives of unseen forces acting through organizations, governments, and powerful people.

While endlessly fun, the fictional forms were also an inoculation: no mundane conspiracy could possibly capture that pulse of inside knowledge of a mystic firmament of lies and outlandish goals canopied above our earth-chained heads.

But here I am again, though much less amused and more fearful.

I think I read ten different news and opinion pieces today on the topic of Marjorie Taylor Greene, the shock-curiosity of the day, who amplified QAnon, Jewish space lasers, political assassination fantasies, and likely a range of yet-to-be-discovered subjects of scorn and ridicule. Most analysts agree that such fantastical and angry ideas are methods for manipulating gullible people. They are tools for the acquisition of power over others.

The whole project feels like an alternative reality so late in America’s evolution, like we’ve transitioned to a Counter-Earth or Bizarro Htrae or Nabokov’s Antiterra.… Read the rest

The Pregnant Machinery of Metaphor

Sylvia Plath has some thoughts about pregnancy in “Metaphors”:

I’m a riddle in nine syllables,
An elephant, a ponderous house,
A melon strolling on two tendrils.
O red fruit, ivory, fine timbers!
This loaf’s big with its yeasty rising.
Money’s new-minted in this fat purse.
I’m a means, a stage, a cow in calf.
I’ve eaten a bag of green apples,
Boarded the train there’s no getting off.

It seems, at first blush, that metaphors have some creative substitutive similarity to the concept they are replacing. We can imagine Plath laboring over this child in nine lines, fitting the pieces together, cutting out syllables like dangling umbilicals, finding each part in a bulging conception, until it was finally born, alive and kicking.

OK, sorry, I’ve gone too far, fallen over a cliff, tumbled down through a ravine, dropped into the foaming sea, where I now bob, like your uncle. Stop that!

Let’s assume that much human creativity occurs through a process of metaphor- or analogy-making. This certainly seems to be the case in aspects of physics when dealing with difficult-to-understand new realms of micro- and macroscopic phenomena, as I’ve noted here. Some creative fields claim a similar basis for their work, with poetry being explicit about the hidden or secret meaning of poems. Moreover, I will also suppose that a similar system operates in creating the networks of semantics by which we understand the meaning of words and their relationships to phenomena external to us, as well as our own ideas. In other words, semantics are a creative puzzle for us.
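
To make that a bit more concrete, here is a toy sketch, not the machinery the post has in mind, of one way a crude semantic network can be bootstrapped from raw text: words that share contexts end up near one another, which is a minimal ingredient for analogy-making. The miniature corpus (echoing the poem’s imagery) and the window size are invented purely for illustration.

```python
# Toy sketch: build tiny "semantic" vectors from word co-occurrence counts
# and compare them with cosine similarity. Corpus and window are invented.
from collections import Counter
from math import sqrt

corpus = [
    "the melon swells on the vine",
    "the loaf rises with yeast",
    "the purse is fat with new minted money",
    "the house holds the growing child",
]

window = 2
vocab = sorted({w for line in corpus for w in line.split()})
vectors = {w: Counter() for w in vocab}

# Count neighbors within a small window to approximate each word's context.
for line in corpus:
    words = line.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if i != j:
                vectors[w][words[j]] += 1

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Words that share contexts sit near each other in this crude semantic space.
print(cosine(vectors["melon"], vectors["loaf"]))
print(cosine(vectors["melon"], vectors["money"]))
```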

What do we know about this system and how can we create abstract machines that implement aspects of it?… Read the rest

One Shot, Few Shot, Radical Shot

Exunoplura is back up after a sad excursion through the challenges of hosting providers. To be blunt, they mostly suck. Between systems that just don’t work right (SSL certificate provisioning, in this case) and bad-to-counterproductive support experiences, it’s enough to make one want to host it oneself. Hosting, though, is mostly, as they say of war, long boring periods punctuated by moments of terror as things go frustratingly sideways. Still, we are back up again after two hosting-provider side-trips!

Honestly, I’d like to see an AI agent effectively navigate through these technological challenges. Where even human performance is fleeting and imperfect, the notion that an AI could learn how to deal with the uncertain corners of the process strikes me as currently unthinkable. But there are some interesting recent developments worth noting and discussing in the journey toward what is called “general AI,” a framework as flexible as people can be, rather than narrowly tied to a specific task like visually inspecting welds or answering a few questions about weather, music, and so forth.

First, there is the work by the OpenAI folks on massive language models being tested against one-shot or few-shot learning problems. In each of these learning problems, the number of presentations of the training cases is limited, rather than presenting huge numbers of exemplars and “fine-tuning” the response of the model. What is a language model? Well, it varies across different approaches, but typically it is a weighted context of words of varying length, with the weights reflecting the probabilities of those words in those contexts over a massive collection of text corpora. For the OpenAI model, GPT-3, the total number of parameters (the learned weights over those words and contexts) is an astonishing 175 billion, trained on some 45 TB of text.… Read the rest
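
To make the count-based sense of “language model” concrete, here is a minimal bigram sketch. The tiny corpus is invented, and GPT-3 itself is a neural network whose 175 billion parameters are learned weights rather than raw counts like these; this only illustrates the underlying probabilistic idea of a weighted context.

```python
# Minimal bigram language model: parameters are just context/word counts.
from collections import defaultdict

corpus = "the model predicts the next word given the previous word".split()

counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def p(next_word, context):
    """P(next_word | context) estimated from raw counts."""
    total = sum(counts[context].values())
    return counts[context][next_word] / total if total else 0.0

print(p("next", "the"))      # probability of "next" following "the"
print(p("previous", "the"))  # probability of "previous" following "the"
```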

Forever Uncanny

Quanta has a fair roundup of recent advances in deep learning. Most interesting is the recent performance on natural language understanding tests, which is close to or exceeds mean human performance. Inevitably, John Searle’s Chinese Room argument is brought up, though the author of the Quanta article suggests that inferring the Chinese translational rule book from the data itself is slightly different from the original thought experiment. In the Chinese Room there is a person who knows no Chinese but has a collection of translational reference books. She receives texts through a slot and dutifully looks up the translation of the text and passes out the result. “Is this intelligence?” is the question, and it serves as a challenge to the Strong AI hypothesis. With statistical machine translation methods (and their alternative mechanistic implementation, deep learning), the rule books have been inferred by looking at translated texts (“parallel” texts, as we say in the field). A large enough corpus of parallel texts yields greater coverage of translated variants, as well as some inference of pragmatic issues in translation and corner cases.
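
A cartoon of that inference from parallel texts, assuming nothing beyond co-occurrence counting: score candidate translations by how tightly source and target words co-occur across aligned sentence pairs. Real statistical MT (the IBM alignment models and their successors) is far more elaborate, and the toy sentence pairs here are invented for illustration.

```python
# Infer a crude translation lexicon from aligned sentence pairs using a
# Dice-style co-occurrence score.
from collections import Counter, defaultdict

parallel = [
    ("the cat sleeps", "le chat dort"),
    ("the dog sleeps", "le chien dort"),
    ("the cat eats", "le chat mange"),
]

cooc = defaultdict(Counter)
src_freq, tgt_freq = Counter(), Counter()
for src, tgt in parallel:
    src_words, tgt_words = set(src.split()), set(tgt.split())
    src_freq.update(src_words)
    tgt_freq.update(tgt_words)
    for s in src_words:
        for t in tgt_words:
            cooc[s][t] += 1

def best_translation(word):
    """Target word with the strongest co-occurrence association."""
    score = lambda t: cooc[word][t] ** 2 / (src_freq[word] * tgt_freq[t])
    return max(cooc[word], key=score) if cooc[word] else None

print(best_translation("cat"))     # -> "chat"
print(best_translation("sleeps"))  # -> "dort"
```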

As a practical matter, it should be noted that modern, professional translators often use translation memory systems that contain idiomatic—or just challenging—phrases that they can reference when translating new texts. The understanding resides in the original translator’s head, we suppose, and in the correct application of the rule to the new text by checking for applicability according to, well, some other criteria that the translator brings to bear on the task.
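
A rough sketch of what the lookup step in such a translation memory might look like, using a generic fuzzy string match; the memory entries and the threshold are invented for illustration and not drawn from any particular tool.

```python
# Retrieve the stored segment (and its translation) with the best fuzzy match.
from difflib import SequenceMatcher

memory = {
    "it's raining cats and dogs": "il pleut des cordes",
    "the meeting is postponed until Monday": "la réunion est reportée à lundi",
}

def tm_lookup(segment, threshold=0.6):
    """Return (match, translation, score) for the best fuzzy match, or None."""
    best = max(
        ((src, tgt, SequenceMatcher(None, segment, src).ratio())
         for src, tgt in memory.items()),
        key=lambda item: item[2],
    )
    return best if best[2] >= threshold else None

print(tm_lookup("the meeting is postponed until Friday"))
```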

In the General Language Understanding Evaluation (GLUE) tests described in the Quanta article, the systems are inferring how to answer Wh-style queries (who, what, where, when, and how) as well as how to identify similar texts.… Read the rest

Bereitschaftspotential and the Rehabilitation of Free Will

The question of whether we, as people, have free will or not is both abstract and occasionally deeply relevant. We certainly act as if we have something like libertarian free will, and we have built entire systems of justice around this idea, where people are responsible for choices they make that result in harms to others. But that may be somewhat illusory for several reasons. First, if we take a hard deterministic view of the universe as a clockwork-like collection of physical interactions, our wills are just a mindless outcome of a calculation of sorts, driven by a wetware calculator with a state completely determined by molecular history. Second, there has been, until very recently, some experimental evidence that our decision-making occurs before we achieve a conscious realization of the decision itself.

But this latter claim appears to be without merit, as reported in this Atlantic article. Instead, what was previously believed to be signals of brain activity that were related to choice (Bereitschaftspotential) may just be associated with general waves of neural activity. The new experimental evidence puts the timing of action in line with conscious awareness of the decision. More experimental work is needed—as always—but the tentative result suggests a more tightly coupled pairing of conscious awareness with decision making.

Indeed, this newer experimental result gets closer to my suggested model of how modular systems, combined with perceptual and environmental uncertainty, can produce what is effectively free will (or at least a functional model for a compatibilist position). Jettisoning the Chaitin-Kolmogorov complexity part of that argument and just focusing on the minimal requirements for decision making in the face of uncertainty, we know we need a thresholding apparatus that fires various responses given a multivariate statistical topology.… Read the rest
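
As a toy rendering of that thresholding apparatus (the weights, noise level, and threshold below are invented, and this is not the post’s own model), identical external circumstances can still yield different choices once the perception of the evidence is itself uncertain.

```python
# A thresholding decision unit under perceptual uncertainty: noisy multivariate
# evidence is weighed, and an action fires when the sum crosses a threshold.
import random

def decide(evidence, weights, threshold=1.0, noise=0.3):
    """Return True ("act") if noisy weighted evidence crosses the threshold."""
    total = sum(w * (e + random.gauss(0, noise))
                for w, e in zip(weights, evidence))
    return total >= threshold

weights = [0.8, 0.5, -0.4]   # how much each evidence channel matters
evidence = [0.9, 0.6, 0.2]   # the same external circumstances each time

# Identical inputs, yet the outcome can differ run to run because the
# perceptual estimate of the evidence is uncertain.
print([decide(evidence, weights) for _ in range(10)])
```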

Deep Learning with Quantum Decoherence

Getting back to metaphors in science, Wojciech Zurek’s so-called Quantum Darwinism is in the news due to a series of experimental tests. In Quantum Darwinism (QD), the collapse of the wave function (more properly, the “extinction” of states) is a result of decoherence from environmental entanglement. There is a kind of replication in QD, where pointer states are multiplied, and then a kind of environmental selection as well. There is no variation per se, however, though some might argue that the pointer states imprinted by the environment are variants of the originals. That makes the metaphor a bit thin at the edges, but it is close enough for the core idea to fit most of the floor plan of Darwinism. Indeed, some champion it as part of a more general model for everything. Even selection among viable multiverse bubbles has a similar feel to it: some survive while others perish.

I’ve been simultaneously studying quantum computing and complexity theories that are getting impressively well developed. Richard Cleve’s An Introduction to Quantum Complexity Theory and John Watrous’s Quantum Computational Complexity are notable in their bridging from traditional computational complexity to this newer world of quantum computing using qubits, wave functions, and even decoherence gates.

Decoherence sucks for quantum computing in general, but there may be a way to make use of it. For instance, an artificial neural network (ANN) also has some interesting Darwinian-like properties to it. The initial weights in an ANN are typically random real values, designed to simulate the relative strength of neural connections. Real neural connections are much more complex than this, showing interesting cyclic behavior, saturating and suppressing based on neurotransmitter availability, and so forth, but assuming just a straightforward pattern of connectivity has allowed for significant progress.… Read the rest
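
A minimal sketch of that random initialization and a single forward pass; the layer sizes and the uniform range are arbitrary choices for the example, not anything prescribed by the post.

```python
# Each connection weight starts as a random real value, crudely standing in
# for the relative strength of a neural connection.
import math
import random

def init_layer(n_inputs, n_outputs, scale=0.1):
    """One weight per input-output connection, drawn uniformly at random."""
    return [[random.uniform(-scale, scale) for _ in range(n_inputs)]
            for _ in range(n_outputs)]

def forward(weights, inputs):
    """A plain weighted sum squashed by a sigmoid, per output unit."""
    return [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
            for row in weights]

layer = init_layer(n_inputs=4, n_outputs=3)
print(forward(layer, [0.2, 0.5, 0.1, 0.9]))
```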

Bullshit, Metaphors, and Political Precision

Given this natural condition of uncertainty in the meaning of words, and their critical role in communication, to say the least, we can certainly expect that as we move away from the sciences towards other areas of human endeavor we have even greater vagueness in trying to express complex ideas. Politics is an easy example. America’s current president is a babbling bullshitter, to use the explanatory framework of the essay On Bullshit, and he is easy to characterize as an idiot, as when he conflates Western liberalism with something going on exclusively in modern California.

In this particular case, we have to track down what “liberal” means and meant at various times, then try to suss out how that meaning is working today. At one time, the term was simply expressive of freedom with minimal government interference. Libertarians still carry a version of that meaning forward, but liberalism also came to mean something akin to a political focus on government spending to right perceived economic and social disparities (to achieve “freedom from want and despair,” via FDR). And then it began to be used as a pejorative related to that same focus.

As linguist John McWhorter points out, abstract ideas—and perhaps especially political ones—are so freighted with their pragmatic and historical background that the best we can say is that we are actively working out what a given term means. McWhorter suggests that older terms like “socialist” are impossible to put to work effectively; a newer term like “progressive” is more desirable because it carries less baggage.

An even stronger case is made by George Lakoff, who claims that central metaphors, looking something like Freudian abstractions, govern political perspectives.… Read the rest

Two Points on Penrose, and One On Motivated Reasoning

Sir Roger Penrose is, without doubt, one of the most interesting polymaths of recent history. Even where I find his ideas fantastical, they are most definitely worth reading and understanding. Sean Carroll’s Mindscape podcast interview with Penrose from early January of this year is a treat.

I’ve previously discussed the Penrose-Hameroff conjectures concerning wave function collapse and their implication of quantum operations in the microtubule structure of the brain. I also used the conjecture in a short story. But the core driver for Penrose’s original conjecture, namely that algorithmic processes can’t explain human consciousness, has always been a claim in search of support. Equally difficult is pushing consciousness into the sphere of quantum phenomena that tend to show random, rather than directed, behavior. Randomness doesn’t clearly relate to the “hard problem” of consciousness, which is about the experience of being conscious.

But take the idea that, because mathematicians can see the truth of statements that Gödel incompleteness blocks from formal proof, our brains must be different from Turing machines or collections of them. Our brains are likely messy and not theorem-proving machines per se, despite operating according to logico-causal processes. Indeed, throw in an active analog to biological evolution based on variation-and-retention of ideas and insights, one that might actually have a bit of pseudo-randomness associated with it, and there is no reason to doubt that we are capable of the kind of system transcendence that Penrose is looking for.

Note that this doesn’t in any way impact the other horn of Penrose-Hameroff concerning the measurement problem in quantum theory, but there is no reason to suspect that quantum collapse is necessary for consciousness. It might flow the other way, though, and Penrose has created the Penrose Institute to look experimentally for evidence about these effects.… Read the rest

Theoretical Reorganization

Sean Carroll of Caltech takes on the philosophy of science in his paper, Beyond Falsifiability: Normal Science in a Multiverse, as part of a larger conversation on modern theoretical physics and experimental methods. Carroll breaks down the problems of Popper’s falsification criterion and arrives at a more pedestrian Bayesian formulation for how to view science. Theories arise and their priors get amplified or deflated; that prior support changes, often for Carroll, for reasons of coherence with other theories and considerations; and, in the best case, the posterior support improves with better experimental data.
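
A toy version of that Bayesian picture, with invented theories and numbers standing in for real priors and likelihoods: rival theories carry prior credences, new data arrives with a likelihood under each theory, and the posteriors shift accordingly.

```python
# Bayes' rule over a small set of rival theories (all values are invented).
priors = {"theory_A": 0.5, "theory_B": 0.3, "theory_C": 0.2}

# P(observed data | theory), e.g., how well an experiment fits each prediction.
likelihoods = {"theory_A": 0.10, "theory_B": 0.40, "theory_C": 0.05}

evidence = sum(priors[t] * likelihoods[t] for t in priors)
posteriors = {t: priors[t] * likelihoods[t] / evidence for t in priors}

for theory, p in sorted(posteriors.items(), key=lambda kv: -kv[1]):
    print(f"{theory}: {p:.3f}")
```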

Continuing with the previous posts’ work on expanding Bayes via AIT considerations, the non-continuous changes to a group of scientific theories that arrive with new theories or data require some better model than just adjusting priors. How exactly does coherence play a part in theory formation? If we treat each theory as a binary string that encodes a Turing machine, then the best theory, inductively, is the shortest machine that accepts the data. But we know that there is no machine that can compute that shortest machine, so there needs to be an algorithm that searches through the state space to try to locate the minimal machine. Meanwhile, the data may be varying and the machine may need to incorporate other machines that help improve the coverage of the original machine or are driven by other factors, as Carroll points out:

We use our taste, lessons from experience, and what we know about the rest of physics to help guide us in hopefully productive directions.

The search algorithm is clearly not just a brute-force examination of every micro-variation in the consequences of changing bits in the machine. Instead, large reusable blocks of subroutines get reparameterized or reused with variation.… Read the rest
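
As a loose illustration (not Carroll’s proposal or the post’s algorithm), here is a toy two-part-code search over a small library of reusable parametric blocks, each reparameterized by a crude grid search and scored by a description length that trades parameters against misfit. The penalty constant, candidate set, and synthetic data are all arbitrary.

```python
# Two-part-code scoring: bits to describe parameters plus bits for the misfit.
import itertools
import math

data = [(x, 2 * x + 1) for x in range(10)]  # secretly linear

candidates = {
    "constant":  (lambda x, p: p[0],                          1),
    "linear":    (lambda x, p: p[0] * x + p[1],               2),
    "quadratic": (lambda x, p: p[0] * x**2 + p[1] * x + p[2], 3),
}

def misfit(f, params):
    return sum((y - f(x, params)) ** 2 for x, y in data) + 1e-9

def fit(f, k):
    """Crude grid search over parameters (stands in for real learning)."""
    grid = [i * 0.5 for i in range(-6, 7)]
    return min(itertools.product(grid, repeat=k), key=lambda p: misfit(f, p))

for name, (f, k) in candidates.items():
    params = fit(f, k)
    score = 4.0 * k + math.log2(misfit(f, params))
    print(name, params, round(score, 2))
```

The quadratic block fits the data as well as the linear one but pays for its extra parameter, so the linear block wins the description-length comparison.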

Free Will and Algorithmic Information Theory (Part II)

Bad monkey

So we get some mild form of source determinism out of Algorithmic Information Complexity (AIC), but we haven’t addressed the form of free will that deals with moral culpability at all. That free will requires that we, as moral agents, are capable of making choices that have moral consequences. Another way of saying it is that, given the same circumstances, we could have done otherwise. After all, all we have is a series of if/then statements that must be implemented in wetware, and they still respond to known stimuli in deterministic ways. Just responding in model-predictable ways to new stimuli doesn’t amount directly to making choices.

Let’s expand the problem a bit, however. Instead of a lock-and-key recognition of integer “foodstuffs” we have uncertain patterns of foodstuffs and fallible recognition systems. Suddenly we have a probability problem with P(food|n) [or even P(food|q(n)) where q is some perception function] governed by Bayesian statistics. Clearly we expect evolution to optimize towards better models, though we know that all kinds of historical and physical contingencies may derail perfect optimization. Still, if we did have perfect optimization, we know what that would look like for certain types of statistical patterns.
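
A toy rendering of that P(food|q(n)) setup, with the “foodstuffs,” priors, and noise model all invented for illustration: the world presents an integer, a fallible perception function smears it, and the organism updates its belief that the pattern is food via Bayes’ rule.

```python
# Bayesian recognition of "foodstuffs" through a noisy perception function q.
import random

FOOD = {3, 5, 7}        # which integer "foodstuffs" are edible (invented)
PRIOR_FOOD = 0.4        # prior probability that a presented item is food

def q(n, noise=1):
    """Fallible perception: the observed value may be off by +/- noise."""
    return n + random.randint(-noise, noise)

def likelihood(observed, is_food, noise=1):
    """P(observed | food status), averaging over the items of that status."""
    pool = FOOD if is_food else set(range(10)) - FOOD
    hits = sum(abs(observed - n) <= noise for n in pool)
    return hits / (len(pool) * (2 * noise + 1))

def p_food_given_obs(observed):
    pf = likelihood(observed, True) * PRIOR_FOOD
    pn = likelihood(observed, False) * (1 - PRIOR_FOOD)
    return pf / (pf + pn) if (pf + pn) else 0.0

print(p_food_given_obs(q(5)))   # perceiving something near a food item
print(p_food_given_obs(q(0)))   # perceiving something near a non-food item
```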

What is an optimal induction machine? AIC and variants have been used to define that machine. First, we have Solomonoff induction from around 1960. But we also have Jorma Rissanen’s Minimum Description Length (MDL) theory from 1978 that casts the problem more in terms of continuous distributions. Variants are available, too, from Minimum Message Length, to Akaike’s Information Criterion (AIC, confusingly again), Bayesian Information Criterion (BIC), and on to Structural Risk Minimization via Vapnik-Chervonenkis learning theory.

All of these theories involve some kind of trade-off between the number of model parameters, the relative complexity of those parameters, and the success of the model on the training exemplars.… Read the rest
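
A small, synthetic illustration of that trade-off using the standard AIC and BIC formulas (2k - 2 ln L and k ln n - 2 ln L, with the usual Gaussian-error shortcut that -2 ln L is n ln(RSS/n) up to a constant): a straight-line model beats a constant model despite its extra parameter because the improvement in fit more than pays for the added complexity. The data and noise level are made up for the example.

```python
# Score a constant model and a linear model on noisy synthetic data.
import math
import random

random.seed(1)
xs = list(range(20))
ys = [0.5 * x + 2 + random.gauss(0, 1) for x in xs]   # truly linear + noise
n = len(xs)

def rss_constant():
    mean = sum(ys) / n
    return sum((y - mean) ** 2 for y in ys)

def rss_linear():
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

for name, rss, k in [("constant", rss_constant(), 1), ("linear", rss_linear(), 2)]:
    neg2logL = n * math.log(rss / n)
    aic = 2 * k + neg2logL
    bic = k * math.log(n) + neg2logL
    print(name, "AIC:", round(aic, 1), "BIC:", round(bic, 1))
```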