Notes on Pumps: Sensibilities and Framing with Algorithmic Feedback

“A sensibility is one of the hardest things to talk about.” So begins Sontag’s Notes on “Camp” in the 1964 Partisan Review. And what of the political anger and disillusionment across the United States and in the developed world? What of the gnawing desire towards superiority and control that accompanies authoritarian urges? What of the fear of loss of power to minority ethnic and religious groups? These may be the most discussed sociopolitical aspects of our modern political sensibility since Trump’s election in 2016, when a bitter, vindictive, hostile, crude, fat thug briefly took the reins of America, then pushed and conspired to oppose the election of his successor.

What attracted his followers to him? I never encountered a George W. Bush fanatic during his presidency. Though not physically small, he talked about “compassionate conservatism” with a voice that hung in the upper register of middle pitches for men. He was neither sonorous nor mean. His eyebrows often had a look of surprise and self-doubt that was hinted at in claims he was a very reluctant candidate for president. I met people who voted for him, but they seemed to regard him as an acceptable alternative to Gore or, later, to Kerry, not as a figure of passionate intrigue. Bush Jr. did receive a rally-around-the-flag effect that was based on circumstances that would later bring rebuke over the casus belli of the Iraq War. Similar sensibilities held during the Obama years: there was a low positivity for him on the Left combined with a mildly deranged antagonism towards him on the Right.

Was the lack of Trump-like animating fanaticism due to the feeling that Bush Jr. was a compromise made to the electorate while Trump was, finally, a man who expressed the real hostility of those who vote Republican?… Read the rest

We Are Weak Chaos

Recent work in deep learning networks has been largely driven by the capacity of modern computing systems to compute gradient descent over very large networks. We use gaming cards with GPUs that are great for parallel processing to perform the matrix multiplications and summations that are the primitive operations central to artificial neural network formalisms. Conceptually, another primary advance is the pre-training of networks as autocorrelators that helps with smoothing out later “fine tuning” training programs over other data. There are some additional contributions that are notable in impact and that reintroduce the rather old idea of recurrent neural networks, networks with outputs attached back to inputs that create resonant kinds of running states within the network. The original motivation of such architectures was to emulate the vast interconnectivity of real neural systems and to capture a more temporal appreciation of data where past states affect ongoing processing, rather than a pure feed-through architecture. Neural networks are already nonlinear systems, so adding recurrence just ups the complexity of trying to figure out how to train them. Treating them as black boxes and using evolutionary algorithms was fashionable for me in the 90s, though the computing capabilities just weren’t up for anything other than small systems, as I found out when chastised for overusing a Cray at Los Alamos.
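To make the recurrence idea concrete, here is a minimal sketch of a single recurrent unit whose output is fed back in as part of its next input. The function name, the tanh nonlinearity, and the weight values are assumptions chosen for illustration, not taken from any particular architecture.

```python
from math import tanh

def run_recurrent(inputs, w_in=1.0, w_rec=0.8):
    """Run one recurrent unit over a sequence.

    w_in scales the current input; w_rec scales the unit's own previous
    output. Both values are illustrative assumptions, not learned weights.
    """
    state = 0.0
    states = []
    for x in inputs:
        # The previous state re-enters the computation alongside the new input.
        state = tanh(w_in * x + w_rec * state)
        states.append(state)
    return states

if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    # With w_rec = 0 this is a pure feed-through unit: the impulse vanishes at once.
    print(run_recurrent(impulse, w_rec=0.0))
    # With feedback, the impulse lingers as a decaying running state.
    print(run_recurrent(impulse, w_rec=0.8))
```

Setting w_rec to zero recovers the feed-through case, so the difference between the two runs is entirely due to the past state affecting ongoing processing.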

But does any of this have anything to do with real brain systems? Perhaps. Here’s Toker et al., “Consciousness is supported by near-critical slow cortical electrodynamics,” in Proceedings of the National Academy of Sciences (with the unenviable acronym PNAS). The researchers and clinicians studied the electrical activity of macaque and human brains in a wide variety of states: epileptics undergoing seizures, macaque monkeys sleeping, people on LSD, those under the effects of anesthesia, and people with disorders of consciousness.… Read the rest

Triangulation Machinery, Poetry, and Politics

I was reading Muriel Rukeyser’s poetry and marveling at some of the lucid yet novel constructions she employs. I was trying to avoid the grueling work of comparing and contrasting Biden’s speech on the anniversary of January 6th, 2021 with the responses from various Republican defenders of Trump. Both pulled into focus the effect of semantic and pragmatic framing as part of the poetic and political processes, respectively. Sorry, Muriel, I just compared your work to the slow boil of democracy.

Reaching in interlaced gods, animals, and men.
There is no background. The figures hold their peace
In a web of movement. There is no frustration,
Every gesture is taken, everything yields connections.

There is a theory about how language works that I’ve discussed here before. In this theory, from Donald Davidson primarily, the meaning of words and phrases is tied directly to a shared interrogation of what each person is trying to convey. Imagine a child observing a dog while a parent says “dog” and is fairly consistent with that usage across several different breeds that are presented to the child. The child may overuse the word, calling a cat a dog at some point, at which point the parent corrects the child with “cat” and the child proceeds along through this interrogatory process, triangulating in on the meaning of dog versus cat. Triangulation is Davidson’s term, reflecting three parties: two people discussing a thing or idea. In the case of human children, we also know that there are some innate preferences the child will apply during the triangulation process, like preferring “whole object” semantics to atomized ones, and assuming different words mean different things even when applied to the same object: so “canine” and “dog” must refer to the same object in slightly different ways since they are differing words, and indeed they do: dog IS-A canine but not vice versa.… Read the rest

A Learning Smorgasbord

Compliments of a discovery by Futurism, the paper The Autodidactic Universe by a smorgasbord of contemporary science and technology thinkers caught my attention for several reasons. First was Jaron Lanier as a co-author. I knew Jaron’s dad, Ellery, when I was a researcher at NMSU’s now defunct Computing Research Laboratory. Ellery had returned to school to get his psychology PhD during retirement. In an odd coincidence, my brother had also rented a trailer next to the geodesic dome that Jaron helped design and where Ellery lived, after my brother became emancipated in his teens. Ellery may have been his landlord, but I am not certain of that.

The paper is an odd piece of kit that I read over two days in fits and spurts with intervening power lifting interludes (I recently maxed out my Bowflex and am considering next steps!). It initially has the feel of physicists trying to reach into machine learning as if the domain specialists clearly missed something that the hardcore physical scientists have known all along. But that concern dissipated fairly quickly and the paper settled into showing isomorphisms between various physical theories and the state evolution of neural networks. OK, no big deal. Perhaps they were taken by the realization that the mathematics of tensors was a useful way to describe network matrices and gradient descent learning. They then riffed on that and looked at the broader similarities between the temporal evolution of learning and quantum field theory, approaches to quantum gravity, and cosmological ideas.

The paper, being a smorgasbord, then investigates the time evolution of graphs using a lens of graph theory. The core realization, as I gleaned it, is that there are more complex graphs (visually as well as based on the diversity of connectivity within the graph) and pointlessly uniform or empty ones.… Read the rest

Distributed Contexts in the Language Game

The meaning of words and phrases can be a bit hard to pin down. Indeed, the meaning of meaning itself is problematical. I can point to a dictionary and say, well, there is where we keep the meanings of things, but that is just a record of the way in which we use the language. I’m personally fond of a kind of philosophical perspective on this matter of meaning that relies on a form of holism. That is, words and their meanings are defined by our usages of them, our historical interactions with them in different contexts, and subtle distinctive cues that illuminate how words differ and compare. Often, but not always, the words are tied to things in the world, as well, and therefore have a fastness that resists distortions and distinctions.

This is, of course, a critical area of inquiry when trying to create intelligent machines that deal with language. How do we imbue the system with meaning, represent it within the machine, and apply it to novel problems that show intelligent behavior? In approaching the problem, we must therefore be achieving some semblance of intelligence in a fairly rigorous way since we are simulating it with logical steps.

The history of philosophical and linguistic interest in these topics is fascinating, ranging from Wittgenstein’s notion of a language game that builds up rules of use to Firth’s formalization of the collocation of words as critical to meaning. In artificial intelligence, this concept of collocation has been expanded further to include interchangeability of contexts. Thus, boat and ship occur in more similar contexts than boat and bank.
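The boat, ship, and bank comparison can be sketched with simple co-occurrence counts. The toy corpus below is invented for illustration, and the window size and cosine measure are assumptions; real systems use far larger corpora and more refined weighting.

```python
from collections import Counter
from math import sqrt

# Toy corpus invented for this example; not drawn from any real data set.
CORPUS = [
    "the boat sailed across the harbor",
    "the ship sailed across the harbor",
    "the boat docked at the pier",
    "the ship docked at the pier",
    "the bank approved the loan",
    "the bank raised the interest rate",
]

def context_vector(target, sentences, window=2):
    """Count the words appearing within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

if __name__ == "__main__":
    boat = context_vector("boat", CORPUS)
    ship = context_vector("ship", CORPUS)
    bank = context_vector("bank", CORPUS)
    print("boat vs ship:", cosine(boat, ship))
    print("boat vs bank:", cosine(boat, bank))
```

In this tiny corpus boat and ship share every neighboring word, while boat and bank overlap only on the function word "the", which is why practical approaches usually down-weight very frequent words.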

A general approach to acquiring these contexts is based on the idea of dimensionality reduction in various forms.… Read the rest

Intelligent Borrowing

There has been a continuous bleed of biological, philosophical, linguistic, and psychological concepts into computer science since the 1950s. Artificial neural networks were inspired by real ones. Simulated evolution was designed around metaphorical patterns of natural evolution. Philosophical, linguistic, and psychological ideas transferred as knowledge representation and grammars, both natural and formal.

Since computer science is a uniquely synthetic kind of science and not quite a natural one, borrowing and applying metaphors seems to be part of the normal mode of advancement in this field. There is a purely mathematical component to the field in the fundamental questions around classes of algorithms and what is computable, but there are also highly synthetic issues that arise from architectures that are contingent on physical realizations. Finally, the application to simulating intelligent behavior relies largely on three separate modes of operation:

  1. Hypothesize about how intelligent beings perform such tasks
  2. Import metaphors based on those hypotheses
  3. Given initial success, use considerations of statistical features and their mappings to improve on the imported metaphors (and, rarely, improve with additional biological insights)

So, for instance, we import a simplified model of neural networks as connected sets of weights representing some kind of variable activation or inhibition potentials combined with sudden synaptic firing. Abstractly we already have an interesting kind of transfer function that takes a set of input variables and has a nonlinear mapping to the output variables. It’s interesting because being nonlinear means it can potentially compute very difficult relationships between the input and output.
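As a rough illustration of that transfer function, the sketch below builds units that sum weighted inputs and then apply a threshold. The weights are hand-picked assumptions rather than learned values, and exclusive-or stands in for a relationship that no single linear threshold unit can compute.

```python
def step(x):
    """Threshold activation: the unit fires (1) when its summed input is positive."""
    return 1 if x > 0 else 0

def layer(inputs, weights, biases, activation=step):
    """One layer: each unit sums its weighted inputs plus a bias, then applies the nonlinearity."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

# Hand-picked illustrative weights (assumed, not learned).
HIDDEN_W = [[1, 1], [1, 1]]
HIDDEN_B = [-0.5, -1.5]   # first unit behaves like OR, second like AND
OUTPUT_W = [[1, -1]]
OUTPUT_B = [-0.5]         # fires for "OR but not AND"

def xor_net(a, b):
    """Map two binary inputs to their exclusive-or via one hidden layer."""
    hidden = layer([a, b], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_net(a, b))
```

The thresholding is what makes the mapping nonlinear; replacing `step` with the identity would collapse the whole thing into a single weighted sum.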

But we see limitations, immediately, and these are observed in the history of the field. For instance, if you just have a single layer of these simulated neurons, the system isn’t fundamentally complex enough to compute any complex functions, so we add a few layers and then more and more.… Read the rest

The Pregnant Machinery of Metaphor

Sylvia Plath has some thoughts about pregnancy in “Metaphors”:

I’m a riddle in nine syllables,
An elephant, a ponderous house,
A melon strolling on two tendrils.
O red fruit, ivory, fine timbers!
This loaf’s big with its yeasty rising.
Money’s new-minted in this fat purse.
I’m a means, a stage, a cow in calf.
I’ve eaten a bag of green apples,
Boarded the train there’s no getting off.

It seems, at first blush, that metaphors have some creative substitutive similarity to the concept they are replacing. We can imagine Plath laboring over this child in nine lines, fitting the pieces together, cutting out syllables like dangling umbilicals, finding each part in a bulging conception, until it was finally born, alive and kicking.

OK, sorry, I’ve gone too far, fallen over a cliff, tumbled down through a ravine, dropped into the foaming sea, where I now bob, like your uncle. Stop that!

Let’s assume that much human creativity occurs through a process of metaphor or analogy making. This certainly seems to be the case in aspects of physics when dealing with difficult to understand new realms of micro- and macroscopic phenomena, as I’ve noted here. Some creative fields claim a similar basis for their work, with poetry being explicit about the hidden or secret meaning of poems. Moreover, I will also suppose that a similar system operates in terms of creating networks of semantics by which we understand the meaning of words and their relationships to phenomena external to us, as well as our own ideas. In other words, semantics are a creative puzzle for us.

What do we know about this system and how can we create abstract machines that implement aspects of it?… Read the rest

One Shot, Few Shot, Radical Shot

Exunoplura is back up after a sad excursion through the challenges of hosting providers. To be blunt, they mostly suck. Between systems that just don’t work right (SSL certificate provisioning in this case) and bad to counterproductive support experiences, it’s enough to make one want to host it oneself. But hosting is mostly, as they say of war, long boring periods punctuated by moments of terror as things go frustratingly sideways. But we are back up again after two hosting provider side-trips!

Honestly, I’d like to see an AI agent effectively navigate through these technological challenges. Where even human performance is fleeting and imperfect, the notion that an AI could learn how to deal with the uncertain corners of the process strikes me as currently unthinkable. But there are some interesting recent developments worth noting and discussing in the journey towards what is named “general AI” or a framework that is as flexible as people can be, rather than narrowly tied to a specific task like visually inspecting welds or answering a few questions about weather, music, and so forth.

First, there is the work by the OpenAI folks on massive language models being tested against one-shot or few-shot learning problems. In each of these learning problems, the number of presentations of the training data cases is limited, rather than presenting huge numbers of exemplars and “fine tuning” the response of the model. What is a language model? Well, it varies across different approaches, but typically is a weighted context of words of varying length, with the weights reflecting the probabilities of those words in those contexts over a massive collection of text corpora. For the OpenAI model, GPT-3, the total number of parameters (words/contexts and their counts) is an astonishing 175 billion, using 45 TB of text to train the model.… Read the rest

Ensembles Against Abominables

It seems obvious to me that when we face existential threats we should make the best possible decisions. I do this with respect to investment decisions, as well. I don’t rely on “guts” or feelings or luck or hope or faith or hunches or trends. All of those ideas are proxies for some sense of incompleteness in our understanding of probabilities and future outcomes.

So how can we cope with those kinds of uncertainties given existential threats? The core methodology is based on ensembles of predictions. We don’t actually want to trust an expert per se, but want instead to trust a basket of expert opinions—an ensemble of predictions. Ideally, those experts who have been more effective in the past should be given greater weight than those who have made poorer predictions. We most certainly should not rely on gut calls by abominable narcissists in what Chauncey Devega at Salon disturbingly characterizes as a “pathological kakistocracy.”
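One simple way to give better past performers more say is to weight each expert by the inverse of their historical error. This is only a sketch: the expert names, error figures, and forecasts are hypothetical, and inverse-error weighting is one assumed scheme among many.

```python
def inverse_error_weights(past_errors):
    """Turn each expert's mean past absolute error into a normalized weight.

    Smaller past error means larger weight. The small constant guards
    against division by zero for an expert with a perfect record.
    """
    raw = {name: 1.0 / (err + 1e-9) for name, err in past_errors.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

def ensemble_forecast(forecasts, weights):
    """Combine individual forecasts into one weighted prediction."""
    return sum(weights[name] * forecasts[name] for name in forecasts)

if __name__ == "__main__":
    # Hypothetical track records and current forecasts, for illustration only.
    past_errors = {"expert_a": 1.0, "expert_b": 2.0, "expert_c": 4.0}
    forecasts = {"expert_a": 10.0, "expert_b": 14.0, "expert_c": 30.0}

    weights = inverse_error_weights(past_errors)
    print("weights:", weights)
    print("weighted ensemble:", ensemble_forecast(forecasts, weights))
    print("unweighted mean:", sum(forecasts.values()) / len(forecasts))
```

The weighted result sits closer to the forecast of the expert with the best record than the plain average does, while still letting every expert contribute.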

Investment decision-making takes exactly this form, when carried out rationally. Index funds adjust their security holdings in relationship to an index like the S&P 500. Since stock markets have risen since their inceptions with, of course, setbacks along the way, an index is a reliable ensemble approach to growth. Ensembles smooth predictions and smooth out brittleness.

Ensemble methods are also core to predictive improvements in machine learning. While a single decision tree trained on data may overweight portions of the data set, an ensemble of trees (which we call a forest, of course) smooths the decision making by having each tree become only a part of the final vote for a prediction. The training of the individual trees is based on a randomized subset of the data, allowing for specialization of stands of trees, but preserving overall effectiveness of the system.… Read the rest

Forever Uncanny

Quanta has a fair roundup of recent advances in deep learning. Most interesting is the recent performance on natural language understanding tests that is close to or exceeds mean human performance. Inevitably, John Searle’s Chinese Room argument is brought up, though the author of the Quanta article suggests that inferring the Chinese translational rule book from the data itself is slightly different from the original thought experiment. In the Chinese Room there is a person who knows no Chinese but has a collection of translational reference books. She receives texts through a slot and dutifully looks up the translation of the text and passes out the result. “Is this intelligence?” is the question and it serves as a challenge to the Strong AI hypothesis. With statistical machine translation methods (and their alternative mechanistic implementation, deep learning), the rule books have been inferred by looking at translated texts (“parallel” texts as we say in the field). By looking at a large enough corpus of parallel texts, greater coverage of translated variants is achieved as well as some inference of pragmatic issues in translation and corner cases.

As a practical matter, it should be noted that modern, professional translators often use translation memory systems that contain idiomatic—or just challenging—phrases that they can reference when translating new texts. The understanding resides in the original translator’s head, we suppose, and in the correct application of the rule to the new text by checking for applicability according to, well, some other criteria that the translator brings to bear on the task.

In the General Language Understanding Evaluation (GLUE) tests described in the Quanta article, the systems are inferring how to answer Wh-style queries (who, what, where, when, and how) as well as identify similar texts.… Read the rest