Free Will and Algorithmic Information Theory (Part II)

Bad monkey

So we get some mild form of source determinism out of Algorithmic Information Complexity (AIC), but we haven’t addressed the form of free will that deals with moral culpability at all. That free will requires that we, as moral agents, are capable of making choices that have moral consequences. Another way of saying it is that given the same circumstances we could have done otherwise. After all, all we have is a series of if/then statements that must be implemented in wetware and they still respond to known stimuli in deterministic ways. Just responding in model-predictable ways to new stimuli doesn’t amount directly to making choices.

Let’s expand the problem a bit, however. Instead of a lock-and-key recognition of integer “foodstuffs” we have uncertain patterns of foodstuffs and fallible recognition systems. Suddenly we have a probability problem with P(food|n) [or even P(food|q(n)) where q is some perception function] governed by Bayesian statistics. Clearly we expect evolution to optimize towards better models, though we know that all kinds of historical and physical contingencies may derail perfect optimization. Still, if we did have perfect optimization, we know what that would look like for certain types of statistical patterns.
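To make the probability problem concrete, here is a minimal sketch of Bayes' rule applied to a fallible recognizer. The function name and every number below are hypothetical, chosen only for illustration: a prior chance that an item is food, and the chances that the perception system fires on food and on non-food.

```python
def posterior_food(prior_food, p_signal_given_food, p_signal_given_not_food):
    """Return P(food | signal) using Bayes' rule."""
    # Total probability of seeing the signal at all.
    evidence = (p_signal_given_food * prior_food
                + p_signal_given_not_food * (1.0 - prior_food))
    return p_signal_given_food * prior_food / evidence


# Hypothetical values: 20% of items are food, the recognizer fires on
# 90% of food items and (wrongly) on 10% of non-food items.
p = posterior_food(prior_food=0.2,
                   p_signal_given_food=0.9,
                   p_signal_given_not_food=0.1)
print(f"P(food | signal) = {p:.3f}")
```

Even with a fairly reliable recognizer, the posterior stays well below certainty because food is rare under the assumed prior.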

What is an optimal induction machine? AIC and variants have been used to define that machine. First, we have Solomonoff induction from around 1960. But we also have Jorma Rissanen’s Minimum Description Length (MDL) theory from 1978 that casts the problem more in terms of continuous distributions. Variants are available, too, from Minimum Message Length, to Akaike’s Information Criterion (AIC, confusingly again), Bayesian Information Criterion (BIC), and on to Structural Risk Minimization via Vapnik-Chervonenkis learning theory.
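As a rough illustration of how two of the criteria named above penalize complexity, the sketch below computes Akaike's and the Bayesian Information Criterion using their standard textbook forms, 2k - 2 ln L and k ln n - 2 ln L. The two candidate models and their log-likelihoods are made-up values, not results from any real fit.

```python
import math


def akaike_ic(log_likelihood, k):
    """Akaike's Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bayesian_ic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical fits on n = 100 observations: the richer model fits slightly
# better but uses three times as many parameters.
n = 100
simple_model = {"log_likelihood": -120.0, "k": 2}
rich_model = {"log_likelihood": -118.5, "k": 6}

for name, m in (("simple", simple_model), ("rich", rich_model)):
    print(name,
          akaike_ic(m["log_likelihood"], m["k"]),
          round(bayesian_ic(m["log_likelihood"], m["k"], n), 2))
```

With these assumed numbers both criteria prefer the simpler model, and BIC penalizes the extra parameters more heavily than AIC does.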

All of these theories involve some kind of trade-off between model parameters, the relative complexity of model parameters, and the success of the model on the trained exemplars.… Read the rest

Free Will and Algorithmic Information Theory

I was recently looking for examples of applications of algorithmic information theory, also commonly called algorithmic information complexity (AIC). After all, for a theory to be sound is one thing, but when it is sound and valuable it moves to another level. So, first, let’s review the broad outline of AIC. AIC begins with the problem of randomness, specifically random strings of 0s and 1s. We can readily see that given any sort of encoding in any base, strings of characters can be reduced to a binary sequence. Likewise integers.

Now, AIC states that there are often many Turing machines that could generate a given string and, since we can represent those machines also as a bit sequence, there is at least one machine that has the shortest bit sequence while still producing the target string. In fact, if the shortest machine is as long as the string itself, or a bit longer (given some machine encoding requirements), then the string is said to be AIC random. In other words, no compression of the string is possible.
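The shortest generating machine cannot be computed in general, but an off-the-shelf compressor gives a crude upper bound on description length and makes the intuition visible. The sketch below uses Python's zlib as that stand-in and works on raw bytes rather than literal 0/1 characters; both choices are assumptions for illustration only.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (an upper-bound proxy only)."""
    return len(zlib.compress(data, 9))


patterned = b"01" * 5000      # 10,000 bytes with an obvious short description
noisy = os.urandom(10000)     # 10,000 bytes from the operating system's RNG

print("patterned:", len(patterned), "->", compressed_size(patterned))
print("noisy:    ", len(noisy), "->", compressed_size(noisy))
```

The patterned input shrinks to a tiny fraction of its length, while the random input does not shrink at all, which is the practical face of "no compression of the string is possible."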

Moreover, we can generalize this generator machine idea to claim that given some set of strings that represent the data of a given phenomenon (let’s say natural occurrences), the smallest generator machine that covers all the data is a “theoretical model” of the data and the underlying phenomenon. An interesting outcome of this theory is that it can be shown that there is, in fact, no algorithm (or meta-machine) that can find the smallest generator for any given sequence. This is related to the undecidability of Turing’s halting problem.

In terms of applications, Gregory Chaitin, who is one of the originators of the core ideas of AIC, has proposed that the theory sheds light on questions of meta-mathematics and specifically that it demonstrates that mathematics is a quasi-empirical pursuit capable of producing new methods rather than being idealistically derived from analytic first-principles.… Read the rest

The Elusive in Art and Artificial Intelligence

Deep Dream (deepdreamgenerator.com) of my elusive inner Van Gogh.

How exactly deep learning models do what they do is at least elusive. Take image recognition as a task. We know that there are decision-making criteria inferred by the hidden layers of the networks. In Convolutional Neural Networks (CNNs), we have further knowledge that locally-receptive fields (or their simulated equivalent) provide a collection of filters that emphasize image features in different ways, from edge detection to rotation-invariant reductions prior to being subjected to a learned categorizer. Yet, the dividing lines between a chair and a small loveseat, or between two faces, are hidden within some non-linear equation composed of these field representations with weights tuned by exemplar presentation.
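For a sense of what one such filter does, here is a minimal sketch in plain Python that slides a fixed 3x3 Sobel-style kernel over a tiny synthetic image. The image, the kernel choice, and the helper name are all illustrative assumptions; a real CNN learns its kernel weights from exemplars rather than having them hand-set.

```python
def correlate2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Synthetic 4x6 image: dark on the left, bright on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hand-set kernel that responds to left-to-right changes in brightness.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

response = correlate2d(image, sobel_x)
for row in response:
    print(row)
```

The response is zero over the flat regions and positive only where the window straddles the dark-to-bright boundary, which is all "edge detection" means at this level.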

This elusiveness was at least part of the reason that neural networks and, generally, machine learning-based approaches have had a complicated position in AI research; if you can’t explain how they work, or even fairly characterize their failure modes, maybe we should work harder to understand the support for those decision criteria rather than just build black boxes to execute them?

So when groups use deep learning to produce visual artworks like the recently auctioned work sold by Christie’s for USD 432K, we can be reassured that the murky issue of aesthetics in art appreciation is at least paired with elusiveness in the production machine.

Or is it?

Let’s take Wittgenstein’s ideas about aesthetics as a perhaps slightly murky point of comparison. In Wittgenstein, we are almost always looking at what are effectively games played between and among people. In language, the rules are shared in a culture, a community, and even between individuals. These are semantic limits, dialogue considerations, standardized usages, linguistic pragmatics, expectations, allusions, and much more.… Read the rest

Indifference and the Cosmos

I am a political independent, though that does not mean that I vote willy-nilly. I have, in fact, been reliably center left for most of my adult life, save one youthfully rebellious moment when I voted Libertarian, more as a statement than a commitment to the principles of libertarianism per se. I regret that vote now, given additional exposure to the party and the kinds of people it attracts. To me, the extremes of the American political system build around radical positions, and the increasingly noxious conspiracy theories and unhinged rhetoric are nothing like the cautious, problem-solving utopia that might make me politically happy, or at least wince less.

Some might claim I am indifferent. I would not argue with that. In the face of revolution, I would require a likely impossible proof of a better outcome before committing. How can we possibly see into such a permeable and contingent future, or weigh the goods and harms in the face of the unknown? This idea of indifference, as a tempering of our epistemic insights, serves as a basis for an essential idea in probabilistic reasoning where it even has the name, the principle of indifference, or, variously, and in contradistinction with Leibniz’s principle of sufficient reason, the principle of insufficient reason.

So how does indifference work in probabilistic reasoning? Consider a Bayesian formulation: we inductively guess based on a combination of a priori probabilities and a posteriori evidence. What is the likelihood of the next word in an English sentence being “is”? Indifference suggests that we treat each word as likely as any other, but we know straight away that “is” occurs much more often than “Manichaeistic” in English texts because we can count words.… Read the rest
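The contrast between indifference and counting can be sketched in a few lines. The toy corpus below is invented for illustration and is far too small to say anything about real English; the point is only the difference between a uniform prior over the vocabulary and relative frequencies taken from counts.

```python
from collections import Counter

# Hypothetical toy corpus standing in for "English texts".
corpus = "the cat is on the mat and the dog is in the yard".split()

counts = Counter(corpus)
vocab = sorted(counts)

# Principle of indifference: every word type is equally likely.
uniform = {w: 1 / len(vocab) for w in vocab}

# Counting: probability proportional to observed frequency.
empirical = {w: counts[w] / len(corpus) for w in vocab}

for word in ("is", "yard"):
    print(word, round(uniform[word], 3), round(empirical[word], 3))
```

Under indifference "is" and "yard" get the same probability; once we count, "is" pulls ahead, which is the evidence correcting the uninformed prior.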

Incompressibility and the Mathematics of Ethical Magnetism

One of the most intriguing aspects of the current U.S. border crisis is the way that human rights and American decency get articulated in the public sphere of discourse. An initial pull is raw emotion and empathy, then there are counterweights where the long-term consequences of existing policies are weighed against the exigent effects of the policy, and then there are crackpot theories of “crisis actors” and whatnot as bizarro-world distractions. But, if we accept the general thesis of our enlightenment values carrying us ever forward into increasing rights for all, reduced violence and war, and the closing of the curtain on the long human history of despair, poverty, and hunger, we must also ask more generally how this comes to be. Steven Pinker certainly has rounded up some social theories, but what kind of meta-ethics might be at work that seems to push human civilization towards these positive outcomes?

Per the last post, I take the position that we can potentially formulate meaningful sentences about what “ought” to be done, and that those meaningful sentences are, in fact, meaningful precisely because they are grounded in the semantics we derive from real world interactions. How does this work? Well, we can invoke the so-called Cornell Realists’ argument that the semantics of a word like “ought” is not as flexible as Moore’s Open Question argument suggests. Indeed, if we instead look at the natural world and the theories that we have built up about it (generally “scientific theories” but, also, perhaps “folk scientific ideas” or “developing scientific theories”), certain concepts take on the character of being so-called “joints of reality.” That is, they are less changeable than other concepts and become referential magnets that have an elite status among the concepts we use for the world.… Read the rest

Running, Ancient Roman Science, Arizona Dive Bars, and Lightning Machine Learning

I just returned from running in Chiricahua National Monument, Sedona, Painted Desert, and Petrified Forest National Park, taking advantage of the late spring before the heat becomes too intense. Even so, though I got to Massai Point in Chiricahua through 90+ degree canyons and had around a liter of water left, I still had to slow down and walk out after running short of liquid nourishment two-thirds down. There is an eerie, uncertain nausea that hits when hydration runs low under high stress. Cliffs and steep ravines take on a wolfish quality. The mind works to control feet against stumbling and the lips get serrated edges of parched skin that bite off without relieving the dryness.

I would remember that days later as I prepped to overnight with a wilderness permit in Petrified Forest only to discover that my Osprey Exos pack frame had somehow been bent, likely due to excessive manhandling by airport checked baggage weeks earlier. I considered my options and drove eighty miles to Flagstaff to replace the pack, then back again.

I arrived in time to join Dr. Richard Carrier in an unexpected dive bar in Holbrook, Arizona as the sunlight turned to amber and a platoon of Navajo pool sharks descended on the place for billiards and beers. I had read that Dr. Carrier would be stopping there and it was convenient to my next excursion, so I picked up signed copies of his new book, The Scientist in the Early Roman Empire, as well as his classic, On the Historicity of Jesus, which remains part of the controversial samizdat of so-called “Jesus mythicism.”

If there is a distinguishing characteristic of OHJ it is the application of Bayesian Theory to the problems of historical method.… Read the rest

Instrumentality and Terror in the Uncanny Valley

I got an Apple HomePod the other day. I have several AirPlay speakers already, two in one house and a third in my separate office. The latter, a Naim Mu-So, combines AirPlay with internet radio and Bluetooth, but I mostly use it for the streaming radio features (KMozart, KUSC, Capital Public Radio, etc.). The HomePod’s Siri implementation combined with Apple Music allows me to voice control playlists and experiment with music that I wouldn’t generally have bothered to buy and own. I can now sample at my leisure without needing to broadcast via a phone or tablet or computer. Steve Reich, Bill Evans, Thelonious Monk, Bach organ mixes, variations of Tristan and Isolde, and, yesterday, when I asked for “workout music” I was gifted with Springsteen’s Born to Run, which I would never have associated with working out, but now I have dying on the mean streets of New Jersey with Wendy in some absurd drag race conflagration replaying over and over again in my head.

Right after setup, I had a strange experience. I was shooting random play thoughts to Siri, then refining them and testing the limits. There are many, as reviewers have noted. Items easily found in Apple Music are occasionally fails for Siri in HomePod, but simple requests and control of a few HomeKit devices work acceptably. The strange experience was my own trepidation over barking commands at the device, especially when I was repeating myself: “Hey Siri. Stop. Play Bill Evans. Stop. Play Bill Evans’ Peace Piece.” (Oh my, homophony, what will happen? It works.) I found myself treating Siri as a bit of a human being in that I didn’t want to tell her to do a trivial task that I had just asked her to perform.… Read the rest

Black and Gray Boxes with Autonomous Meta-Cognition

Vijay Pande of VC Andreessen Horowitz (who passed on my startups twice but, hey, it’s just business!) has a relevant article in The New York Times concerning fears of the “black box” of deep learning and related methods: is the lack of explainability and limited capacity for interrogation of the underlying decision making a deal-breaker for applications to critical areas like medical diagnosis or parole decisions? His point is simple, and related to the previous post’s suggestion of the potential limitations of our capacity to truly understand many aspects of human cognition. Even the doctor may only be able to point to a nebulous collection of clinical experiences when it comes to certain observational aspects of their jobs, like in reading images for indicators of cancer. At least the algorithm has been trained on a significantly larger collection of data than the doctor could ever encounter in a professional lifetime.

So the human is almost as much a black box (maybe a gray box?) as the algorithm. One difference that needs to be considered, however, is that the deep learning algorithm might make unexpected errors when confronted with unexpected inputs. The classic example from the early history of artificial neural networks involved a DARPA test of detecting military tanks in photographs. The apocryphal to legendary formulation of the story is that there was a difference in the cloud cover between the tank images and the non-tank images. The end result was that the system performed spectacularly on the training and test data sets but then failed miserably on new data that lacked the cloud cover factor. I recalled this slightly differently recently and substituted film grain for the cloudiness. In any case, it became a discussion point about the limits of data-driven learning that showed how radically incorrect solutions could be created without careful understanding of how the systems work.… Read the rest

Deep Simulation in the Southern Hemisphere

I’m unusually behind in my postings due to travel. I’ve been prepping for, and am now deep inside, a fresh pass through New Zealand after two years away. The complexity of the place seems to have a certain draw for me that has lured me back, yet again, to backcountry tramping amongst the volcanoes and glaciers, and to leisurely beachfront restaurants painted with eruptions of summer flowers fueled by the regular rains.

I recently wrote a technical proposal that rounded up a number of the most recent advances in deep learning neural networks. In each case, like with Google’s transformer architecture, there is a modest enhancement that is based on a realization of a deficit in the performance of one of two broad types of networks, recurrent and convolutional.

An old question is whether we learn anything about human cognition if we just simulate it using some kind of automatically learning mechanism. That is, if we use a model acquired through some kind of supervised or unsupervised learning, can we say we know anything about the original mind and its processes?

We can at least say that the learning methodology appears to be capable of achieving the technical result we were looking for. But it also might mean something a bit different: that there is not much more interesting going on in the original mind. In this radical corner sits the idea that cognitive processes in people are tactical responses left over from early human evolution. All you can learn from them is that they may be biased and tilted towards that early human condition, but beyond that things just are the way they turned out.

If we take this position, then, we might have to discard certain aspects of the social sciences.… Read the rest

I, Robot and Us

What happens if artificial intelligence (AI) technologies become significant economic players? The topic has come up in various ways for the past thirty years, perhaps longer. One model, the so-called technological singularity, posits that self-improving machines may be capable of a level of knowledge generation and disruption that will eliminate humans from economic participation. How far out this singularity might be is a matter of speculation, but I have my doubts that we really understand intelligence enough to start worrying about the impacts of such radical change.

Barring something essentially unknowable because we lack sufficient priors to make an informed guess, we can use evidence of the impact of mechanization on certain economic sectors, like agribusiness or transportation manufacturing, to try to plot out how mechanization might impact other sectors. Aghion, Jones, and Jones’ Artificial Intelligence and Economic Growth takes a deep dive into the topic. The math is not particularly hard, though the reasons for many of the equations are tied up in macro and microeconomic theory that requires a specialist’s understanding to fully grok.

Of special interest are the potential limiting roles of inputs and organizational competition. For instance, automation speed-ups may be limited by human limitations within the economic activity. This may extend even further due to fundamental limitations of physics for a given activity. The pointed example is that power plants are limited by thermodynamics; no amount of additional mechanization can change that. Other factors related to inputs or the complexity of a certain stage of production may also drag economic growth to a capped, limiting level.

Organizational competition and intellectual property considerations come into play, as well. While the authors suggest that corporations will remain relevant, they should become more horizontal by eliminating much of the middle tier of management and outsourcing components of their productivity.… Read the rest