Free Will and Algorithmic Information Theory (Part II)

Bad monkey

So we get some mild form of source determinism out of Algorithmic Information Complexity (AIC), but we haven’t addressed the form of free will that deals with moral culpability at all. That free will requires that we, as moral agents, be capable of making choices that have moral consequences. Another way of saying it is that, given the same circumstances, we could have done otherwise. After all, all we have is a series of if/then statements implemented in wetware, and they still respond to known stimuli in deterministic ways. Just responding in model-predictable ways to new stimuli doesn’t directly amount to making choices.

Let’s expand the problem a bit, however. Instead of a lock-and-key recognition of integer “foodstuffs” we have uncertain patterns of foodstuffs and fallible recognition systems. Suddenly we have a probability problem with P(food|n) [or even P(food|q(n)) where q is some perception function] governed by Bayesian statistics. Clearly we expect evolution to optimize towards better models, though we know that all kinds of historical and physical contingencies may derail perfect optimization. Still, if we did have perfect optimization, we know what that would look like for certain types of statistical patterns.
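To make the P(food|q(n)) idea concrete, here is a minimal sketch of a Bayesian update through a fallible perception function. The noise level, priors, and likelihood tables are illustrative assumptions, not anything derived above:

```python
# A minimal sketch of P(food | q(n)): a fallible perception function q feeds a
# Bayesian update. All priors, likelihoods, and the noise model are illustrative.

import random

def q(n: int, noise: float = 0.2) -> int:
    """Fallible perception: usually report n, but occasionally off by one."""
    if random.random() < noise:
        return n + random.choice([-1, 1])
    return n

def posterior_food(observation: int,
                   prior_food: float,
                   likelihood_food: dict,
                   likelihood_not_food: dict) -> float:
    """Bayes' rule: P(food | q(n)) from P(q(n) | food) and the prior."""
    p_obs_given_food = likelihood_food.get(observation, 1e-6)
    p_obs_given_not = likelihood_not_food.get(observation, 1e-6)
    numerator = p_obs_given_food * prior_food
    denominator = numerator + p_obs_given_not * (1.0 - prior_food)
    return numerator / denominator

# Toy world: the creature believes even integers are usually food.
likelihood_food = {2: 0.4, 4: 0.4, 3: 0.1, 5: 0.1}       # P(q(n) | food)
likelihood_not_food = {2: 0.1, 4: 0.1, 3: 0.4, 5: 0.4}   # P(q(n) | not food)

obs = q(4)
print(obs, posterior_food(obs, prior_food=0.5,
                          likelihood_food=likelihood_food,
                          likelihood_not_food=likelihood_not_food))
```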

What is an optimal induction machine? AIC and its variants have been used to define that machine. First, we have Solomonoff induction from around 1960. But we also have Jorma Rissanen’s Minimum Description Length (MDL) theory from 1978, which casts the problem more in terms of continuous distributions. Variants abound, from Minimum Message Length (MML) to Akaike’s Information Criterion (AIC, confusingly again), the Bayesian Information Criterion (BIC), and on to Structural Risk Minimization from Vapnik-Chervonenkis learning theory.

All of these theories involve some kind of trade-off among the number of model parameters, the relative complexity of those parameters, and the success of the model on the training exemplars. By smoothing or penalizing the complexity of the predictive model, we get the model that, under each theory’s assumptions, is expected to do best on future predictions.
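Here is a toy sketch of that trade-off using BIC, one of the criteria named above: the penalty on the parameter count k competes against the log-likelihood of the fit on the training data. The data, the choice of polynomial models, and the noise level are illustrative assumptions:

```python
# Score polynomial models of increasing degree with BIC, which penalizes
# parameter count k against the log-likelihood of the fit on the exemplars.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * x + 0.5 + rng.normal(scale=0.1, size=x.size)  # true model is linear

def bic_for_degree(degree: int) -> float:
    coeffs = np.polyfit(x, y, degree)             # fit the training exemplars
    residuals = y - np.polyval(coeffs, x)
    n = x.size
    sigma2 = np.mean(residuals ** 2)
    log_likelihood = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = degree + 1                                # number of parameters
    return k * np.log(n) - 2 * log_likelihood     # smaller is better

for d in range(1, 6):
    print(d, round(bic_for_degree(d), 2))
# Degree 1 (the true, simpler model) typically wins even though higher-degree
# polynomials fit the training points more closely.
```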

But now we can see the problem. Everything is possible, and some things are sometimes more likely than others. But even the perceptions are loaded with risk that has to be modeled. Given our meta-models-upon-meta-models approach to functional decision making described in the last post, at some point we have to have a module that pulls the trigger in the face of radical uncertainty. It emerged through deterministic actions within the environment. It’s implemented on wetware that leads to deterministic firing of each peer or submodule. But the module itself can sometimes make a choice that is not fully fixed by the inputs and the current state, because there is an element of indeterminism associated with the model itself. That choice may be in error by other criteria, like social norms or even the best survival odds from a rational calculus, or it may defy the push and pull of other modules contending for attention.
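One way to picture such a trigger-pulling module is as a sampled, rather than strict, winner-take-all over the contending modules’ votes. The sketch below is only an illustration under that assumption (the action names, weights, and softmax temperature are all made up), not a claim about how wetware implements it:

```python
# Contending submodules vote with weights, and the final choice is sampled from
# a softmax over those votes rather than taken as a strict argmax, so near-ties
# fall into a fuzzy zone where either action can be selected.

import math
import random

def choose_action(votes: dict, temperature: float = 0.5) -> str:
    """Sample an action with probability proportional to exp(weight / temperature)."""
    actions = list(votes.keys())
    scaled = [math.exp(votes[a] / temperature) for a in actions]
    total = sum(scaled)
    r = random.random() * total
    cumulative = 0.0
    for action, weight in zip(actions, scaled):
        cumulative += weight
        if r <= cumulative:
            return action
    return actions[-1]

# Near-tied inputs from contending modules: the outcome is not fixed by them.
votes = {"act": 0.51, "refrain": 0.49}
print([choose_action(votes) for _ in range(10)])
```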

So a woman thinks about murdering someone who wronged her. She knows that murder is generally wrong except when directed by society, blah, blah. She knows that she might get caught. She knows all the pulls against murder, but she also has rage and a desire for revenge. This sounds premeditated: she tosses and turns for weeks trying to make a decision. The modules are all being weighed and reweighed against the evidence, against paths to alternative justice, against childhood moral impositions. In the end, though, she chooses to murder the person because the different weighted inputs and inter-modular influences amount to a toss-up within a fuzzy range of indeterminacy. The decision is, effectively, willed among the contending choices, and she is morally culpable by modern standards.
