Forever Uncanny

Quanta has a fair roundup of recent advances in deep learning. Most interesting is the recent performance on natural language understanding tests, which approaches or exceeds mean human performance. Inevitably, John Searle’s Chinese Room argument is brought up, though the author of the Quanta article suggests that inferring the Chinese translational rule book from the data itself is slightly different from the original thought experiment. In the Chinese Room there is a person who knows no Chinese but has a collection of translational reference books. She receives texts through a slot, dutifully looks up the translation of the text, and passes out the result. “Is this intelligence?” is the question, and it serves as a challenge to the Strong AI hypothesis. With statistical machine translation methods (and their alternative mechanistic implementation, deep learning), the rule books have been inferred by looking at translated texts (“parallel” texts, as we say in the field). A large enough corpus of parallel texts yields greater coverage of translation variants, as well as some inference of pragmatic issues and corner cases in translation.
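To make the inferred rule book idea concrete, here is a minimal sketch, with an invented toy corpus, of inducing word correspondences from parallel sentence pairs by counting co-occurrences. It is nothing like the statistical or deep learning systems the article describes, but it shows the flavor of learning translation rules from data rather than being handed them:

```python
from collections import Counter, defaultdict

# Hypothetical toy parallel corpus of (English, French) sentence pairs.
parallel = [
    ("the cat sleeps", "le chat dort"),
    ("the dog sleeps", "le chien dort"),
    ("the cat eats", "le chat mange"),
]

src_counts, tgt_counts = Counter(), Counter()
cooc = defaultdict(Counter)
for src, tgt in parallel:
    src_words, tgt_words = src.split(), tgt.split()
    src_counts.update(src_words)
    tgt_counts.update(tgt_words)
    for s in src_words:
        for t in tgt_words:
            cooc[s][t] += 1

# Score candidate pairs with a Dice coefficient so that frequent function
# words ("the", "le") do not swamp the content words.
def dice(s, t):
    return 2.0 * cooc[s][t] / (src_counts[s] + tgt_counts[t])

rule_book = {s: max(cooc[s], key=lambda t: dice(s, t)) for s in cooc}
print(rule_book)
# -> {'the': 'le', 'cat': 'chat', 'sleeps': 'dort', 'dog': 'chien', 'eats': 'mange'}
```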

As a practical matter, it should be noted that modern professional translators often use translation memory systems that contain idiomatic—or just challenging—phrases that they can reference when translating new texts. The understanding resides in the original translator’s head, we suppose, and in the correct application of the stored phrase to the new text, checked for applicability according to, well, whatever other criteria the translator brings to bear on the task.

In the General Language Understanding Evaluation (GLUE) tests described in the Quanta article, the systems are inferring how to answer wh-style queries (who, what, where, when, and how) as well as identifying similar texts.… Read the rest
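For contrast with those systems, the similarity half of such tasks has a trivially simple baseline: bag-of-words cosine similarity. The sketch below is only that baseline, with made-up example sentences, and nothing like the deep models under evaluation:

```python
from collections import Counter
from math import sqrt

def cosine_bow(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

print(cosine_bow("who wrote the report", "who was the author of the report"))  # higher
print(cosine_bow("who wrote the report", "the cat sat on the mat"))            # lower
```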

Bereitschaftspotential and the Rehabilitation of Free Will

The question of whether we, as people, have free will or not is both abstract and occasionally deeply relevant. We certainly act as if we have something like libertarian free will, and we have built entire systems of justice around this idea, where people are responsible for choices they make that result in harms to others. But that may be somewhat illusory, for at least two reasons. First, if we take a hard deterministic view of the universe as a clockwork-like collection of physical interactions, our wills are just a mindless outcome of a calculation of sorts, driven by a wetware calculator with a state completely determined by molecular history. Second, there has been, until very recently, some experimental evidence that our decision-making occurs before we achieve a conscious realization of the decision itself.

But this latter claim appears to be without merit, as reported in this Atlantic article. Instead, what was previously believed to be signals of brain activity that were related to choice (Bereitschaftspotential) may just be associated with general waves of neural activity. The new experimental evidence puts the timing of action in line with conscious awareness of the decision. More experimental work is needed—as always—but the tentative result suggests a more tightly coupled pairing of conscious awareness with decision making.

Indeed, this newer experimental result gets closer to my suggested model of how modular systems, combined with perceptual and environmental uncertainty, can produce what is effectively free will (or at least a functional model for a compatibilist position). Jettisoning the Chaitin-Kolmogorov complexity part of that argument and just focusing on the minimal requirements for decision making in the face of uncertainty, we know we need a thresholding apparatus that fires various responses given a multivariate statistical topology.… Read the rest
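As a minimal sketch of what such a thresholding apparatus might look like (the weights, thresholds, and noise level below are all invented), each candidate response fires when its weighted sum over a noisy percept crosses a threshold, so the same stimulus need not produce the same choice every time:

```python
import random

# Invented weights and thresholds for a toy "decision module": each candidate
# response has a weight vector over three evidence channels and a firing threshold.
RESPONSES = {
    "approach": ([0.9, 0.2, -0.4], 0.5),
    "avoid":    ([-0.7, 0.1, 0.8], 0.4),
}

def decide(stimulus, noise=0.3):
    # Fallible perception: the stimulus arrives corrupted by noise.
    percept = [x + random.gauss(0.0, noise) for x in stimulus]
    scores = {name: sum(w * p for w, p in zip(weights, percept))
              for name, (weights, _) in RESPONSES.items()}
    fired = [name for name, score in scores.items() if score >= RESPONSES[name][1]]
    return max(fired, key=scores.get) if fired else "defer"

# The same stimulus can yield different responses because the percept is uncertain.
print([decide([0.6, 0.1, 0.2]) for _ in range(5)])
```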

Bullshit, Metaphors, and Political Precision

Given this natural condition of uncertainty in the meaning of words, and their critical role in communication, to say the least, we can certainly expect that as we move away from the sciences towards other areas of human endeavor, we have even greater vagueness in trying to express complex ideas. Politics is an easy example. America’s current president is a babbling bullshitter, to use the explanatory framework of the essay On Bullshit, and he is easy to characterize as an idiot, as when he conflates Western liberalism with something going on exclusively in modern California.

In this particular case, we have to track down what “liberal” means and meant at various times, then try to suss out how that meaning is working today. At one time, the term was simply expressive of freedom with minimal government interference. Libertarians still carry a version of that meaning forward, but liberalism also came to mean something akin to a political focus on government spending to right perceived economic and social disparities (to achieve “freedom from want and despair,” via FDR). And then it began to be used as a pejorative related to that same focus.

As linguist John McWhorter points out, abstract ideas—and perhaps especially political ones—are so freighted with their pragmatic and historical background that the best we can say is that we are actively working out what a given term means. McWhorter suggests that older terms like “socialist” are impossible to put to work effectively; a newer term like “progressive” is more desirable because it carries less baggage.

An even stronger case is made by George Lakoff, who claims that central metaphors, looking something like Freudian abstractions, govern political perspectives.… Read the rest

A Most Porous Barrier

Whenever a scientific—or even a quasi-scientific—theory is invented, there are those who take an expansive view of it, broadly applying it to other areas of thought. This is perhaps inherent in the metaphorical nature of these kinds of thought patterns. Thus we see Darwinian theory influenced by Adam Smith’s “invisible hand” of economic optimization. Then we get Spencer’s Social Darwinism arising from Darwin. And E.O. Wilson’s sociobiology leads to evolutionary psychology, immediately following an activist’s pitcher of ice water.

The is-ought barrier tends towards porousness, allowing the smuggling of insights and metaphors lifted from the natural world as explanatory footwork for our complex social and political interactions. After all, we are as natural as we are social. But at the same time, we know that science is best when it is tentative and subject to infernal levels of revision and reconsideration. Decisions about social policy derived from science, and especially those that have significant human impact, should be cushioned by a tentative level of trust as well.

E.O. Wilson’s most recent book, Genesis: The Deep Origin of Societies, is a continuation of his late conversion to what is now referred to as “multi-level selection,” where natural selection is believed to operate at multiple levels, from genes to whole societies. It remains a controversial theory that has been under development and under siege since Darwin’s time, when the mechanism of inheritance was not understood.

The book is brief and does not provide much, if any, new material beyond The Social Conquest of Earth, which was significantly denser and contained notes derived from his controversial 2010 Nature paper arguing that kin selection had been overstated as a gene-level explanation of altruism and sacrifice within eusocial species.… Read the rest

Doubt at the Limit

I seem to have a central theme running through many of the last posts, related to the demarcation between science and non-science and to the limits of what rationality allows where we care about such limits. This is not purely abstract, though, as we can see in today’s anti-science movements, whether anti-vaccination activists, flat Earthers, climate change deniers, or intelligent design proponents. Just today, Ars Technica reports on the first of these. The speakers at the event, held in close proximity to a massive measles outbreak, ranged from a “disgraced former gastroenterologist” to an angry rabbi. Efforts to counter them, in the form of a letter from a county supervisor and another rabbi, may have had an impact on the broader community, but probably not on the die-hards of the movement.

Meanwhile, Lee McIntyre at Boston University suggests what we are missing in these engagements in a great piece in Newsweek. McIntyre applies the same argument to flat Earthers that I have applied to climate change deniers: what we need to reinforce is the value and, importantly, the limits inherent in scientific reasoning. Insisting, for example, that climate change science is 100% squared away just fuels the micro-circuits in the so-called meta-cognitive strategy regions of climate change deniers’ brains. Instead, McIntyre recommends that science engage the public in thinking about the limits of science, showing how doubt and process lead us to usable conclusions about topics that are suddenly fashionably in dispute.

No one knows if this approach is superior to alternatives like the letter-writing by authorities in response to the vaccination seminar, and it certainly seems longer term in that it needs to build against entrenched ideas and opinions, but it at least argues for a new methodology.… Read the rest

Two Points on Penrose, and One on Motivated Reasoning

Sir Roger Penrose is, without doubt, one of the most interesting polymaths of recent history. Even where I find his ideas fantastical, they are most definitely worth reading and understanding. Sean Carroll’s Mindscape podcast interview with Penrose from early January of this year is a treat.

I’ve previously discussed the Penrose-Hameroff conjectures concerning wave function collapse and their implication of quantum operations in the microtubule structure of the brain. I also used the conjecture in a short story. But the core driver for Penrose’s original conjecture, namely that algorithmic processes can’t explain human consciousness, has always been a claim in search of support. Equally difficult is pushing consciousness into the sphere of quantum phenomena that tend to show random, rather than directed, behavior. Randomness doesn’t clearly relate to the “hard problem” of consciousness, which is about the experience of being conscious.

But take the idea that because mathematicians can see the truth of statements that Gödel incompleteness blocks from formal proof, our brains must be different from Turing machines or collections of them. Our brains are likely messy and not theorem-proving machines per se, despite operating according to logico-causal processes. Indeed, throw in an active analog to biological evolution based on variation-and-retention of ideas and insights that might actually have a bit of pseudo-randomness associated with it, and there is no reason to doubt that we are capable of the kind of system transcendence that Penrose is looking for.
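For what it is worth, a variation-and-retention loop is a simple mechanism to write down. The sketch below is purely illustrative: the “ideas” are bit strings and the scoring criterion is invented, but it shows pseudo-random variation feeding a retention rule:

```python
import random

def vary(idea, rng):
    """Variation: flip one randomly chosen bit of the candidate 'idea'."""
    i = rng.randrange(len(idea))
    return idea[:i] + ('1' if idea[i] == '0' else '0') + idea[i + 1:]

def retain(score, idea, steps=200, seed=1):
    """Retention: keep a variant only when it scores at least as well."""
    rng = random.Random(seed)
    for _ in range(steps):
        candidate = vary(idea, rng)
        if score(candidate) >= score(idea):
            idea = candidate
    return idea

# Invented toy criterion: prefer "ideas" with more 1s in them.
print(retain(lambda s: s.count('1'), '0' * 16))
```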

Note that this doesn’t in any way impact the other horn of Penrose-Hameroff concerning the measurement problem in quantum theory, but there is no reason to suspect that quantum collapse is necessary for consciousness. It might flow the other way, though, and Penrose has created the Penrose Institute to look experimentally for evidence about these effects.… Read the rest

Narcissism, Nonsense, and Pseudo-Science

I recently began posting pictures of our home base in Sedona to Instagram (check it out in the column to the right). It’s been a strange trip. If you are not familiar with how Instagram works, it’s fairly simple: you post pictures, and other Instagram members can “follow” you and you can follow them, meaning that you see their pictures and can tap a little heart icon to show you like them. My goal, if I have one, is just that I like the Northern Arizona mountains and deserts and like thinking about the composition of photographs. I’m also interested in the gear and techniques involved in taking and processing pictures. I did, however, market my own books on the platform—briefly, and with apologies.

But Instagram, like Facebook, is a world unto itself.

Shortly after starting on the platform, I received follows from blond Russian beauties who appear to be marketing online sex services. I have received odd follows from variations on the same name who have no content on their pages and who disappear after a day or two if I don’t follow them back. Though I don’t have any definitive evidence, I suspect these might be bots. I have received follows from people who seemed to be marketing themselves as, well, people—including one who bait-and-switched with good landscape photography. They are typically attractive young people, often showing off their six-pack abs, and trying to build a following with the goal of making money off of Instagram. Maybe they plan to show off products or reference them, thus becoming “influencers” in the lingo of social media. Maybe they are trying to fund their travel experiences by reaping revenue from advertisers that co-exist with their popularity in their image feed.… Read the rest

Theoretical Reorganization

Sean Carroll of Caltech takes on the philosophy of science in his paper, Beyond Falsifiability: Normal Science in a Multiverse, as part of a larger conversation on modern theoretical physics and experimental methods. Carroll breaks down the problems of Popper’s falsification criterion and arrives at a more pedestrian Bayesian formulation for how to view science. Theories arise and get their priors amplified or deflated; that prior support changes, often for Carroll, for reasons of coherence with other theories and considerations; and, in the best case, the posterior support improves with better experimental data.
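In its barest form, that Bayesian picture is just a posterior update over competing theories. A minimal sketch with invented priors and likelihoods:

```python
# Competing theories start with priors (shaped partly by coherence with the
# rest of physics) and get re-weighted by how well they predict new data.
priors = {"theory_A": 0.6, "theory_B": 0.4}
likelihoods = {"theory_A": 0.05, "theory_B": 0.20}   # P(new data | theory), invented

evidence = sum(priors[t] * likelihoods[t] for t in priors)
posteriors = {t: priors[t] * likelihoods[t] / evidence for t in priors}
print(posteriors)  # theory_B gains support despite its lower prior
```

Coherence with the rest of physics enters through the priors; the experimental data enter through the likelihoods.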

Continuing with the previous posts’ work on expanding Bayes via AIT considerations, the non-continuous changes to a group of scientific theories that arrive with new theories or data require some better model than just adjusting priors. How exactly does coherence play a part in theory formation? If we treat each theory as a binary string that encodes a Turing machine, then the best theory, inductively, is the shortest machine that generates the data. But we know that there is no algorithm that can compute that shortest machine, so there needs to be a heuristic search through the space of machines to try to locate a minimal one. Meanwhile, the data may be varying, and the machine may need to incorporate other machines that help improve the coverage of the original machine or are driven by other factors, as Carroll points out:

We use our taste, lessons from experience, and what we know about the rest of physics to help guide us in hopefully productive directions.

The search algorithm is clearly not just brute force in examining every micro variation in the consequences of changing bits in the machine. Instead, large reusable blocks of subroutines get reparameterized or reused with variation.… Read the rest
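Even a toy setting makes the point visible. The sketch below searches a deliberately tiny program space, descriptions of the form “repeat this block n times,” for the cheapest description that regenerates a string. It is only a stand-in for the real, uncomputable problem, but it shows reusable blocks doing the compressive work:

```python
def shortest_block_program(data: str):
    """Bounded search over a tiny 'program' space: (block, n) meaning the block
    repeated n times. Returns the cheapest program that regenerates the data,
    a stand-in for the (uncomputable) search for the minimal machine."""
    best = (data, 1)                       # trivial program: print the data verbatim
    best_cost = len(data) + 1
    for size in range(1, len(data) // 2 + 1):
        block, n = data[:size], len(data) // size
        if len(data) % size == 0 and block * n == data:
            cost = size + n.bit_length()   # crude description length
            if cost < best_cost:
                best, best_cost = (block, n), cost
    return best, best_cost

print(shortest_block_program("01" * 16))            # finds the reusable block "01"
print(shortest_block_program("1101001000110101"))   # irregular: little compression
```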

Free Will and Algorithmic Information Theory (Part II)


So we get some mild form of source determinism out of Algorithmic Information Complexity (AIC), but we haven’t addressed the form of free will that deals with moral culpability at all. That free will requires that we, as moral agents, are capable of making choices that have moral consequences. Another way of saying it is that given the same circumstances we could have done otherwise. After all, all we have is a series of if/then statements that must be implemented in wetware and they still respond to known stimuli in deterministic ways. Just responding in model-predictable ways to new stimuli doesn’t amount directly to making choices.

Let’s expand the problem a bit, however. Instead of a lock-and-key recognition of integer “foodstuffs” we have uncertain patterns of foodstuffs and fallible recognition systems. Suddenly we have a probability problem with P(food|n) [or even P(food|q(n)) where q is some perception function] governed by Bayesian statistics. Clearly we expect evolution to optimize towards better models, though we know that all kinds of historical and physical contingencies may derail perfect optimization. Still, if we did have perfect optimization, we know what that would look like for certain types of statistical patterns.
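Here is a minimal sketch of that probability problem. The prior, the perception noise, and the likelihood model are all invented for illustration; the point is just that P(food|q(n)) falls out of Bayes’ rule applied to a fallible percept:

```python
import random
from math import exp, pi, sqrt

P_FOOD = 0.3                      # invented prior that an encountered item is food

def q(is_food, noise=1.0):
    """Fallible perception q(n): food items center on 2.0, non-food on 0.0."""
    return (2.0 if is_food else 0.0) + random.gauss(0.0, noise)

def gaussian(x, mean, sd):
    return exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * sqrt(2 * pi))

def p_food_given_percept(x, noise=1.0):
    """Bayes' rule: P(food | q(n)) is proportional to P(q(n) | food) * P(food)."""
    num = gaussian(x, 2.0, noise) * P_FOOD
    return num / (num + gaussian(x, 0.0, noise) * (1 - P_FOOD))

percept = q(is_food=True)
print(round(percept, 2), round(p_food_given_percept(percept), 2))
```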

What is an optimal induction machine? AIC and variants have been used to define that machine. First, we have Solomonoff induction from around 1960. But we also have Jorma Rissanen’s Minimum Description Length (MDL) theory from 1978, which casts the problem more in terms of continuous distributions. Variants are available, too, from Minimum Message Length to Akaike’s Information Criterion (AIC, confusingly again) and the Bayesian Information Criterion (BIC), and on to Structural Risk Minimization via Vapnik-Chervonenkis learning theory.

All of these theories involve some kind of trade-off between the number and relative complexity of model parameters and the success of the model on the training exemplars.… Read the rest
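That trade-off is easy to see with one of the criteria named above. The sketch below scores polynomial fits of increasing degree with BIC on synthetic data; fit improves with degree, but the parameter penalty eventually wins:

```python
import numpy as np

# Synthetic data: a quadratic plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0, 0.05, x.size)

for degree in range(1, 7):
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)
    k = degree + 1                              # number of model parameters
    bic = x.size * np.log(rss / x.size) + k * np.log(x.size)
    print(degree, round(bic, 1))                # typically minimized near degree 2
```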

Free Will and Algorithmic Information Theory

I was recently looking for examples of applications of algorithmic information theory, also commonly called algorithmic information complexity (AIC). After all, for a theory to be sound is one thing, but when it is sound and valuable it moves to another level. So, first, let’s review the broad outline of AIC. AIC begins with the problem of randomness, specifically random strings of 0s and 1s. We can readily see that given any sort of encoding in any base, strings of characters can be reduced to a binary sequence. Likewise integers.
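The reduction to bits is as mundane as it sounds; for instance:

```python
# Any characters (or integers) reduce to a bit sequence under some encoding.
text = "AIC"
bits = ''.join(format(b, '08b') for b in text.encode('utf-8'))
print(bits)             # 010000010100100101000011

n = 1960
print(format(n, 'b'))   # 11110101000
```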

Now, AIC states that there are often many Turing machines that could generate a given string and, since we can represent those machines also as a bit sequence, there is at least one machine that has the shortest bit sequence while still producing the target string. In fact, if the shortest machine is as long as the string itself or a bit longer (given some machine encoding requirements), then the string is said to be AIC random. In other words, no compression of the string is possible.
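We cannot compute that shortest machine, but an off-the-shelf compressor gives a crude, computable stand-in for the idea that random means incompressible:

```python
import os, zlib

# Compressed length as a rough proxy for the length of the shortest generator.
patterned = b"01" * 4096            # highly regular 8192-byte string
random_ish = os.urandom(8192)       # about as close to AIC-random as we can get

print(len(zlib.compress(patterned)), "bytes for the patterned string")
print(len(zlib.compress(random_ish)), "bytes for the random one")
# The patterned string compresses dramatically; the random one barely at all.
```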

Moreover, we can generalize this generator machine idea to claim that given some set of strings that represent the data of a given phenomenon (let’s say natural occurrences), the smallest generator machine that covers all the data is a “theoretical model” of the data and the underlying phenomenon. An interesting outcome of this theory is that it can be shown that there is, in fact, no algorithm (or meta-machine) that can find the smallest generator for any given sequence. This is related to Turing’s halting problem and, through Chaitin’s work, to Gödel incompleteness.

In terms of applications, Gregory Chaitin, who is one of the originators of the core ideas of AIC, has proposed that the theory sheds light on questions of meta-mathematics and, specifically, that it demonstrates that mathematics is a quasi-empirical pursuit capable of producing new methods rather than being idealistically derived from analytic first principles.… Read the rest