Randomness and Meaning

The impossibility of the Chinese Room has implications across the board for understanding what meaning means. Mark Walker’s paper “On the Intertranslatability of all Natural Languages” describes how the translation of words and phrases may be achieved:

  1. Through a simple correspondence scheme (word for word)
  2. Through “syntactic” expansion of the languages to accommodate concepts that have no obvious equivalence (“optometrist” => “doctor for eye problems”, etc.)
  3. Through incorporation of foreign words and phrases as “loan words”
  4. Through “semantic” expansion where the foreign word is defined through its coherence within a larger knowledge network.

An example for (4) is the word “lepton” where many languages do not have a corresponding concept and, in fact, the concept is dependent on a bulwark of advanced concepts from particle physics. There may be no way to create a superposition of the meanings of other words using (2) to adequately handle “lepton.”

These problems arise again in trying to understand how children acquire meaning while learning a language. As Walker points out, learning a second language must involve the same kinds of steps as learning translations, so any simple correspondence theory has to be supplemented.

So how do we make adequate judgments about meanings and so rapidly learn words, often initially with a coarse granularity but later with increasingly sharp levels of focus? What procedure is required for expanding correspondence theories to operate in larger networks? Methods like Latent Semantic Analysis and Random Indexing show how this can be achieved in ways that are illuminating about human cognition. In each case, the methods provide insights into how relatively simple transformations of terms and their occurrence contexts can be viewed as providing a form of “triangulation” about the meaning of words.… Read the rest
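As an illustration of the flavor of these methods, here is a minimal sketch of Random Indexing on a toy corpus. Everything in it (the corpus, the dimensionality, the window size, the function names) is illustrative rather than taken from any particular implementation: each word gets a sparse random index vector, and a word’s context vector accumulates the index vectors of its neighbors, so words that occur in similar contexts end up pointing in similar directions.

```python
import numpy as np

def random_index_vectors(vocab, dim=300, nonzero=8, seed=0):
    """Assign each word a sparse random index vector of +1/-1 entries."""
    rng = np.random.default_rng(seed)
    index = {}
    for word in vocab:
        v = np.zeros(dim)
        positions = rng.choice(dim, size=nonzero, replace=False)
        v[positions] = rng.choice([-1.0, 1.0], size=nonzero)
        index[word] = v
    return index

def context_vectors(sentences, index, window=2):
    """Sum, for each word, the index vectors of the words co-occurring
    within a sliding window; this running sum is the word's meaning estimate."""
    dim = len(next(iter(index.values())))
    context = {w: np.zeros(dim) for w in index}
    for tokens in sentences:
        for i, w in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    context[w] += index[tokens[j]]
    return context

def similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy corpus: "doctor" and "optometrist" share contexts; "ball" mostly does not.
sentences = [s.split() for s in [
    "the doctor examined the patient eyes",
    "the optometrist examined the patient eyes",
    "the dog chased the ball",
]]
index = random_index_vectors({w for s in sentences for w in s})
ctx = context_vectors(sentences, index)
print(similarity(ctx["doctor"], ctx["optometrist"]))  # high: nearly identical contexts
print(similarity(ctx["doctor"], ctx["ball"]))         # lower: little shared context beyond "the"
```

The “triangulation” is visible in the geometry: a word’s position is fixed only relative to the company it keeps.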

On the Soul-Eyes of Polar Bears

I sometimes reference a computational linguistics factoid that appears to be now lost in the mists of early DoD Tipster program research: Chinese linguists agree on the segmentation of texts into words only about 80% of the time. We can find some qualitative agreement on the problematic nature of the task, but the 80% figure is widely smeared out among the references that I can now find. It should be no real surprise, though, because even English with white-space tokenization resists easy characterization of words versus phrases: “New York” and “New York City” are almost words in themselves, though under white-space tokenization they are also phrases. Phrases lift out with common and distinct usage, however, and become more than the sum of their parts; it would be ridiculously noisy to match a search for “York” against “New York” because no one in the modern world attaches semantic significance to the “York” part of the phrase. It exists as a whole, and the nature of the parts has dissolved into this holism.
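The way phrases like “New York” lift out of usage statistics can be made concrete with a little counting. Here is a minimal sketch (toy text, illustrative threshold, nothing drawn from any particular system) that scores adjacent word pairs by pointwise mutual information, a standard collocation measure: pairs that co-occur far more often than their parts would independently predict float to the top.

```python
import math
from collections import Counter

def pmi_bigrams(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information,
    log( p(x,y) / (p(x) * p(y)) ): high scores mark pairs that co-occur
    far more often than their parts would predict."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    scores = {}
    for (x, y), c in bigrams.items():
        if c < min_count:
            continue
        p_xy = c / (n - 1)
        p_x, p_y = unigrams[x] / n, unigrams[y] / n
        scores[(x, y)] = math.log(p_xy / (p_x * p_y))
    return sorted(scores.items(), key=lambda kv: -kv[1])

text = ("new york is large . i flew to new york . "
        "york alone rarely appears . the city of new york votes .")
print(pmi_bigrams(text.split()))  # ("new", "york") is the only pair that survives
```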

John Searle’s Chinese Room argument came up again today. My son was waxing philosophical, as he does, in a discussion about mathematics and order, and suggested a poverty in our considering the world as purely and completely natural. He meant “natural” in the sense of materialism and naturalism: that there are no mystical or magical elements to the world in a metaphysical sense. I argued that there may nonetheless be something that is different and indescribable by simple naturalistic calculi: there may be qualia. It led, in turn, to a qualification of what is unique about the human experience and hence on to Searle’s Chinese Room.

And what happens in the Chinese Room?… Read the rest

Teleology, Chapter 5

Harry spent most of that summer involved in the Santa Fe Sangre de Cristo Church, first with the church summer camp, then with the youth group. He seemed happy and spent the evenings text messaging with his new friends. I was jealous in a way, but refused to let it show too much. Thursdays he was picked up by the church van and went to watch movies in a recreation center somewhere. I looked out one afternoon as the van arrived and could see Sarah’s bright hair shining through the high back window of the van.

Mom explained that they seemed to be evangelical, meaning that they liked to bring as many new worshippers into the religion as possible through outreach and activities. Harry didn’t talk much about his experiences. He was too much in the thick of things to be concerned with my opinions, I think, and snide comments were brushed aside with a beaming smile and a wave. “You just don’t understand,” Harry would dismissively tell me.

I was reading so much that Mom would often demand that I get out of the house on weekend evenings after she had encountered me splayed on the couch straight through lunch and into the shifting evening sunlight passing through the high windows of our thick-walled adobe. I would walk then, often for hours, snaking up the arroyos towards the mountains, then wend my way back down, traipsing through the thick sand until it was past dinner time.

It was during this time period that I read cyberpunk authors and became intrigued with the idea that someday, one day, perhaps computing machines would “wake up” and start to think on their own.… Read the rest

On the Non-Simulation of Human Intelligence

There is a curious dilemma that pervades much machine learning research. The solutions that we are trying to devise are supposed to minimize behavioral error by formulating the best possible model (or collection of competing models). This is also the assumption of evolutionary optimization, whether natural or artificial: optimality is the key to efficiently outcompeting alternative structures, alternative alleles, and alternative conceptual models. The dilemma is whether such optimality is applicable to the notoriously error-prone, conceptually flexible, and inefficient reasoning of people. In other words, is machine learning at all like human learning?

I came across a paper called “Multi-Armed Bandit Bayesian Decision Making” while trying to understand what Ted Dunning is planning to talk about at the Big Data Science Meetup at SGI in Fremont, CA a week from Saturday (I’ll be talking as well). The paper contains a remarkable admission concerning this point:

Human behaviour is after all heavily influenced by emotions, values, culture and genetics; as agents operating in a decentralised system humans are notoriously bad at coordination. It is this fact that motivates us to develop systems that do coordinate well and that operate outside the realms of emotional biasing. We use Bayesian Probability Theory to build these systems specifically because we regard it as common sense expressed mathematically, or rather ‘the right thing to do’.

The authors continue on to suggest that therefore such systems should instead be seen as corrective assistants for the limitations of human cognitive processes! Machines can put the rational back into reasoned decision-making. But that is really not what machine learning is used for today. Instead, machine learning is used where human decision-making processes are unavailable due to the physical limitations of including humans “in the loop,” or the scale of the data involved, or the tediousness of the tasks at hand.… Read the rest
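For concreteness, here is a minimal sketch of the Bayesian bandit idea quoted above, using Thompson sampling with Beta posteriors. This is a standard textbook formulation rather than the paper’s specific method, and the payoff rates and round count are made up for illustration: the agent keeps a posterior over each arm’s payoff rate, samples from each posterior, and pulls whichever arm’s sample wins, so exploration falls out of posterior uncertainty rather than emotional bias.

```python
import random

def thompson_bandit(true_rates, rounds=5000, seed=1):
    """Bayesian bandit via Thompson sampling: maintain a Beta(a, b) posterior
    over each arm's payoff rate, sample from each posterior each round, and
    pull the arm whose sample is largest."""
    random.seed(seed)
    arms = len(true_rates)
    alpha = [1.0] * arms   # successes + 1 (uniform prior)
    beta = [1.0] * arms    # failures + 1
    total_reward = 0
    for _ in range(rounds):
        samples = [random.betavariate(alpha[i], beta[i]) for i in range(arms)]
        arm = max(range(arms), key=lambda i: samples[i])
        reward = 1 if random.random() < true_rates[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total_reward += reward
    return total_reward, alpha, beta

reward, alpha, beta = thompson_bandit([0.2, 0.5, 0.65])
print(reward)                                      # approaches 0.65 * 5000 as the best arm dominates
print([a / (a + b) for a, b in zip(alpha, beta)])  # posterior mean payoff estimates
```

With these made-up rates, nearly all pulls should migrate to the third arm within a few hundred rounds, which is the “coordination without emotional biasing” the authors have in mind.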

Eusociality, Errors, and Behavioral Plasticity

I encountered an error in E.O. Wilson’s The Social Conquest of Earth where Wilson intended to assert an alternative to “kin selection” but instead repeated “multilevel selection,” which is precisely what he wanted to draw a distinction with. I am sympathetic, however, if for no other reason than I keep finding errors and issues with my own books and papers.

The critical technical discussion from Nature concerning the topic is available here. As a technical discussion, it is fraught with details like how halictid bees appear to live socially but are in fact solitary animals that co-exist in tunnel arrangements.

Despite the focus on “spring-loaded traits” as determiners for haplodiploid animals like bees and wasps, the problem of big-brained behavioral plasticity keeps coming up in Wilson’s book. Humanity is a pinnacle because of taming fire, because of the relative levels of energy available in animal flesh versus plant matter, and because of our ability to outrun prey over long distances (yes, our identity emerges from marathon running). But these are solutions that correlate with the rapid growth of our craniums.

So if behavioral plasticity is so very central to who we are, we are faced with an awfully complex problem in trying to simulate that behavior. We can expect that there must be phalanxes of genes involved in setting our developmental path (our nature and the substrate for our nurture). We should, indeed, expect that almost no cognitive capacity is governed by a small set of genes, and that all the relevant genes work in networks through polygeny, epistasis, and related effects (pleiotropy). And we can expect no easy answers as a result, except to assert that AI is exactly as hard as we should have expected, and progress will be inevitably slow in understanding the mind, the brain, and the way we interact.… Read the rest

From Ethics to Hypercomputation

Toby Ord of Giving What We Can has other interests, including ones that connect back to Solomonoff inference and algorithmic information theory. Specifically, Ord worked earlier on topics related to hypercomputation or, more simply put, the notion that there may be computational systems that exceed the capabilities of Turing Machines.

Turing Machines are abstract computers that can compute logical functions, but the question that has dominated theoretical computer science is what is computable and what is incomputable. The Kolmogorov Complexity of a string is the length of the shortest program that produces the string given a certain computational specification (a universal machine), and that complexity is itself incomputable. Yet a compact representation is a minimalist model that can, in turn, lead to optimal future prediction of the underlying generator.
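Although the true Kolmogorov Complexity of a string cannot be computed, any off-the-shelf compressor gives a computable upper bound on it, and that bound already captures the intuition that regular strings have short descriptions. A minimal sketch (the choice of zlib and the toy strings are just illustrations):

```python
import random
import zlib

def compressed_length(s: str) -> int:
    """Bytes in the zlib-compressed encoding of s: a crude but computable
    upper bound standing in for the string's Kolmogorov Complexity."""
    return len(zlib.compress(s.encode("utf-8"), 9))

random.seed(0)
structured = "ab" * 500                                    # highly regular
noisy = "".join(random.choice("ab") for _ in range(1000))  # no pattern to exploit

print(len(structured), compressed_length(structured))  # compresses heavily
print(len(noisy), compressed_length(noisy))            # compresses far less
```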

Wouldn’t it be astonishing if there were, in fact, computational systems that exceeded the limits of computability? That’s what Ord’s work set out to investigate, though there have been detractors.… Read the rest

Radical Triangulation

Donald Davidson argued that descriptive theories of semantics suffered from untenable complications that could, in turn, be solved by a holistic theory of meaning. Holism, in this sense, arises from the interdependency of words and phrases within a complex linguistic interchange. He proposed “triangulation” as a solution, where we zero in on a tentatively held belief about a word based on other beliefs about oneself, about others, and about the world we think we know.

This seems disarmingly obvious, but it is merely the starting point for the hard work of specifying what mechanisms and steps are involved in fixing the meaning of words through triangulation. There are certainly some predispositions that are innate and fit nicely with triangulation. These are subsumed under the Principle of Charity and even the notion of the Intentional Stance in how we regard others like us.

Fixing meaning via model-making has some curious results. The language used to discuss aesthetics and art tends to borrow from other fields (“The narrative of the painting,” “The functional grammar of the architecture.”) Religious and spiritual terminology often has extremely porous models: I recently listened to Episcopalians discuss the meaning of “grace” for almost an hour with great glee but almost no progress; it was the belief that they were discussing something of ineffable greatness that was moving to them. Even seemingly simple scientific ideas become elaborately complex for both children and adults: we begin with atoms as billiard balls that mutate into mini solar systems that become vibrating clouds of probabilistic wave-particles around groups of properties in energetic suspension by virtual particle exchange.

Can we apply more formal models to the task of staking out this method of triangulation?… Read the rest

Learning around the Non Sequiturs

If Solomonoff Induction and its conceptual neighbors have not yet found application in enhancing human reasoning, there are definitely areas where they have potential value.  Automatic, unsupervised learning of sequential patterns is an intriguing area of application. It also fits closely with the sequence inferencing problem that is at the heart of algorithmic information theory.

Pragmatically, one area where this kind of system might be useful is the problem of how children learn the interchangeability of words, which is the basic operation of grammaticality. Given a sequence of words or symbols, what sort of information is available for figuring out the grammatical groupings? Not much beyond memories of repetitions, often inferred implicitly.

Could we apply some variant of Solomonoff Induction at this point? Recall that we want to find the most compact explanation for the observed symbol stream. Recall also that the form of the explanation is a computer program of some sort that consists of logical functions. It turns out that creating a program that, for every possible sequence, finds the absolutely most compact program is uncomputable. “Uncomputable” (or incomputable) here is a precise mathematical result: no algorithm can carry out the search in general, because deciding whether each candidate program ever halts and reproduces the sequence runs into the halting problem. Being uncomputable is not a death sentence, however. We can come up with approximate methods that pursue the same goal, because any method that incrementally compresses the explanatory program will get closer to the hypothetical best program.

Sequitur by Nevill-Manning and Witten is an example of a procedure that approximates Algorithmic Information Theory optimization for string sequences.… Read the rest
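Here is a minimal sketch of that flavor of approximation. It is closer to the offline Re-Pair scheme than to Sequitur proper (which works online, maintaining digram uniqueness and rule utility as the sequence streams in), but it makes the same move: repeated structure in the symbol stream gets folded into grammar rules, and what remains is a shorter description. The example sentence and rule names are purely illustrative.

```python
from collections import Counter

def repair_grammar(sequence):
    """Offline grammar induction in the spirit of Sequitur/Re-Pair:
    repeatedly replace the most frequent adjacent pair of symbols with a
    fresh nonterminal until no pair repeats. Returns the compressed
    sequence and the rule table."""
    seq = list(sequence)
    rules = {}
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        nonterminal = f"R{next_id}"
        next_id += 1
        rules[nonterminal] = pair
        new_seq, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                new_seq.append(nonterminal)
                i += 2
            else:
                new_seq.append(seq[i])
                i += 1
        seq = new_seq
    return seq, rules

# Repetition is what makes a short "explanation" possible.
tokens = "the dog saw the dog see the dog".split()
compressed, rules = repair_grammar(tokens)
print(compressed)  # ['R0', 'saw', 'R0', 'see', 'R0']
print(rules)       # {'R0': ('the', 'dog')}
```

The grammaticality cue is implicit in the rule table: the chunks that recur, like (‘the’, ‘dog’), are exactly the candidates for interchangeable units.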

Solomonoff Induction, Truth, and Theism

LukeProg of CommonSenseAtheism fame created a bit of a row when he declared that Solomonoff Induction largely rules out theism, continuing on to expand on the theme:

If I want to pull somebody away from magical thinking, I don’t need to mention atheism. Instead, I teach them Kolmogorov complexity and Bayesian updating. I show them the many ways our minds trick us. I show them the detailed neuroscience of human decision-making. I show them that we can see (in the brain) a behavior being selected up to 10 seconds before a person is consciously aware of ‘making’ that decision. I explain timelessness.

There were several reasons for the CSA community to get riled up about these statements and they took on several different forms:

  • The focus on Solomonoff Induction/Kolmogorov Complexity is obscurantist in using radical technical terminology.
  • The author is ignoring deductive arguments that support theist claims.
  • The author has joined a cult.
  • Inductive claims based on Solomonoff/Kolmogorov are no different from Reasoning to the Best Explanation.

I think all of these critiques are partially valid (though I don’t think there are any good reasons for thinking theism is true), but the fourth one, which I contributed, was a personal realization for me. Though I have been fascinated with the topics related to Kolmogorov since the early 90s, I don’t think they are directly applicable to the topic of theism/atheism. Whether we are discussing the historical validity of Biblical claims or the logical consistency of extensions to notions of omnipotence or omniscience, I can’t think of a way that these highly mathematical concepts have direct application.

But what are we talking about? Solomonoff Induction, Kolmogorov Complexity, Minimum Description Length, Algorithmic Information Theory, and related ideas are formalizations of the idea of William of Occam (variously Ockham), known as Occam’s Razor, that given multiple explanations of a given phenomenon, one should prefer the simpler explanation.… Read the rest
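The Minimum Description Length version of the razor can be made concrete with a toy example. The sketch below uses synthetic data and a crude 32-bits-per-parameter cost (rather than the more careful (k/2)·log n refinement); it scores polynomial models by the bits needed to state the parameters plus the bits needed to encode the residuals, so the simpler model wins whenever its extra misfit costs less than the extra parameters would.

```python
import numpy as np

def description_length(x, y, degree, bits_per_param=32):
    """Two-part MDL score: bits to state the polynomial's parameters plus
    bits to encode the residuals under a Gaussian noise model
    (~ n/2 * log2(2*pi*e*variance))."""
    coeffs = np.polyfit(x, y, degree)
    residual_var = max(float(np.mean((y - np.polyval(coeffs, x)) ** 2)), 1e-12)
    model_bits = (degree + 1) * bits_per_param
    data_bits = 0.5 * len(x) * np.log2(2 * np.pi * np.e * residual_var)
    return model_bits + data_bits

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 60)
y = 1.5 * x ** 2 - x + rng.normal(scale=0.5, size=x.size)  # truly quadratic + noise

scores = {d: description_length(x, y, d) for d in range(1, 7)}
print(min(scores, key=scores.get))  # the quadratic (degree 2) model should win
```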

Simulated Experimental Morality

I’m deep in Steven Pinker’s The Better Angels of Our Nature: Why Violence Has Declined. It’s also only about the third book I’ve tried to read exclusively on the iPad, but I am finally getting used to the platform. The core thesis of Pinker’s book is something that I have been experimentally testing on people for several years: our moral faculties and decision-making are gradually improving. For Pinker, the thesis is built up elaborately from basic estimates of death rates due to war and homicide between non-state societies and state societies. It comes with an uncomfortable inversion of the nobility of the savage mind: primitive people had a lot to fight about and often did.

My first contact with the notion that morality is changing and improving was with Richard Dawkins’s observation in The God Delusion that most modern Westerners feel very uncomfortable with the fire bombing of Tokyo in World War II, the saturation bombing of Hanoi, nuclear attack against civilian populations, or treating people inhumanely based on race or ethnicity. Yet that wasn’t the case just decades ago. More moral drift can be seen in changing sentiments concerning the rights of gay people to marry. Experimentally, then, I would ask, over dinner or conversation, about simple moral trolley experiments and then move on to ask whether anyone would condone nuclear attack against civilian populations. There is always a first response of “no” to the latter, which reflects a gut moral sentiment, though a few people have agreed that it may be “permissible” (to use the language of these kinds of dilemmas) in response to a similar attack and when there may be “command and control assets” mixed into the attack area.… Read the rest