An Exit to a New Beginning

September 26, 2012 Mark Davis

I am thrilled to note that my business partner and I sold our Big Data analytics startup to a large corporation yesterday. I am currently unemployed but start anew doing the same work on Monday.

Thrilled is almost too tame a word. Ecstatic better describes the mood around here and our excitement at having triumphed in Silicon Valley. We’ve been swapping many war stories over the last 24 hours, including how we nearly shut down/rebooted at the start of 2012. But now it is over, and we have just a bit of cleanup work left to dissolve the existing business structures and a short vacation to attend to. …

Evolutionary Art and Architecture

September 10, 2012 Mark Davis

With every great scientific advance there has been a coordinated series of changes in the Zeitgeist. Evolutionary theory has impacted everything from sociology through to literature, but there are some very sophisticated efforts in the arts that deserve more attention.

John Frazer’s Evolutionary Architecture is a great example. Out of print but now available as downloadable PDFs, Evolutionary Architecture asks, without fully answering (how could it?), how evolution-like processes can contribute to the design of structures:

And then there is William Latham’s evolutionary art, dating to 1989, which explores form derived from generative functions:

And the art extends to functional virtual creatures:

…

Universal Artificial Social Intelligence

August 31, 2012 Mark Davis

Continuing to develop the idea that social reasoning adds to Hutter’s Universal Artificial Intelligence model, below is his basic layout for agents and environments:

A few definitions: The Agent (p) is a Turing machine that consists of a working tape and an algorithm that can move the tape left or right, read a symbol from the tape, write a symbol to the tape, and transition through a finite number of internal states as held in a table. That is all that is needed to be a Turing machine, and Turing machines can compute anything that our everyday notion of a computer can. Formally, there are bounds to what they can compute: for instance, no Turing machine can decide whether any given program consisting of the symbols on the tape will stop at some point or will run forever without stopping (this is the so-called “halting problem”). But it suffices to think of the Turing machine as a general-purpose logical machine in that all of its outputs are determined by a sequence of state changes that follow from the sequence of inputs and transformations expressed in the state table. There is no magic here.
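To make the tape-and-table description concrete, here is a minimal sketch of a Turing machine simulator. The function name, the table layout, and the bit-flipping example machine are all illustrative assumptions, not anything taken from Hutter’s paper. The step budget is there because, per the halting problem, there is no general way to know in advance whether a machine will stop.

```python
from collections import defaultdict

BLANK = "_"

def run_turing_machine(table, tape, state="start", halt="halt", max_steps=10_000):
    """Run a state table over a tape and return the final tape contents.

    table maps (state, symbol) -> (symbol_to_write, "L" or "R", next_state).
    """
    cells = defaultdict(lambda: BLANK, enumerate(tape))  # unbounded tape
    head = 0
    steps = 0
    while state != halt:
        if steps >= max_steps:
            # We cannot decide halting in general, so we simply give up.
            raise RuntimeError("step budget exhausted; the machine may never halt")
        write, move, state = table[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(BLANK)

# Hypothetical example machine: flip every bit, halt at the first blank.
FLIP_BITS = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", BLANK): (BLANK, "R", "halt"),
}

print(run_turing_machine(FLIP_BITS, "1011"))
```

Every output here follows mechanically from the input tape and the three rows of the table, which is the sense in which there is no magic.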

Hutter then couples the agent to a representation of the environment, also expressed by a Turing machine (after all, the environment is likely deterministic). The output symbols of the agent (y) are consumed by the environment, which in turn outputs the results of the agent’s interaction with it as a series of rewards (r) and environment signals (x) that are consumed by the agent once again.
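The coupling can be sketched as a simple loop. This is only a toy under stated assumptions: the environment, the hand-written policy, and every function name are hypothetical, and the agent is in no way Hutter’s optimal agent. The sketch shows just the plumbing of y, r, and x between the two machines.

```python
def environment(y, t):
    """Toy deterministic environment (an assumption for illustration).

    The hidden target alternates 0, 1, 0, 1, ... with the time step t.
    Returns the reward r and the observation x for the agent's action y.
    """
    target = t % 2
    r = 1 if y == target else 0
    x = target  # the observation reveals what the target was
    return r, x

def agent(history):
    """Toy policy: assume the target alternates and guess the opposite
    of the most recent observation. history holds (y, r, x) triples."""
    if not history:
        return 0
    _, _, last_x = history[-1]
    return 1 - last_x

def interact(steps):
    """Run the agent-environment loop and return the total reward."""
    history = []
    total_reward = 0
    for t in range(steps):
        y = agent(history)          # agent output, consumed by the environment
        r, x = environment(y, t)    # reward and signal, consumed by the agent
        history.append((y, r, x))
        total_reward += r
    return total_reward

print(interact(10))
```

The toy agent only does well because its built-in guess happens to match this environment; the interesting problem is an agent that must work that out from the accumulated history.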

Where this gets interesting is that the agent is trying to maximize the reward signal, which implies that the combined predictive model must convert all the history accumulated at one point in time into an optimal predictor. …

Multitudes and the Mathematics of the Individual

August 28, 2012 (updated August 29, 2012) Mark Davis

The notion that there is a path from reciprocal altruism to big brains and advanced cognitive capabilities leads us to ask whether we can create “effective” procedures that shed additional light on the suppositions that are involved, and their consequences. Any skepticism about some virulent kind of scientism then gets whisked away by the imposition of a procedure combined with an earnest interest in careful evaluation of the outcomes. That may not be enough, but it is at least a start.

I turn back to Marcus Hutter, Solomonoff, and Chaitin-Kolmogorov at this point. I’ll be primarily referencing Hutter’s Universal Algorithmic Intelligence (A Top-Down Approach) in what follows. And what follows is an attempt to break down how three separate factors related to intelligence can be explained through mathematical modeling. The first and the second are covered in Hutter’s paper, but the third may represent a new contribution, though perhaps an obvious one without the detail work that is needed to provide good support.

First, then, we start with a core requirement of any goal-seeking mechanism: the ability to predict patterns in the environment external to the mechanism. This has been well covered since the 1960s, when Solomonoff formalized the arguments implicit in Kolmogorov’s algorithmic information theory (AIT), which Greg Chaitin subsequently expanded on. In essence, given a range of possible models represented by bit sequences of computational states, the shortest sequence that predicts the observed data is also the optimal predictor for any future data produced by the same underlying generator function. The shortest sequence is not computable, but we can keep searching for shorter programs and come up with unique optimizations for specific data landscapes. That should sound familiar because it recapitulates Occam’s Razor and, in a subset of cases, Epicurus’ Principle of Multiple Explanations. …
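Because the shortest program is not computable, practical work settles for upper bounds. One common stand-in, used here purely as an assumption for illustration, is the length of the data after running it through a general-purpose compressor. The sketch below uses Python’s zlib; the function name and the two sample sequences are hypothetical.

```python
import random
import zlib

def description_length(data: bytes) -> int:
    """Compressed size in bytes: a crude upper bound on the length of the
    shortest description of the data, not the true (uncomputable) value."""
    return len(zlib.compress(data, 9))

# A sequence produced by a very short rule ("repeat 'ab' 500 times").
regular = b"ab" * 500

# A sequence with no structure the compressor can exploit.
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(1000))

print(len(regular), description_length(regular))
print(len(noisy), description_length(noisy))
```

Both inputs are 1000 bytes long, but the patterned one shrinks to a small fraction of that while the random one does not shrink at all. A better compressor would correspond to finding a shorter program for a specific data landscape.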

Bostrom on the Hardness of Evolving Intelligence

June 19, 2012 Mark Davis

At 38,000 feet somewhere above Missouri, returning from a one-day trip to Washington D.C., it is easy to take Nick Bostrom’s point, made in his paper How Hard is Artificial Intelligence? Evolutionary Arguments and Selection Effects, that bird flight is not the end-all of what is possible for airborne objects and mechanical contrivances like airplanes. His effort to bound and distinguish the evolution of intelligence as either Hard or Not-Hard runs up against significant barriers, however. As a practitioner of the art, I find that the comparison between a purely physical phenomenon like flying and something as complex as human intelligence falls flat.

But Bostrom is not taking flying as more than a starting point for arguing that there is an engineer-able possibility for intelligence. And that possibility might be bounded by a number of current and foreseeable limitations, not least of which is that computer simulations of evolution require a certain amount of computing power and representational detail in order to be a sufficient simulation. His conclusion is that we may need as much as another 100 years of improvements in computing technology just to get to a point where we might succeed at a massive-scale evolutionary simulation (I’ll leave it to the reader to investigate his additional arguments concerning convergent evolution and observer selection effects).

Bostrom dismisses as pessimistic the assumption that a sufficient simulation would, in fact, require a highly detailed emulation of some significant portion of the real environment and the history of organism-environment interactions:

A skeptic might insist that an abstract environment would be inadequate for the evolution of general intelligence, believing instead that the virtual environment would need to closely resemble the actual biological environment in which our ancestors evolved … However, such extreme pessimism seems unlikely to be well founded; it seems unlikely that the best environment for evolving intelligence is one that mimics nature as closely as possible.

…

Semantic Zooming Demo

June 16, 2012 Mark Davis

Semantic Zooming over Hadoop Distributed File Systems (with lenses) from Hadoop Summit 2012:

…

Semantic Zooming

June 11, 2012 Mark Davis

I’ve been pushing hard for a demo at Hadoop Summit this week, waking unexpectedly at 5 AM this morning with spherical trigonometry percolating through my head. The topic is “semantic zooming” and it is not a complicated concept to understand because we have a common example that many of us use daily: Google Maps. All the modern, online mapping systems do semantic zooming to a degree when they change the types of information that are displayed on the map depending on the zoom level. Thus, the “semantics” or “meaning” of the displayed information changes with zooming, revealing states, then rivers, then major roads, then minor roads, and then all the way down to local businesses. The goal of semantic zooming is to manage information overload by managing semantics.
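The map example reduces to a lookup from zoom level to the kinds of information worth showing. Here is a minimal sketch, assuming made-up zoom thresholds; the layer names follow the order given above, but the numbers and the function name are illustrative and not taken from any real mapping system.

```python
# (layer name, minimum zoom level at which it appears).
# The thresholds are hypothetical values chosen for illustration.
LAYERS = [
    ("states", 0),
    ("rivers", 4),
    ("major roads", 7),
    ("minor roads", 11),
    ("local businesses", 15),
]

def visible_layers(zoom):
    """Return the layers whose meaning is appropriate at this zoom level."""
    return [name for name, min_zoom in LAYERS if zoom >= min_zoom]

for zoom in (0, 8, 16):
    print(zoom, visible_layers(zoom))
```

The point of the design is that the viewer never sees everything at once: each zoom level admits only the layers whose granularity matches it.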

In my case, I’m using a semantic zooming interface to apply different types of information visualizations to data resources in a distributed file system (a file system that spans many disk drives in many computers) related to the “big data” technology, Hadoop. A distributed file system can have many data types (numerical data, text, PDFs, log files from web servers, scientific data) and the only way to interact with the data is through a command line or through fairly simple web-based user interfaces that act like crude file system browsers. Making use of the data in the system, analyzing it, requires running analysis processes on it, then pulling the data out and importing it into other technologies like Excel or business intelligence systems to bind charting and visualization tools to it. With semantic zooming operating directly on the data, however, the structure of the data can be probed directly and the required background processes launch automatically to create new aggregate views of the data. …

Randomness and Meaning

June 4, 2012 Mark Davis

The impossibility of the Chinese Room has implications across the board for understanding what meaning means. Mark Walker’s paper “On the Intertranslatability of all Natural Languages” describes how the translation of words and phrases may be achieved:

1. Through a simple correspondence scheme (word for word)
2. Through “syntactic” expansion of the languages to accommodate concepts that have no obvious equivalence (“optometrist” => “doctor for eye problems”, etc.)
3. Through incorporation of foreign words and phrases as “loan words”
4. Through “semantic” expansion where the foreign word is defined through its coherence within a larger knowledge network.

An example for (4) is the word “lepton” where many languages do not have a corresponding concept and, in fact, the concept is dependent on a bulwark of advanced concepts from particle physics. There may be no way to create a superposition of the meanings of other words using (2) to adequately handle “lepton.”

These problems arise again in trying to understand how children acquire meaning in learning a language. As Walker points out, language learning for a second language must involve the same kinds of steps as learning translations, so any simple correspondence theory has to be supplemented.

So how do we make adequate judgments about meanings and so rapidly learn words, often initially with a coarse granularity but later with increasingly sharp levels of focus? What procedure is required for expanding correspondence theories to operate in larger networks? Methods like Latent Semantic Analysis and Random Indexing show how this can be achieved in ways that are illuminating about human cognition. In each case, the methods provide insights into how relatively simple transformations of terms and their occurrence contexts can be viewed as providing a form of “triangulation” about the meaning of words. …
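As a rough sketch of the idea behind Random Indexing: give every word a sparse random “index vector,” then build each word’s “context vector” by summing the index vectors of the words it occurs with. The dimensions, the three-sentence corpus, the whole-sentence context window, and all names below are assumptions chosen to keep the example small, not parameters from the literature.

```python
import math
import random

DIM = 512      # length of every vector (illustrative choice)
NONZERO = 8    # number of +1/-1 entries in each index vector

def index_vector(rng):
    """Sparse random vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1, 1))
    return vec

def train(sentences, seed=0):
    """Return {word: context vector}, where a word's context vector is the
    sum of the index vectors of the other words in its sentences."""
    rng = random.Random(seed)
    index, context = {}, {}
    for sentence in sentences:
        words = sentence.split()
        for word in words:
            if word not in index:
                index[word] = index_vector(rng)
                context[word] = [0] * DIM
        for i, word in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    ctx = context[word]
                    for k, value in enumerate(index[other]):
                        ctx[k] += value
    return context

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

corpus = ["the cat eats food", "the dog eats food", "a lepton has spin"]
vectors = train(corpus)
print(cosine(vectors["cat"], vectors["dog"]))
print(cosine(vectors["cat"], vectors["lepton"]))
```

In this toy corpus “cat” and “dog” occur in identical contexts, so their vectors coincide, while “lepton” shares no context with either and lands nearly orthogonal to them. Nothing in the procedure knows what any word means; the similarity falls out of the occurrence contexts alone.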

On the Soul-Eyes of Polar Bears

May 29, 2012 Mark Davis

I sometimes reference a computational linguistics factoid that appears to be now lost in the mists of early DoD Tipster program research: Chinese linguists only agree on the segmentation of texts into words about 80% of the time. We can find some qualitative agreement on the problematic nature of the task, but the 80% is widely smeared out among the references that I can now find. It should be no real surprise, though, because even English with white-space tokenization resists easy characterization of words versus phrases: “New York” and “New York City” are almost words in themselves, though under white-space tokenization alone they are also phrases. Phrases lift out with common and distinct usage, however, and become more than the sum of their parts; it would be ridiculously noisy to match a search for “York” against “New York” because no one in the modern world attaches semantic significance to the “York” part of the phrase. It exists as a whole and the nature of the parts has dissolved against this wholism.
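One simple way to let phrases “lift out” is greedy longest-match against a phrase lexicon, which is also a classic baseline for segmenting languages written without spaces. The sketch below is only that baseline, with a two-entry hypothetical lexicon; it is not a claim about how any particular search engine or the Tipster systems worked.

```python
# Hypothetical phrase lexicon; a real one would be learned from usage.
PHRASES = {("New", "York"), ("New", "York", "City")}
MAX_LEN = max(len(phrase) for phrase in PHRASES)

def tokenize(text):
    """White-space tokenize, then merge the longest known phrase that
    starts at each position into a single token."""
    words = text.split()
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, down to two-word phrases.
        for n in range(min(MAX_LEN, len(words) - i), 1, -1):
            if tuple(words[i:i + n]) in PHRASES:
                tokens.append(" ".join(words[i:i + n]))
                i += n
                break
        else:
            tokens.append(words[i])  # no phrase starts here
            i += 1
    return tokens

print(tokenize("I left New York City for York"))
```

With the phrase treated as one token, a search for “York” matches the standalone city but no longer matches “New York City.”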

John Searle’s Chinese Room argument came up again today. My son was waxing, as he does, in a discussion about mathematics and order, and suggested a poverty of our considerations of the world as being purely and completely natural. He meant in the sense of “materialism” and “naturalism” meaning that there are no mystical or magical elements to the world in a metaphysical sense. I argued that there may nonetheless be something that is different and indescribable by simple naturalistic calculi: there may be qualia. It led, in turn, to a qualification of what is unique about the human experience and hence on to Searle’s Chinese Room.

And what happens in the Chinese Room? …

The Unreasonable Success of Reason

May 21, 2012 Mark Davis

Math and natural philosophy were discovered several times in human history: Classical Greece, Medieval Islam, Renaissance Europe. Arguably, the latter two were strongly influenced by the former, but even so they built additional explanatory frameworks. Moreover, the explosion that arose from Europe became the Enlightenment and the modern edifice of science and technology.

So, on the eve of an eclipse that sufficiently darkened the skies of Northern California, it is worth noting the unreasonable success of reason. The gods are not angry. The spirits are not threatening us over a failure to properly propitiate their symbolic requirements. Instead, the mathematics worked predictively and perfectly to explain a wholly natural phenomenon.

But why should the mathematics work so exceptionally well? It could be otherwise, as Eugene Wigner’s marvelous 1960 paper, The Unreasonable Effectiveness of Mathematics in the Natural Sciences, points out:

All the laws of nature are conditional statements which permit a prediction of some future events on the basis of the knowledge of the present, except that some aspects of the present state of the world, in practice the overwhelming majority of the determinants of the present state of the world, are irrelevant from the point of view of the prediction.

…

A possible explanation of the physicist’s use of mathematics to formulate his laws of nature is that he is a somewhat irresponsible person. As a result, when he finds a connection between two quantities which resembles a connection well-known from mathematics, he will jump at the conclusion that the connection is that discussed in mathematics simply because he does not know of any other similar connection.

Galileo’s rocks fall at the same rates but only provided that they are not unduly flat and light. …