Inching Towards Shannon’s Oblivion

SkynetFollowing Bill Joy’s concerns over the future world of nanotechnology, biological engineering, and robotics in 2000’s Why the Future Doesn’t Need Us, it has become fashionable to worry over “existential threats” to humanity. Nuclear power and weapons used to be dreadful enough, and clearly remain in the top five, but these rapidly developing technologies, asteroids, and global climate change have joined Oppenheimer’s misquoted “destroyer of all things” in portending our doom. Here’s Max Tegmark, Stephen Hawking, and others in Huffington Post warning again about artificial intelligence:

One can imagine such technology outsmarting financial markets, out-inventing human researchers, out-manipulating human leaders, and developing weapons we cannot even understand. Whereas the short-term impact of AI depends on who controls it, the long-term impact depends on whether it can be controlled at all.

I almost always begin my public talks on Big Data and intelligent systems with a presentation on industrial revolutions that progresses through Robert Gordon’s phases and then highlights Paul Krugman’s argument that Big Data and the intelligent systems improvements we are seeing potentially represent a next industrial revolution. I am usually less enthusiastic about the timeline than nonspecialists, but after giving a talk at PASS Business Analytics Friday in San Jose, I stuck around to listen in on a highly technical talk concerning statistical regularization and deep learning and I found myself enthused about the topic once again. Deep learning is using artificial neural networks to classify information, but is distinct from traditional ANNs in that the systems are pre-trained using auto-encoders to have a general knowledge about the data domain. To be clear, though, most of the problems that have been tackled are “subsymbolic” for image recognition and speech problems.… Read the rest

Parsimonious Portmanteaus

portmanteauMeaning is a problem. We think we might know what something means but we keep being surprised by the facts, research, and logical difficulties that surround the notion of meaning. Putnam’s Representation and Reality runs through a few different ways of thinking about meaning, though without reaching any definitive conclusions beyond what meaning can’t be.

Children are a useful touchstone concerning meaning because we know that they acquire linguistic skills and consequently at least an operational understanding of meaning. And how they do so is rather interesting: first, presume that whole objects are the first topics for naming; next, assume that syntactic differences lead to semantic differences (“the dog” refers to the class of dogs while “Fido” refers to the instance); finally, prefer that linguistic differences point to semantic differences. Paul Bloom slices and dices the research in his Précis of How Children Learn the Meanings of Words, calling into question many core assumptions about the learning of words and meaning.

These preferences become useful if we want to try to formulate an algorithm that assigns meaning to objects or groups of objects. Probabilistic Latent Semantic Analysis, for example, assumes that words are signals from underlying probabilistic topic models and then derives those models by estimating all of the probabilities from the available signals. The outcome lacks labels, however: the “meaning” is expressed purely in terms of co-occurrences of terms. Reconciling an approach like PLSA with the observations about children’s meaning acquisition presents some difficulties. The process seems too slow, for example, which was always a complaint about connectionist architectures of artificial neural networks as well. As Bloom points out, kids don’t make many errors concerning meaning and when they do, they rapidly compensate.… Read the rest

In Like Flynn

The exceptionally interesting James Flynn explains the cognitive history of the past century and what it means in terms of human intelligence in this TED talk:

What does the future hold? While we might decry the “twitch” generation and their inundation by social media, gaming stimulation, and instant interpersonal engagement, the slowing observed in the Flynn Effect might be getting ready for another ramp-up over the next 100 years.

Perhaps most intriguing is the discussion of the ability to think in terms of hypotheticals as a a core component of ethical reasoning. Ethics is about gaming outcomes and also about empathizing with others. The influence of media as a delivery mechanism for narratives about others emerged just as those changes in cognitive capabilities were beginning to mature in the 20th Century. Widespread media had a compounding effect on the core abstract thinking capacity, and with the expansion of smartphones and informational flow, we may only have a few generations to go before the necessary ingredients for good ethical reasoning are widespread even in hard-to-reach areas of the world.… Read the rest

Contingency and Irreducibility

JaredTarbell2Thomas Nagel returns to defend his doubt concerning the completeness—if not the efficacy—of materialism in the explanation of mental phenomena in the New York Times. He quickly lays out the possibilities:

  1. Consciousness is an easy product of neurophysiological processes
  2. Consciousness is an illusion
  3. Consciousness is a fluke side-effect of other processes
  4. Consciousness is a divine property supervened on the physical world

Nagel arrives at a conclusion that all four are incorrect and that a naturalistic explanation is possible that isn’t “merely” (1), but that is at least (1), yet something more. I previously commented on the argument, here, but the refinement of the specifications requires a more targeted response.

Let’s call Nagel’s new perspective Theory 1+ for simplicity. What form might 1+ take on? For Nagel, the notion seems to be a combination of Chalmers-style qualia combined with a deep appreciation for the contingencies that factor into the personal evolution of individual consciousness. The latter is certainly redundant in that individuality must be absolutely tied to personal experiences and narratives.

We might be able to get some traction on this concept by looking to biological evolution, though “ontogeny recapitulates phylogeny” is about as close as we can get to the topic because any kind of evolutionary psychology must be looking for patterns that reinforce the interpretation of basic aspects of cognitive evolution (sex, reproduction, etc.) rather than explore the more numinous aspects of conscious development. So we might instead look for parallel theories that focus on the uniqueness of outcomes, that reify the temporal evolution without reference to controlling biology, and we get to ideas like uncomputability as a backstop. More specifically, we can explore ideas like computational irreducibility to support the development of Nagel’s new theory; insofar as the environment lapses towards weak predictability, a consciousness that self-observes, regulates, and builds many complex models and metamodels is superior to those that do not.… Read the rest

Singularity and its Discontents

Kimmel botIf a machine-based process can outperform a human being is it significant? That weighty question hung in the background as I reviewed Jürgen Schmidhuber’s work on traffic sign classification. Similar results have emerged from IBM’s Watson competition and even on the TOEFL test. In each case, machines beat people.

But is that fact significant? There are a couple of ways we can look at these kinds of comparisons. First, we can draw analogies to other capabilities that were not accessible by mechanical aid and show that the fact that they outperformed humans was not overly profound. The wheel quickly outperformed human legs for moving heavy objects. The cup outperformed the hands for drinking water. This then invites the realization that the extension of these physical comparisons leads to extraordinary juxtapositions: the airline really outperformed human legs for transport, etc. And this, in turn, justifies the claim that since we are now just outperforming human mental processes, we can only expect exponential improvements moving forward.

But this may be a category mistake in more than the obvious differentiator of the mental and the physical. Instead, the category mismatch is between levels of complexity. The number of parts in a Boeing 747 is 6 million versus one moving human as the baseline (we could enumerate the cells and organelles, etc., but then we would need to enumerate the crystal lattices of the aircraft steel, so that level of granularity is a wash). The number of memory addresses in a big server computer is 64 x 10^9 or higher, with disk storage in the TBs (10^12). Meanwhile, the human brain has 100 x 10^9 neurons and 10^14 connections. So, with just 2 orders of magnitude between computers and brains versus 6 between humans and planes, we find ourselves approaching Kurzweil’s argument that we have to wait until 2040.… Read the rest

Curiouser and Curiouser

georgeJürgen Schmidhuber’s work on algorithmic information theory and curiosity is worth a few takes, if not more, for the researcher has done something that is both flawed and rather brilliant at the same time. The flaws emerge when we start to look deeply into the motivations for ideas like beauty (is symmetry and noncomplex encoding enough to explain sexual attraction? Well-understood evolutionary psychology is probably a better bet), but the core of his argument is worth considering.

If induction is an essential component of learning (and we might suppose it is for argument’s sake), then why continue to examine different parameterizations of possible models for induction? Why be creative about how to explain things, like we expect and even idolize of scientists?

So let us assume that induction is explained by the compression of patterns into better and better models using an information theoretic-style approach. Given this, Schmidhuber makes the startling leap that better compression and better models are best achieved by information harvesting behavior that involves finding novelty in the environment. Thus curiosity. Thus the implementation of action in support of ideas.

I proposed a similar model to explain aesthetic preferences for mid-ordered complex systems of notes, brush-strokes, etc. around 1994, but Schmidhuber’s approach has the benefit of not just characterizing the limitations and properties of aesthetic systems, but also justifying them. We find interest because we are programmed to find novelty, and we are programmed to find novelty because we want to optimize our predictive apparatus. The best optimization is actively seeking along the contours of the perceivable (and quantifiable) universe, and isolating the unknown patterns to improve our current model.… Read the rest

Industrial Revolution #4

Paul Krugman at New York Times consumes Robert Gordon’s analysis of economic growth and the role of technology and comes up more hopeful than Gordon. The kernel in Krugman’s hope is that Big Data analytics can provide a shortcut to intelligent machines by bypassing the requirement for specification and programming that was once assumed to be a requirement for artificial intelligence. Instead, we don’t specify but use “data-intensive ways” to achieve a better result. And we might get to IR#4, following Gordon’s taxonomy where IR stands for “industrial revolution.” IR#1 was steam and locomotives  IR#2 was everything up to computers. IR#3 is computers and cell phones and whatnot.

Krugman implies that IR#4 might spur the typical economic consequences of grand technological change, including the massive displacement of workers, but like in previous revolutions it is also assumed that economic growth built from new industries will ultimately eclipse the negatives. This is not new, of course. Robert Anton Wilson argued decades ago for the R.I.C.H. economy (Rising Income through Cybernetic Homeostasis). Wilson may have been on acid, but Krugman wasn’t yet tuned in, man. (A brief aside: the Krugman/Wilson notions probably break down over extraction and agribusiness/land rights issues. If labor is completely replaced by intelligent machines, the land and the ingredients it contains nevertheless remain a bottleneck for economic growth. Look at the global demand for copper and rare earth materials, for instance.)

But why the particular focus on Big Data technologies? Krugman’s hope teeters on the assumption that data-intensive algorithms possess a fundamentally different scale and capacity than human-engineered approaches. Having risen through the computational linguistics and AI community working on data-driven methods for approaching intelligence, I can certainly sympathize with the motivation, but there are really only modest results to report at this time.… Read the rest

Sparse Grokking

Jeff Hawkins of Palm fame shows up in the New York Times hawking his Grok for Big Data predictions. Interestingly, if one drills down into the details of Grok, we see once again that randomized sparse representations are the core of the system. That is, if we assign symbols random representational vectors that are sparse, we can sum the vectors for co-occurring symbols and, following J.R. Firth’s pithy “words shall be known by the company that they keep” start to develop a theory of meaning that would not offend Wittgenstein.

Is there anything new to Hawkins’ effort? For certain types of time-series prediction, the approach parallels artificial neural network designs, replacing the complexity of shifting, multi-epoch training regimens that, in effect, build the high-dimensional distances between co-occurring events by gradually moving time-correlated data together and uncorrelated data apart with an end-run around all the computational complexity. But then there is Random Indexing, which I’ve previously discussed, here. If one restricts Random Indexing to operating on temporal patterns, or on spatial patterns, then the results start to look like Numenta’s offering.

While there is a bit of opportunism in Hawkins latching onto Big Data to promote an application of methods he has been working on for years, there are very real opportunities for trying to mine leading indicators to help with everything from ecommerce to research and development. Many flowers will bloom, grok, die, and be reborn.… Read the rest

Bats and Belfries

Thomas Nagel proposes a radical form of skepticism in his new book, Minds and Cosmos, continuing his trajectory through subjective experience and moral realism first began with bats zigging and zagging among the homunculi of dualism reimagined in the form of qualia. The skepticism involves disputing materialistic explanations and proposing, instead, that teleological ones of an unspecified form will likely apply, for how else could his subtitle that paints the “Neo-Darwinian Concept of Nature” as likely false hold true?

Nagel is searching for a non-religious explanation, of course, because just enervating nature through fiat is hardly an explanation at all; any sort of powerful, non-human entelechy could be gaming us and the universe in a non-coherent fashion. But what parameters might support his argument? Since he apparently requires a “significant likelihood” argument to hold sway in support of the origins of life, for instance, we might imagine what kind of thinking could result in highly likely outcomes that begin with inanimate matter and lead to goal-directed behavior while supporting a significant likelihood of that outcome. The parameters might involve the conscious coordination of the events leading towards the emergence of goal-directed life, thus presupposing a consciousness that is not our own. We are back then to our non-human entelechy looming like an alien or like a strange creator deity (which is not desirable to Nagel). We might also consider the possibility that there are properties to the universe itself that result in self-organization and that either we don’t yet know or that we are only beginning to understand. Elliot Sober’s critique suggests that the 2nd Law of Thermodynamics results in what I might call “patterned” behavior while not becoming “goal-directed” per se.… Read the rest