Signals and Apophenia

The central theme in Signals and Noise is that of the inverse problem and its consequences: given an ocean of data, how does one uncover the true signals hidden in the noise? Is there even such a thing? There’s an obsessive balance between apophenia and modeling somewhere built into our skulls.

The cover art for Signals and Noise reflects those tendencies. There is a QR Code that encodes a passage from the book, and then there is a distortion of the content of the QR Code. The distortion, in turn, creates a compelling image. Is it a fly creeping to the left or a lion’s head tilted to the right?

Yes.

A free hard-cover copy of Signals and Noise to anyone who decodes the QR Code. Post a copy of the text to claim your reward.

Hits and MITS

I just came across the following scan that describes how an MITS Altair 8800B became my first personal computer, journeying from Albuquerque to Las Cruces, New Mexico, and getting an EEPROM burner attached to a ribbon cable snaking out through the enameled steel case. The speech synthesizer predated stored digital samples, by the way, so it instead emulated phonemic mixtures generated by digital waveform filtering.

And that young chap on the left would later become my boss, three or four parts removed, at Microsoft. I still have that Altair, too, safely stored away.

MITS Altair Convention

Original PDF Scan:

MITS Computer Convention

Singularity and its Discontents

If a machine-based process can outperform a human being, is it significant? That weighty question hung in the background as I reviewed Jürgen Schmidhuber’s work on traffic sign classification. Similar results have emerged from IBM’s Watson competition and even on the TOEFL test. In each case, machines beat people.

But is that fact significant? There are a couple of ways we can look at these kinds of comparisons. First, we can draw analogies to other capabilities that were not accessible by mechanical aid and show that the fact that they outperformed humans was not overly profound. The wheel quickly outperformed human legs for moving heavy objects. The cup outperformed the hands for drinking water. This then invites the realization that the extension of these physical comparisons leads to extraordinary juxtapositions: the airplane really outperformed human legs for transport, etc. And this, in turn, justifies the claim that since we are now just outperforming human mental processes, we can only expect exponential improvements moving forward.

But this may be a category mistake in more than the obvious differentiator of the mental and the physical. Instead, the category mismatch is between levels of complexity. The number of parts in a Boeing 747 is 6 million versus one moving human as the baseline (we could enumerate the cells and organelles, etc., but then we would need to enumerate the crystal lattices of the aircraft steel, so that level of granularity is a wash). The number of memory addresses in a big server computer is 64 × 10^9 or higher, with disk storage in the TBs (10^12). Meanwhile, the human brain has 100 × 10^9 neurons and 10^14 connections. So, with just 2 orders of magnitude between computers and brains versus 6 between humans and planes, we find ourselves approaching Kurzweil’s argument that we have to wait until 2040.
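
Those gaps are easy to tally. Here is a back-of-the-envelope sketch using nothing but the rough counts quoted above; the figures are order-of-magnitude placeholders, not measurements:

    import math

    # Rough counts quoted above (order-of-magnitude placeholders, not measurements)
    parts_747 = 6e6            # parts in a Boeing 747
    parts_human = 1            # one moving human as the baseline
    storage_server = 1e12      # disk storage in the terabyte range
    connections_brain = 1e14   # synaptic connections in a human brain

    def gap_in_orders(a, b):
        """Gap between two quantities in orders of magnitude (base 10)."""
        return abs(math.log10(a) - math.log10(b))

    print(gap_in_orders(parts_747, parts_human))             # ~6.8: the "6" between humans and planes
    print(gap_in_orders(connections_brain, storage_server))  # 2.0: the gap between brains and computers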

Methodical Play

My fourteen-year-old interviewed a physicist yesterday. I had the privilege of being home over the weekend and listened in; my travel schedule has lately been brutal, with the only saving grace being moments like right now en route to Chicago when I can collapse into reading and writing for a few whitenoise-washed moments. And the physicist who was once his grandfather said some remarkable things:

  • Physics consists of empirical layers of untruth
  • The scientific method is never used as formulated
  • Schools, while valuable, won’t teach how to be a scientist
  • The institutions of physics don’t support the creativity required to be a scientist

Yet there was no sense of anger or disillusionment in these statements, just a framing of the distinctions between the modern social model surrounding what scientists do and the complex reality of how they really do their work.

The positives were that play is both the essential ingredient and the missing determinant of the real “scientific method.” Mess around, try to explain, mess around some more. And what is all that play getting this remarkable octogenarian? Possible insights into the unification of electromagnetism and the strong nuclear force. The interview journeyed from the alignment of quarks to the beams of neutron stars, igniting the imaginations of all the minds on the call.

But if there is no real large-scale method to this madness, what might we conclude about the rationality of the process of science? I would advocate that the algorithmic model of inference is perhaps the best (and least biased) way of approaching the issue of scientific method. By constantly reshuffling the available parameters and testing the compressibility of models, play is indistinguishable from science when the play pivots on best explanation.
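
As a minimal sketch of what “testing the compressibility of models” could look like, consider scoring candidate explanations with a crude two-part description length. The hidden quadratic, the 32-bit parameter cost, and the polynomial family below are all illustrative assumptions, not a claim about how real science scores its models:

    import numpy as np

    # Toy data: a hidden quadratic plus noise, the thing "play" is trying to explain.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 50)
    y = 3.0 * x**2 - x + rng.normal(scale=0.05, size=x.size)

    def description_length(degree):
        """Crude two-part code: bits to state the model plus bits to state the misfit."""
        coeffs = np.polyfit(x, y, degree)
        residuals = y - np.polyval(coeffs, x)
        model_bits = 32.0 * (degree + 1)                                  # cost of the parameters
        error_bits = 0.5 * x.size * np.log2(np.var(residuals) + 1e-12)   # cost of the residual error
        return model_bits + error_bits

    # "Play": reshuffle the one available parameter (degree) and keep the most
    # compressible explanation.
    scores = {d: description_length(d) for d in range(1, 7)}
    best = min(scores, key=scores.get)
    print(best, round(scores[best], 1))   # the quadratic should win on this toy data

The particular scoring rule matters less than the loop: perturb, re-explain, and keep whichever explanation compresses the observations best.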

Curiouser and Curiouser

Jürgen Schmidhuber’s work on algorithmic information theory and curiosity is worth a few takes, if not more, for the researcher has done something that is both flawed and rather brilliant at the same time. The flaws emerge when we start to look deeply into the motivations for ideas like beauty (are symmetry and noncomplex encoding enough to explain sexual attraction? Well-understood evolutionary psychology is probably a better bet), but the core of his argument is worth considering.

If induction is an essential component of learning (and we might suppose it is for argument’s sake), then why continue to examine different parameterizations of possible models for induction? Why be creative about how to explain things, as we expect of scientists and even idolize in them?

So let us assume that induction is explained by the compression of patterns into better and better models using an information-theoretic approach. Given this, Schmidhuber makes the startling leap that better compression and better models are best achieved by information-harvesting behavior that involves finding novelty in the environment. Thus curiosity. Thus the implementation of action in support of ideas.
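
As a minimal sketch of that bookkeeping, an off-the-shelf compressor can stand in for the agent’s model. The reward below is just the bytes saved by folding a new observation into the description of the history, a crude proxy, since zlib does not actually learn the way Schmidhuber’s compressors are meant to; the toy observations are invented for illustration:

    import zlib

    def compressed_size(data: bytes) -> int:
        """Stand-in for the agent's model: length of the zlib-compressed description."""
        return len(zlib.compress(data, 9))

    def curiosity_reward(history: bytes, observation: bytes) -> int:
        """Bytes saved by describing the observation together with the history
        rather than on its own: a crude proxy for compression progress."""
        separate = compressed_size(history) + compressed_size(observation)
        joint = compressed_size(history + observation)
        return separate - joint

    history = b"ABABABABABABABAB" * 4
    print(curiosity_reward(history, b"ABABABAB" * 4))      # already-modeled pattern
    print(curiosity_reward(history, b"QZKXWPLMQJXRNVTY"))  # incompressible noise
    print(curiosity_reward(history, b"ABABABCDCDCDABAB"))  # partial novelty worth absorbing

In Schmidhuber’s formulation the reward tracks the improvement of the learner itself over time; the sketch only captures the framing that curiosity is a quantity measured in saved bits.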

I proposed a similar model to explain aesthetic preferences for mid-ordered complex systems of notes, brush-strokes, etc. around 1994, but Schmidhuber’s approach has the benefit of not just characterizing the limitations and properties of aesthetic systems, but also justifying them. We find interest because we are programmed to find novelty, and we are programmed to find novelty because we want to optimize our predictive apparatus. The best optimization is actively seeking along the contours of the perceivable (and quantifiable) universe, and isolating the unknown patterns to improve our current model.

A Paradigm of Guessing

The most interesting thing I’ve read this week comes from Jürgen Schmidhuber’s paper, Algorithmic Theories of Everything, which should be provocative enough to pique the most jaded of interests. And the quote is from way into the paper:

The first number is 2, the second is 4, the third is 6, the fourth is 8. What is the fifth? The correct answer is “250,” because the nth number is n^5 − 5n^4 − 15n^3 + 125n^2 − 224n + 120. In certain IQ tests, however, the answer “250” will not yield maximal score, because it does not seem to be the “simplest” answer consistent with the data (compare [73]). And physicists and others favor “simple” explanations of observations.
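
The claim is easy to verify by evaluating the quintic directly (a quick sketch):

    def f(n):
        # The quintic from the quoted passage
        return n**5 - 5*n**4 - 15*n**3 + 125*n**2 - 224*n + 120

    print([f(n) for n in range(1, 6)])   # [2, 4, 6, 8, 250]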

And this is the beginning and the end of logical positivism. How can we assign truth to inductive judgments without crossing from fact to value, and what should that value system be?

Industrial Revolution #4

Paul Krugman at the New York Times consumes Robert Gordon’s analysis of economic growth and the role of technology and comes up more hopeful than Gordon. The kernel of Krugman’s hope is that Big Data analytics can provide a shortcut to intelligent machines by bypassing the specification and programming that was once assumed to be a requirement for artificial intelligence. Instead, we don’t specify but use “data-intensive ways” to achieve a better result. And we might get to IR#4, following Gordon’s taxonomy, where IR stands for “industrial revolution”: IR#1 was steam and locomotives; IR#2 was everything up to computers; IR#3 is computers and cell phones and whatnot.

Krugman implies that IR#4 might spur the typical economic consequences of grand technological change, including the massive displacement of workers, but like in previous revolutions it is also assumed that economic growth built from new industries will ultimately eclipse the negatives. This is not new, of course. Robert Anton Wilson argued decades ago for the R.I.C.H. economy (Rising Income through Cybernetic Homeostasis). Wilson may have been on acid, but Krugman wasn’t yet tuned in, man. (A brief aside: the Krugman/Wilson notions probably break down over extraction and agribusiness/land rights issues. If labor is completely replaced by intelligent machines, the land and the ingredients it contains nevertheless remain a bottleneck for economic growth. Look at the global demand for copper and rare earth materials, for instance.)

But why the particular focus on Big Data technologies? Krugman’s hope teeters on the assumption that data-intensive algorithms possess a fundamentally different scale and capacity than human-engineered approaches. Having risen through the computational linguistics and AI community working on data-driven methods for approaching intelligence, I can certainly sympathize with the motivation, but there are really only modest results to report at this time.

Keep Suspicious and Carry On

I’ve previously argued that it is unlikely that resource-constrained simulations can achieve adequate levels of fidelity to be sufficient for what we observe around us. This argument was a combination of computational irreducibility and assumptions about the complexity of evolutionary trajectories of living beings. There may also be an argument from the observed contingency of the evolutionary process that cuts against any kind of “intelligent” organizing principle, though not against simulation itself.

Leave it to physicists to envision a test of the Bostrom hypothesis that we are living in a computer simulation. Martin Savage and his colleagues look at quantum chromodynamics (QCD) and current simulation methods for QCD. They conclude that if we are, in fact, living in a simulation, then we might observe specific inconsistencies that arise from finite computing power for the universe as a whole. Those inconsistencies would show up specifically in the distribution of cosmic ray energies. Note that if the distribution is not unusual, the universe could either be a simulation (just a sophisticated one) or could be a truly physical one (free running and not on another entity’s computational framework). It is only an unusual distribution that would count as evidence of a simulation.

Sparse Grokking

Jeff Hawkins of Palm fame shows up in the New York Times hawking his Grok for Big Data predictions. Interestingly, if one drills down into the details of Grok, we see once again that randomized sparse representations are the core of the system. That is, if we assign symbols random representational vectors that are sparse, we can sum the vectors for co-occurring symbols and, following J.R. Firth’s pithy “you shall know a word by the company it keeps,” start to develop a theory of meaning that would not offend Wittgenstein.
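
A minimal sketch of that core move follows; the dimensionality, sparsity, window size, and toy corpus are arbitrary choices for illustration, generic random indexing rather than Grok’s actual encoding:

    import numpy as np

    DIM, NONZERO = 2000, 20    # high-dimensional, very sparse index vectors
    rng = np.random.default_rng(42)

    def index_vector():
        """A random sparse ternary vector: an arbitrary fixed signature for a symbol."""
        v = np.zeros(DIM)
        slots = rng.choice(DIM, NONZERO, replace=False)
        v[slots] = rng.choice([-1.0, 1.0], NONZERO)
        return v

    corpus = "the cat sat on the mat the dog sat on the rug".split()
    index = {w: index_vector() for w in set(corpus)}
    context = {w: np.zeros(DIM) for w in set(corpus)}

    # Sum the index vectors of each word's neighbours (a +/- 2 word window): every
    # context vector accumulates the random signatures of the company it keeps.
    for i, w in enumerate(corpus):
        for j in range(max(0, i - 2), min(len(corpus), i + 3)):
            if j != i:
                context[w] += index[corpus[j]]

    def similarity(a, b):
        ca, cb = context[a], context[b]
        return float(np.dot(ca, cb) / (np.linalg.norm(ca) * np.linalg.norm(cb) + 1e-12))

    print(similarity("cat", "dog"))   # similar company, so relatively high
    print(similarity("cat", "on"))    # different distributional role, so lower

Because the random index vectors are nearly orthogonal by construction, the dot products between summed context vectors approximate distributional similarity without any training epochs.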

Is there anything new in Hawkins’ effort? For certain types of time-series prediction, the approach parallels artificial neural network designs, but it makes an end-run around their computational complexity: it replaces the shifting, multi-epoch training regimens that, in effect, build the high-dimensional distances between co-occurring events by gradually moving time-correlated data together and uncorrelated data apart. But then there is Random Indexing, which I’ve previously discussed here. If one restricts Random Indexing to operating on temporal patterns, or on spatial patterns, then the results start to look like Numenta’s offering.

While there is a bit of opportunism in Hawkins’ latching onto Big Data to promote an application of methods he has been working on for years, there are very real opportunities for trying to mine leading indicators to help with everything from e-commerce to research and development. Many flowers will bloom, grok, die, and be reborn.