Bravery and Restraint

In 1997, shortly after getting married and buying our first house, I was invited to travel to Japan and spend a little over a month researching Japanese-Chinese machine translation under a grant from the Japanese Ministry of Education. It was a disorienting experience, as most non-Japanese find Japan to be, and the hours spent studying my translation guide helped me very little. In the mornings I would jog through downtown, around the canals, and past the temples. Days were spent writing and optimizing statistical matching algorithms for lining up runs of characters that I didn’t understand, an early incarnation of the same approach currently used in Google Translate.

I, of course, visited the Peace Memorial Park several times and toured the museum there, ultimately purchasing a slim volume of recollections from the day the bomb fell that was written in Japanese and English on facing pages. One thing struck me, and I later asked a Japan expert who worked in the Intelligence Community about it: the narrative presented in the museum was that the Japanese commoner had little understanding of the war effort; they were victims of the emperor and the elite classes. It was a moral distancing that resonated with similar arguments about the German Volk being non-complicit in the Holocaust, and an argument that I found distasteful.

With this background, then, I was intrigued when I discovered that the father of my new boss wrote a memoir on being perhaps the first Westerner to enter Hiroshima following the dropping of the atomic bomb. Kenneth Harrison’s book, The Brave Japanese, was originally published in 1966, then republished in 1982 under the title The Road to Hiroshima due, in part, to the controversy in Australia over ascribing bravery to the Japanese.…

Sparse Grokking

Jeff Hawkins of Palm fame shows up in the New York Times hawking his Grok for Big Data predictions. Interestingly, if one drills down into the details of Grok, we see once again that randomized sparse representations are the core of the system. That is, if we assign symbols random representational vectors that are sparse, we can sum the vectors for co-occurring symbols and, following J.R. Firth’s pithy “You shall know a word by the company it keeps,” start to develop a theory of meaning that would not offend Wittgenstein.
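
The idea of summing sparse random vectors for co-occurring symbols can be sketched in a few lines of Python. This is only an illustration of the general technique under my own assumptions, not Grok's implementation: the dimensionality, the number of nonzero entries, the window size, the toy sentence, and every function name below are hypothetical choices made for the example.

```python
import math
import random
from collections import defaultdict

DIM = 512      # dimensionality of every vector (assumed value)
NONZERO = 8    # nonzero entries per index vector (assumed value)


def index_vector(symbol, dim=DIM, nonzero=NONZERO):
    """Sparse random vector for a symbol, stored as {position: +1 or -1}."""
    rng = random.Random(symbol)  # seeding on the symbol makes it repeatable
    positions = rng.sample(range(dim), nonzero)
    return {pos: rng.choice((-1, 1)) for pos in positions}


def context_vectors(tokens, window=2):
    """Sum the index vectors of each token's neighbours within the window."""
    contexts = defaultdict(lambda: defaultdict(int))
    for i, token in enumerate(tokens):
        start, stop = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j == i:
                continue
            for pos, sign in index_vector(tokens[j]).items():
                contexts[token][pos] += sign
    return contexts


def cosine(a, b):
    """Cosine similarity between two sparse vectors held as dicts."""
    dot = sum(value * b.get(pos, 0) for pos, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy input, invented for illustration only.
tokens = ("the cat drinks milk the dog drinks water "
          "the cat eats fish the dog eats meat").split()
vectors = context_vectors(tokens)
print(cosine(vectors["cat"], vectors["dog"]))   # words that keep similar company
print(cosine(vectors["cat"], vectors["milk"]))  # words that keep different company
```

Because "cat" and "dog" appear among nearly the same neighbours, their summed vectors point in nearly the same direction, while "cat" and "milk" share less company and score lower. No training epochs are involved; the only work is addition.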

Is there anything new to Hawkins’ effort? For certain types of time-series prediction, the approach parallels artificial neural network designs. Those designs rely on shifting, multi-epoch training regimens that, in effect, build the high-dimensional distances between co-occurring events by gradually moving time-correlated data together and uncorrelated data apart; the sparse approach makes an end-run around all of that computational complexity. But then there is Random Indexing, which I’ve previously discussed, here. If one restricts Random Indexing to operating on temporal patterns, or on spatial patterns, then the results start to look like Numenta’s offering.

While there is a bit of opportunism in Hawkins latching onto Big Data to promote an application of methods he has been working on for years, there are very real opportunities for trying to mine leading indicators to help with everything from ecommerce to research and development. Many flowers will bloom, grok, die, and be reborn.…

Bats and Belfries

Thomas Nagel proposes a radical form of skepticism in his new book, Mind and Cosmos, continuing a trajectory through subjective experience and moral realism that began with bats zigging and zagging among the homunculi of dualism reimagined in the form of qualia. The skepticism involves disputing materialistic explanations and proposing, instead, that teleological ones of an unspecified form will likely apply, for how else could his subtitle that paints the “Neo-Darwinian Conception of Nature” as likely false hold true?

Nagel is searching for a non-religious explanation, of course, because just animating nature through fiat is hardly an explanation at all; any sort of powerful, non-human entelechy could be gaming us and the universe in a non-coherent fashion. But what parameters might support his argument? Since he apparently requires a “significant likelihood” argument to hold sway in support of the origins of life, for instance, we might imagine what kind of reasoning could yield a highly likely path from inanimate matter to goal-directed behavior. The parameters might involve the conscious coordination of the events leading towards the emergence of goal-directed life, thus presupposing a consciousness that is not our own. We are back then to our non-human entelechy looming like an alien or like a strange creator deity (which is not desirable to Nagel). We might also consider the possibility that there are properties of the universe itself that result in self-organization and that we either don’t yet know or are only beginning to understand. Elliott Sober’s critique suggests that the Second Law of Thermodynamics results in what I might call “patterned” behavior while not becoming “goal-directed” per se.…

Pressing Snobs into Hell

Paul Vitanyi has been a deep advocate for Kolmogorov complexity for many years. His book with Ming Li, An Introduction to Kolmogorov Complexity and Its Applications, remains on my bookshelf (and was a bit of an investment in grad school).

I came across a rather interesting paper by Vitanyi with Rudi Cilibrasi called “Clustering by Compression” that illustrates, perhaps more easily and clearly than almost any other recent work, the tight connections between meaning, repetition, and informational structure. Rather than describing the paper, however, I wanted to conduct an experiment that demonstrates their results. To do this, I asked the question: are the writings of Dante more similar to other writings of Dante than to those of Thackeray? And is the same true of Thackeray relative to Dante?
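
For readers who want to try the question themselves, here is a minimal sketch of the normalized compression distance (NCD) that “Clustering by Compression” is built around: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The use of zlib as the compressor, the function names, and the stand-in byte strings are my assumptions for illustration; they are not the texts or the tooling of the actual experiment.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (zlib is an assumed choice)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Stand-in inputs, not real corpora: a repetitive phrase and pseudo-random bytes.
repetitive = b"the same phrase over and over " * 20
noise = bytes(random.Random(0).randrange(256) for _ in range(800))

print(ncd(repetitive, repetitive))  # input compared with itself
print(ncd(repetitive, noise))       # input compared with unrelated bytes
```

To compare authors, one would read each work as bytes and call `ncd` on every pair. One caveat with this sketch: zlib only looks back 32 KB, so for book-length texts a compressor with a larger window, such as Python's `bz2` or `lzma` modules, is likely a better fit.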

Now, we could pursue these questions at many different levels. We might ask scholars, well-versed in the works of each, to compare and contrast the two authors. They might invoke cultural factors, the memes of their respective eras, and their writing styles. Ultimately, though, the scholars would have to get down to some textual analysis, looking at the words on the page. And in so doing, they would draw distinctions by lifting features of the text, comparing and contrasting grammatical choices, word choices, and other basic elements of the prose and poetry on the page. We might very well be able to take parts of the knowledge of those experts and distill it into some kind of a logical procedure or algorithm that would parse the texts and draw distinctions based on the distributions of words and other structural cues. If asked, we might say that a similar method might work for the so-called language of life, DNA, but that it would require a different kind of background knowledge to build the analysis, much less create an algorithm to perform the same task.…

Fish eating fish eating fish

Decompressing in NorCal following a vibrant Hadoop World. More press mentions:

· Big Data, Big News: 10 Things To See At Hadoop World, CRN, October 23, 2012 (Circulation 53,397)

· Quest Software Announces Hadoop-Centric Software Analytics, CloudNewsDaily, October 23, 2012 - coverage of Hadoop product announcements.

· Quest Launches New Analytics Software for Hadoop, SiliconANGLE, October 23, 2012 - coverage of Hadoop product.

· Continuing its M&A software push, Dell moves into ‘big data’ analytics via Kitenga buy, 451 Research

· Cisco Updates Schedule to Automate Hadoop Big Data Analysis Systems, Eweek, October 24, 2012 - mention of Kitenga product announcement at Hadoop World. (Circulation 196,157)

· Quest Launches New Analytics Software for Hadoop, DABBC, October 24, 2012

And what about fish? Dell == Big Fish, Quest == Medium Fish, Kitenga == Happy Minnow.…

Dell Acquires Kitenga

Dell Inc.: Quest Software Expands Its Big Data Solution with New Hadoop-Centric Software Capabilities for Business Analytics

10/23/2012 | 08:05am US/Eastern

  • Complete solution includes application development, data replication, and data analysis

Hadoop World 2012 - Quest Software, Inc. (now part of Dell) announced three significant product releases today aimed at helping customers more quickly adopt Hadoop and exploit their Big Data. When used together, the three products offer a complete solution that addresses the greatest challenge with Hadoop: the shortage of technical and analytical skills needed to gain meaningful business insight from massive volumes of captured data. Quest builds on its long history in data and database management to open the world of Big Data to more than just the data scientist.

News Facts:

  • Kitenga Analytics: Based on the recent acquisition of Kitenga, Quest Software now enables customers to analyze structured, semi-structured and unstructured data stored in Hadoop. Available immediately, Kitenga Analytics delivers sophisticated capabilities, including text search, machine learning, and advanced visualizations, all from an easy-to-use interface that does not require understanding of complex programming or the Hadoop stack itself. With Kitenga Analytics and the Quest Toad® Business Intelligence Suite, an organization has a complete self-service analysis environment that empowers business and systems analysts across a variety of backgrounds and job roles.
More:

http://www.4-traders.com/DELL-INC-4867/news/Dell-Inc-Quest-Software-Expands-Its-Big-Data-Solution-with-New-Hadoop-Centric-Software-Capabiliti-15415359/

Signals and Noise: Chapter 15 (Synaesthesia)

The drift from daylight into twilight held an anxiety for Zach. There was a liquescent feeling to the air that was a result of the luminous ocean, the cars, and the windows of the coastal homes. The morning was much bolder in its transition—less lackadaisical—because the coastal range blocked the light into a striated glow until finally rolling over town in full heat, bearing down on the fogbank that stretched out to the south like twirling cotton candy. He woke up scared in a way that he rarely ever did. There had been days when he awoke in a full flush, bounding out to the living room to peer out through the blinds, marveling that the FBI had not yet arrived, but there had always been a mischievous edge to his fears. If he had been arrested, taken in, interrogated, it was all part of the stripes associated with his own actions. This time was different for Zach. He was scared that there was something else going on that he did not understand, and he was not at all used to not understanding or, at least, thinking he understood.

The online universe had not changed and PoorGore was not back in The Spinner’s miniverse. He checked in on the Idaho papers, narrowing to the southwest corner of the state, watching for anomalies. Pollution, grazing rights, indigenous casinos and their impacts, car dealerships going under, property taxes: it was all normal for the time being except that PoorGore had vanished and nothing significant had happened. Zach’s mental math suggested he could be anywhere in the United States given the elapsed time since PoorGore’s last post. He peered at FC’s house from space again, but the satellite imagery had not changed.…

Intelligence versus Motivation

Nick Bostrom adds to the dialog on desire, intelligence, and intentionality with his recent paper, The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents. The argument is largely a deconstruction of the general assumption that there is somehow an inexorable linkage between intelligence and moral goodness. Indeed, he even proposes that intelligence and motivation are essentially orthogonal (“The Orthogonality Thesis”) but that there may be a particular subset of possible trajectories towards any goal that are common (self-preservation, etc.). The latter is scoped by his “instrumental convergence thesis,” where there might be convergences towards central tenets that look an awful lot like the vagaries of human moral sentiments. But they remain vagaries and should not be taken to mean that advanced artificial agents will act in a predictable manner.…