Ambiguously Slobbering Dogs

I was initially dismissive of this note from Google Research on improving machine translation via Deep Learning Networks by adding a sentence-level network. My goodness, they’ve rediscovered anaphora and co-reference resolution! The next thing they will try is some kind of network-based slot-filler ontology to carry gender metadata. But their goal was to add a framework to their existing recurrent neural network architecture that would support a weak, sentence-level resolution of translational ambiguities while still allowing the TPU/GPU accelerators they have created to function efficiently. It’s a hack, but one that potentially solves yet another corner of the translation problem and might yield a few more percentage points of improvement in translation quality.

But consider the following sentences:

The dog had the ball. It was covered with slobber.

The dog had the ball. It was thinking about lunch while it played.
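As a purely illustrative sketch (the lexicon and scores below are invented for this post, not taken from Google’s system), one crude way to automate the choice is to score each candidate antecedent against the predicate it has to satisfy:

```python
# Toy pronoun resolution by semantic compatibility (illustrative only).
# The "lexicon" of predicate/antecedent affinities is invented for this sketch.
AFFINITY = {
    ("covered with slobber", "ball"): 0.9,   # balls get slobbered on
    ("covered with slobber", "dog"): 0.4,    # possible, but less likely intended
    ("thinking about lunch", "dog"): 0.9,    # dogs think about lunch
    ("thinking about lunch", "ball"): 0.05,  # balls rarely think
}

def resolve(predicate, candidates):
    """Return the candidate antecedent with the highest affinity score."""
    return max(candidates, key=lambda c: AFFINITY.get((predicate, c), 0.0))

print(resolve("covered with slobber", ["dog", "ball"]))  # -> ball
print(resolve("thinking about lunch", ["dog", "ball"]))  # -> dog
```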

In these cases, the anaphora gets resolved by semantics, and the resolution seems largely an automatic and subconscious process to us as native speakers. If we had to translate these into a second language, however, we could articulate specific reasons for correctly assigning the “It” to the ball in the first pair. Well, it might be possible for the dog to be covered with slobber, but we would guess the writer would intentionally avoid that ambiguity. The second pair could conceivably be ambiguous if, in the broader context, the ball were some intelligent entity controlling the dog. Still, when our guesses are limited to the sentence pairs in isolation, we assign the obvious interpretations. Moreover, we can resolve giant, honking passage-level ambiguities with ease, even when an author shows off by not resolving the co-referents until obscenely late in the text.… Read the rest

Zebras with Machine Guns

I was just rereading some of the literature on Plantinga’s Evolutionary Argument Against Naturalism (EAAN) as a distraction from trying to write too much on ¡Reconquista!, since it looks like I am on a much faster trajectory to finishing the book than I had thought. EAAN is a curious little argument that some have dismissed as a resurgent example of scholastic theology. It has some newer trappings that we see in modern historical method, however, especially in the use of Bayes’ Theorem to establish the warrant of beliefs by trying to cast those warrants as probabilities.

A critical part of Plantinga’s argument hinges on the notion that evolutionary processes select for behavior, not necessarily for belief. It is therefore plausible that an individual could hold false beliefs that are nonetheless adaptive. For instance, Plantinga gives the example of a man who desires to be eaten by tigers but always feels hopeless when confronted by a given tiger because he doesn’t feel worthy of that particular tiger, so he runs away and looks for another one. This may seem like a strange conjunction of beliefs and actions that happens to result in the man surviving, but we know from modern psychology that people can form elaborate justifications for perceived events and wild metaphysics to coordinate those justifications.
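In the notation standard to the EAAN literature, with R for “our cognitive faculties are reliable,” N for naturalism, and E for “our faculties arose by evolutionary processes,” the claim that falls out of this observation is, roughly:

\[
P(R \mid N \wedge E) \ \text{is low, or at best inscrutable.}
\]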

If that is the case, for Plantinga, the evolutionary consequence is that we should not trust our reasoning faculties, because the beliefs they produce are effectively arbitrary. There are dozens of responses to this argument that dissect it along many different dimensions. I’ve previously showcased Branden Fitelson and Elliott Sober’s Plantinga’s Probability Arguments Against Evolutionary Naturalism from 1997, which I think is one of the most complete examinations of the structure of the argument.… Read the rest

The Obsessive Dreyfus-Hawking Conundrum

I’ve been obsessed lately. I was up at 5 A.M. yesterday and drove to Ruidoso to do some hiking (trails T93 to T92, if interested). The San Augustin Pass was desolate as the sun began breaking over, so I inched up into triple-digit speeds in the M6. Because that is what the machine is made for. Booming across White Sands Missile Range, I recalled watching base police work with National Park Rangers to chase oryx down the highway while early F-117s practiced touch-and-goes at Holloman in the background, and then driving my carpool truck out to the high energy laser site or desert ship to deliver documents.

I settled into Starbucks an hour and a half later and started writing on ¡Reconquista!, cranking out thousands of words before trying to track down the trailhead and starting on my hike. (I would have run the thing but wanted to go to lunch later and didn’t have access to a shower. Neither restaurant nor diner deserves an après-run moi.) And then I was on the trail, and I kept stopping to take plot and dialogue notes, revisiting little vignettes and annotating enhancements that I would later salt into the main text over lunch. And I kept rummaging through the development of characters, refining and sifting the facts of their lives through different sets of sieves until they took on both a greater valence within the story arc and, often, more comedic value.

I was obsessed and remain so. It is a joyous thing to be in this state, comparable only to working on large-scale software systems when the hours melt away and meals slip by as one cranks through problem after problem, building and modulating the subsystems until the units begin to sing together like a chorus.… Read the rest

Tweak, Memory

Artificial Neural Networks (ANNs) were, from early on in their formulation as Threshold Logic Units (TLUs) or Perceptrons, mostly focused on non-sequential decision-making tasks. With the invention of back-propagation training methods, the application to static presentations of data became somewhat fixed as a methodology. During the 90s, Support Vector Machines became all the rage, and then Random Forests and other ensemble approaches held significant mindshare. ANNs receded into the distance as a quaint, historical approach that was fairly computationally expensive and opaque when compared to the other methods.

But Deep Learning has brought the ANN back through a combination of improvements, both minor and major. The most important enhancements include pre-training the networks as auto-encoders prior to pursuing error-based training using back-propagation or Contrastive Divergence with Gibbs Sampling. The other critical enhancement derives from Schmidhuber and others’ work in the 90s on managing temporal presentations to ANNs so they can effectively process sequences of signals. This latter development is critical for processing speech, written language, grammar, changes in video state, etc. Back-propagation without some form of recurrent network structure or memory management washes out the error signal that is needed for adjusting the weights of the networks. And it should be noted that increased computing firepower from GPUs and custom chips has accelerated training performance enough that experimental cycles are within the range of the doable.
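As a minimal sketch of that memory-management idea (a generic gated cell of my own, not any particular published architecture), the gate decides how much of the previous state to carry forward, which is what keeps information, and during training the error signal, from washing out over long sequences:

```python
# A minimal gated recurrent cell (illustrative sketch, untrained random weights).
# The update gate z blends the previous state with a candidate state, so
# information -- and, during training, gradients -- can persist across steps.
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

hidden, inputs = 8, 4
Wz, Uz = rng.normal(scale=0.1, size=(hidden, inputs)), rng.normal(scale=0.1, size=(hidden, hidden))
Wh, Uh = rng.normal(scale=0.1, size=(hidden, inputs)), rng.normal(scale=0.1, size=(hidden, hidden))

def step(x, h_prev):
    z = sigmoid(Wz @ x + Uz @ h_prev)        # update gate: how much to overwrite
    h_cand = np.tanh(Wh @ x + Uh @ h_prev)   # candidate new state
    return (1 - z) * h_prev + z * h_cand     # gated blend keeps old memory alive

h = np.zeros(hidden)
for x in rng.normal(size=(20, inputs)):      # a 20-step input sequence
    h = step(x, h)
print(h.round(3))
```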

Note that these are what might be called “computer science” issues rather than “brain science” issues. Researchers draw rough analogies to some observed properties of real neuronal systems (neurons fire and connect together) but then pursue a more abstract question: how can a very simple computational model of such neural networks learn?… Read the rest

The Ethics of Knowing

In the modern American political climate, I’m constantly finding myself at sea in trying to unravel the motivations and thought processes of the Republican Party. The best summation I can arrive at involves the obvious manipulation of the electorate—but that is not terrifically new—combined with a persistent avoidance of evidence and facts.

In my day job, I research a range of topics trying to get enough of a grasp on what we do and do not know such that I can form a plan that innovates from the known facts towards the unknown. Here are a few recent investigations:

  • What is the state of thinking about the origins of logic? Logical rules fall into broad classes that range from the uncontroversial (modus tollens, propositional logic, predicate calculus) to the speculative (multivalued and fuzzy logic, or quantum logic, for instance). In most cases we assume, largely by linguistic convention, that they are true and then demonstrate their extension, despite the observation that they are tautological (see the truth-table sketch after this list). Synthetic knowledge has no similar limitation but is assumed to be girded by the logical basics.
  • What were the early Christian heresies, how did they arise, and what was their influence? Marcion of Sinope is perhaps the most interesting of these, in parallel with the Gnostics, asserting that the cruel tribal god of the Old Testament was distinct from the New Testament Father, and proclaiming perhaps (see various discussions) a docetic Jesus figure. The leading “mythicists” like Robert Price are invaluable in this analysis (ignore the first 15 minutes of nonsense). The thin braid of early Christian history and the constant humanity that arises in morphing the faith before settling down after Nicaea (well, and then after Martin Luther) reminds us that abstractions and faith have a remarkable persistence in the face of cultural change.
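A small illustration of the “tautological” point in the first question above: modus tollens can be verified mechanically by exhausting its truth table, which is precisely why it tells us nothing synthetic about the world. A sketch of my own, just for concreteness:

```python
# Check that modus tollens -- ((P -> Q) and not Q) -> not P -- is a tautology
# by brute-forcing every truth assignment.
from itertools import product

def implies(a, b):
    return (not a) or b

tautology = all(
    implies(implies(p, q) and (not q), not p)
    for p, q in product([True, False], repeat=2)
)
print(tautology)  # True: holds under every assignment, hence tautological
```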
… Read the rest

Twilight of the Artistic Mind

Kristen Stewart, of Twilight fame, co-authored a paper on using deep learning neural networks in a new movie she is directing. The basic idea is very old, but the details and scale are more recent. If you take an artificial neural network and have it autoencode the input stream through a bottleneck, you can then submit any stimulus and get back some reflection of the training in the output. The output can be quite surreal, too, because the effect of bottlenecking, combined with other optimizations, results in an exaggeration of the features that define the input data set. If the input is images, the output will contain echoes of those images.
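As a sketch of that bottlenecking idea (a toy linear autoencoder I am inventing for illustration, not the pipeline actually used for the film), reconstructions are forced through a narrow code, so outputs echo whatever structure dominates the training data:

```python
# A toy linear autoencoder with a 2-unit bottleneck, trained by gradient
# descent on random data. Anything pushed through the learned bottleneck
# comes back tinged by the training set -- the "echo" effect described above.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))              # toy "images": 200 samples, 16 features
X -= X.mean(axis=0)                         # center the data

d, k, lr = X.shape[1], 2, 1e-2              # input dim, bottleneck dim, learning rate
W_enc = rng.normal(scale=0.1, size=(d, k))  # encoder weights
W_dec = rng.normal(scale=0.1, size=(k, d))  # decoder weights

for step in range(2000):
    Z = X @ W_enc                           # encode through the bottleneck
    X_hat = Z @ W_dec                       # decode back to input space
    err = X_hat - X                         # reconstruction error
    # Mean-squared-error gradients for both weight matrices
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# Any new stimulus is filtered through the learned bottleneck
stimulus = rng.normal(size=(1, d))
echo = (stimulus @ W_enc) @ W_dec
print(echo.round(2))
```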

For Stewart’s effort, the goal was to transfer her highly stylized concept art into the movie’s scenes. So they trained the network on her concept image and then submitted frames from the film to the network. Not surprisingly, the result reflected aspects of both the original stylized image and the input frames.

There has been a long meditation on the unique status of art and music as a human phenomenon since the beginning of the modern era. The efforts at actively deconstructing the expectations of art play against a background of conceptual genius or divine inspiration. The abstract expressionists and the aleatoric composers show this as a radical 20th Century urge to re-imagine what art might be when freed from the strictures of formal ideas about subject, method, and content.

Is there any significance to the current paper? Not a great deal. The bottom line was that there was a great deal of tweaking to achieve a result that was subjectively pleasing and fit with the production goals of the film.… Read the rest

Apprendre à traduire

Google Translate has always been a useful tool for awkward gists of short texts. The method used was based on building a phrase-based statistical translation model. To do this, you gather up “parallel” texts that are existing human translations. You then “align” them by trying to find the most likely corresponding phrases in each sentence or set of sentences. Often, between languages, more or fewer sentences will be used to express the same ideas. Once you have that collection of phrasal translation candidates, you can guess the most likely translation of a new sentence by looking up the sequence of likely phrase groups that correspond to that sentence. IBM was the progenitor of this approach in the late 1980s.
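A toy sketch of the phrase-table idea (the entries and probabilities here are invented): decoding amounts to looking up the most probable target phrase for each source phrase and stitching the results together. Real systems also score reordering and target-language fluency, which this deliberately ignores.

```python
# Toy phrase-based translation (illustrative only).
PHRASE_TABLE = {
    "the dog": [("le chien", 0.8), ("ce chien", 0.2)],
    "had the ball": [("avait la balle", 0.7), ("tenait la balle", 0.3)],
}

def translate(sentence):
    output = []
    for phrase in sentence:                    # assume a pre-segmented sentence
        candidates = PHRASE_TABLE.get(phrase, [(phrase, 1.0)])
        best, _ = max(candidates, key=lambda c: c[1])
        output.append(best)
    return " ".join(output)

print(translate(["the dog", "had the ball"]))  # -> le chien avait la balle
```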

It’s simple and elegant, but it was always criticized for telling us very little about language. Other methods that use techniques like interlingual transfer and parsers showed a more linguist-friendly face. In these methods, the source language is parsed into a parse tree, and then that parse tree is converted into a generic representation of the meaning of the sentence. Next, a generator uses that representation to create a surface-form rendering in the target language. The interlingua is supposed to be something like the deep meaning of linguistic theories, though the computer science versions of it tended to look a lot like ontological representations with fixed meanings. Flexibility was never the strong suit of these approaches, but their flaws were much deeper than just that.
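A toy rendering of that pipeline (every representation and lexical entry here is hand-built by me for illustration, which is exactly the kind of brittle, fixed-meaning structure at issue):

```python
# Toy interlingua pipeline: parse -> language-neutral representation -> generate.
def parse_english(sentence):
    subj, verb, obj = sentence.split()           # only handles "X verb Y"
    return {"agent": subj, "action": verb, "patient": obj}

LEXICON_FR = {"dog": "chien", "has": "a", "ball": "balle"}

def generate_french(meaning):
    return "le {agent} {action} la {patient}".format(
        agent=LEXICON_FR[meaning["agent"]],
        action=LEXICON_FR[meaning["action"]],
        patient=LEXICON_FR[meaning["patient"]],
    )

print(generate_french(parse_english("dog has ball")))  # -> le chien a la balle
```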

For one, nobody was able to build a robust parser for any particular language. Next, the ontology was never vast enough to accommodate the rich productivity of real human language. Generators, being the inverse of the parser, remained only toy projects in the computational linguistics community.… Read the rest

Boredom and Being a Decider

Seth Lloyd and I have rarely converged (read: absolutely never) on a realization, but his remarkable 2013 paper on free will and halting problems does, in fact, converge on a paper I wrote around 1986 for an undergraduate Philosophy of Language course. I was, at the time, very taken by Gödel, Escher, Bach: An Eternal Golden Braid, Douglas Hofstadter’s poetic excursion around the topics of recursion, vertical structure in ricercars, and various other matters that stormed about in his book. For me, when combined with other musings on halting problems, it led to the conclusion that the halting problem could be probabilistically solved by an observer who decides when the recursion is too repetitive or too deep. In effect, it prescribes an overlay algorithm that guesses at the odds of another algorithm halting when subjected to a time or resource constraint. Thus we have a boredom algorithm.
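A sketch of that boredom algorithm (my own reconstruction of the undergraduate idea, not Lloyd’s formulation): run the target procedure under a step budget and declare it “probably non-halting” once patience runs out. The guess can be wrong, which is the probabilistic concession to the halting problem.

```python
# A "boredom" overlay: guess whether a procedure halts by running it under a
# step budget and giving up when bored.
def bored_judge(step_fn, state, patience=10_000):
    """step_fn advances state by one step and returns (done, new_state)."""
    for _ in range(patience):
        done, state = step_fn(state)
        if done:
            return "halts"
    return "probably does not halt"

# Example: a countdown halts; a loop that never changes its state does not.
print(bored_judge(lambda n: (n <= 0, n - 1), 5))  # -> halts
print(bored_judge(lambda n: (False, n), 5))       # -> probably does not halt
```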

I thought this was rather brilliant at the time, and I ended up having a one-on-one with my prof, who scoffed at GEB as a “serious” philosophical work. I had thought it was all psychedelically transcendent, and I had no deep understanding of more serious philosophical work beyond the papers by Kripke, Quine, and Davidson that we had been tasked to read. So I plead undergraduateness. Nevertheless, he had invited me to that one-on-one, and we clashed over the concept of teleology and directedness in evolutionary theory. How we got there from the original decision trees of halting or non-halting algorithms I don’t recall.

But now we have an argument that essentially recapitulates that original form, though with the help of the Hartmanis-Stearns theorem to support it. Whatever the algorithm that runs in our heads, it needs to simulate possible outcomes and try to determine what the best course of action might be (or the worst course, or just some preference).… Read the rest

A Big Data Jeremiad and the Moral Health of America

The poll averages were wrong. The past-performance-weighted, hyper-parameterized, stratified-sampled, Monte Carlo-ized collaborative predictions fell as critically short in the general election as they had in the Republican primary. There will be much soul searching to establish why that might have been; from ground-game engagement to voter turnout, from pollster bias to sampling defects, the hit list will continue to grow.
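For concreteness, a stripped-down sketch of the kind of machinery being second-guessed here (weights, error model, and every number below are invented): average the polls, then Monte Carlo the residual uncertainty to get a win probability.

```python
# Toy poll aggregation: a weighted average of poll margins plus Monte Carlo
# simulation of polling error to turn a point estimate into a win probability.
import numpy as np

rng = np.random.default_rng(7)
margins = np.array([2.0, 4.0, 1.0, 3.0, -1.0])   # candidate A's lead, in points
weights = np.array([1.0, 0.5, 2.0, 1.0, 1.5])    # e.g., recency or past accuracy

mean_margin = np.average(margins, weights=weights)
sims = rng.normal(loc=mean_margin, scale=3.0, size=100_000)  # assumed 3-pt error
print(f"poll average: {mean_margin:+.1f}, P(A wins) = {(sims > 0).mean():.2f}")
```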

Things were less predictable than they seemed. During the 2008 and 2012 elections, the losing party’s proxies held that the polls were inherently flawed, though they were ultimately predictive. Now, in 2016, they were inherently flawed and not at all predictive.

But what the polls showed was instructive even if their numbers were not quite right. Specifically, there was remarkable turnout for Trump among white, less-educated voters who long for radical change to their economic lives. The Democratic candidate was less clearly engaging.

Another difference emerged, however. Despite efforts to paint Hillary Clinton as corrupt or a liar, objective fact checkers concluded that she was, in fact, one of the most honest candidates in recent history, and that Donald Trump was one of the worst, approximated only by Michele Bachmann in utter mendacity. We can couple that with his race-baiting, misogyny, hostility, divorces, anti-immigrant scapegoating, and other childish antics. Yet these moral failures did not prevent his supporters from voting for him in large numbers.

But his moral failures may be precisely why his supporters found him appealing. Evangelicals decided for him because Clinton was a threat to the prospect of overturning Roe v. Wade, while he was an unknown who had said a few contradictory things in opposition. His other moral issues were less important—even forgivable. In reality, though, this particular divide is an exemplar of a broader division in the moral fabric of America.… Read the rest

Startup Next

I’m thrilled to announce my new startup, Like Human. The company is focused on making significant new advances to the state of the art in cognitive computing and artificial intelligence. We will remain a bit stealthy for another six months or so and then will open up shop for early adopters.

I’m also pleased to share with you Like Human’s logo that goes by the name Logo McLogoface, or LM for short. LM combines imagery from nuclear warning signs, Robby the Robot from Forbidden Planet, and Leonardo da Vinci’s Vitruvian Man. I think you will agree about Mr. McLogoface’s agreeability:

[Like Human logo image]

You can follow developments at @likehumancom on Twitter, and I will make a few announcements here as well.… Read the rest