Machine Learning and the Coming Robot Apocalypse

Slides from a talk I gave today on current advances in machine learning are available in PDF, below. The agenda is pretty straightforward: starting with some theory about overfitting based on algorithmic information theory, we proceed through a taxonomy of ML types (not exhaustive), then dip into ensemble learning and deep learning approaches. We then present an analysis of the difficulty and types of performance we get from various algorithms and problems. We end with a discussion of whether we should be frightened about the progress we see around us.

Note: click on the gray square if you don’t see the embedded PDF; browsers vary. … Read the rest

Informational Chaff and Metaphors

I got word last night that our scholarship has drawn over 1400 applications, which definitely surprised me. I had worried that the regional restriction might be too limiting, but Agricultural Sciences were added in as part of STEM, so that probably enlarged the pool.

Dan Dennett of Tufts and Deb Roy at MIT draw parallels between informational transparency in our modern world and biological mechanism in Scientific American (March 2015, 312:3). Their article, Our Transparent Future (related video here; you have to subscribe to read the full article), starts with Andrew Parker’s theory that the Cambrian Explosion may have been tied to the availability of light as cloud cover lifted and seas became transparent. An evolutionary arms race began for the development of sensors that could warn against predators, and predators that could acquire more prey.

They continue drawing parallels to biological processes, including squid ink and how a similar notion, chaff, was used to mask radar signatures as aircraft became weapons of war. The explanatory mouthful of the Multiple Independently Targetable Reentry Vehicle (MIRV), with dummy warheads to counter anti-ballistic missiles, was likewise a deceptive way of reducing the risk of interception. So Dennett and Roy “predict the introduction of chaff made of nothing but megabytes of misinformation,” designed to deceive search engines about the nature of real info.

This is a curious idea. Search engine optimization (SEO) is a whole industry that combines consulting with tricks and tools to raise the position of vendors in the Google rankings. Being on the first page of listings can be make-or-break for retail vendors, and they pay to make that happen. The strategies center on establishing links to the vendor from individuals and other pages in order to game the PageRank algorithm.… Read the rest
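To make the link-gaming idea concrete, here is a minimal sketch of a PageRank-style power iteration in Python. It is a textbook simplification, not Google's production ranking; the page names, the five "farm" pages, and the damping value of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; scores sum to roughly 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: "vendor" and "blog" are linked identically.
web = {
    "news": ["blog", "vendor"],
    "blog": ["news"],
    "vendor": ["news"],
}
before = pagerank(web)

# Hypothetical link farm: five throwaway pages that only link to the vendor.
gamed = dict(web)
gamed.update({f"farm{i}": ["vendor"] for i in range(5)})
after = pagerank(gamed)

print("vendor vs blog before:", before["vendor"], before["blog"])
print("vendor vs blog after: ", after["vendor"], after["blog"])
```

In this toy graph the vendor and the blog start out tied, and the farm pages push the vendor ahead of the blog even though the farm pages themselves are worthless. That is the sense in which manufactured links act as a kind of chaff.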

Inequality and Big Data Revolutions

I had some interesting new talking points in my Rock Stars of Big Data talk this week. On the same day, MIT Technology Review published Technology and Inequality by David Rotman, which surveys the link between a growing wealth divide and technological change. Part of my motivating argument for Big Data is that intelligent systems are likely the next industrial revolution, an idea I take from Paul Krugman of Nobel Prize and New York Times fame. Krugman builds on Robert Gordon’s analysis of past industrial revolutions, which reached some dire conclusions about slowing economic growth in America. Intelligent systems will have an enormous impact on everyday life and will disrupt everyone from low-wage workers through to knowledge workers. And how does Big Data lead to that disruption?

Krugman’s optimism was built on the presumption that the brittleness of intelligent systems so far can be overcome by more and more data. There are some examples where we are seeing incremental improvements due to data volumes. For instance, having larger sample corpora to use for modeling spoken language enhances automatic speech recognition. Google Translate builds on work that I had the privilege to be involved with in the 1990s that used “parallel texts” (essentially line-by-line translations) to build automatic translation systems based on phrasal lookup. The more examples of how things are translated, the better the system gets. But what else improves with Big Data? Maybe instrumenting many cars and crowdsourcing driving behaviors through city streets would provide the best data-driven approach to self-driving cars. Maybe instrumenting individuals will help us overcome some of the things we do effortlessly that are strangely difficult to automate, like folding towels and understanding complex visual scenes.
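As a toy illustration of why more parallel text helps a lookup-based translator, here is a minimal Python sketch. It treats each aligned line as a single phrase and picks the most frequently seen translation, which is a deliberate oversimplification of real phrase-based systems (they align sub-phrases and score candidates statistically). The function names and the tiny English/Spanish corpus are assumptions made up for the example.

```python
from collections import Counter, defaultdict


def build_phrase_table(parallel_lines):
    """Count how often each source line was seen with each translation."""
    table = defaultdict(Counter)
    for source, target in parallel_lines:
        table[source.lower()][target] += 1
    return table


def translate(table, phrase):
    """Return the most frequently seen translation, or None if unseen."""
    candidates = table.get(phrase.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical small corpus of aligned lines.
small_corpus = [
    ("thank you", "gracias"),
    ("good night", "buena noche"),
]

# Hypothetical larger corpus: more coverage and more evidence per phrase.
large_corpus = small_corpus + [
    ("good night", "buenas noches"),
    ("good night", "buenas noches"),
    ("see you soon", "hasta pronto"),
]

small_table = build_phrase_table(small_corpus)
large_table = build_phrase_table(large_corpus)

for phrase in ("good night", "see you soon"):
    print(phrase, "->", translate(small_table, phrase), "|", translate(large_table, phrase))
```

With the small corpus the lookup has no entry for one phrase and a poorly attested rendering for the other; the larger corpus both fills the gap and lets the more common translation win.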

But regardless of the methods, the consequences need to be considered.… Read the rest

Hits and MITS

I just came across the following scan that describes how an MITS Altair 8800B became my first personal computer, journeying from Albuquerque to Las Cruces, New Mexico and getting an EEPROM burner attached to a ribbon cable snaking out through the enameled steel case. The speech synthesizer predated stored digital samples, by the way, so it instead emulated phonemic mixtures generated by digital waveform filtering.

And that young chap on the left would later become my boss, three or four parts removed, at Microsoft. I still have that Altair, too, safely stored away.

MITS Altair Convention

Original PDF Scan:

MITS Computer Convention

 

 … Read the rest