Architects, Farmers, and Patterns

The distinction between writing code and writing prose is not as great as some might imagine. I recently read an article comparing novelists’ metaphors for writing. The distinctions included the “architect,” who meticulously plans out the structure of the novel: plots, characters, settings, chapter structure, everything is diagrammed and refined before the writing begins. All that remains is word choice, dialogue, and the gritty details of putting it all on the page. Compare this with the “farmer” approach, where a seed is planted in the form of a key idea or plot development. The writer begins with that seed and nurtures it in a continuous process of development. When the tree grows lopsided, there is pruning. When a branch withers, there is watering and attention. The balanced whole builds organically, and the architecture is an emergent property.

Coding is similar. We generally know the architecture in advance, though there are exceptions in greenfield projects. Full-stack development involves decoupled database back ends, front-end load balancers and servers, and middleware of some stripe. Machine learning involves data acquisition, cleaning, training, and evaluation. User experience components rely on “patterns,” or mini-architectures, like Model-View-Controller, and similar ideas pop up in the depths of the model and controller: “factory” patterns that produce objects, flyweights, adapters, iterators, and so forth. In the modern world of agile methodologies, the day-to-day development of code is driven by “stories,” short descriptions of the goals and outcomes of the coding, harking back to the analogy with prose development. The patterns are little different from choosing dialogue or epistolary approaches to convey parts of a tale.
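To make the “factory” idea concrete, here is a minimal sketch in Python. The Button classes and the button_factory function are hypothetical names invented purely for illustration, not drawn from any particular framework; the point is only that callers ask for an object by description and never touch the concrete class.

```python
class Button:
    """Abstract product: callers depend on this interface only."""
    def render(self) -> str:
        raise NotImplementedError


class HtmlButton(Button):
    def render(self) -> str:
        return "<button>Click</button>"


class TerminalButton(Button):
    def render(self) -> str:
        return "[ Click ]"


def button_factory(ui_kind: str) -> Button:
    """Produce the right Button for the requested UI, hiding the
    concrete class from the caller -- the essence of a factory."""
    factories = {"html": HtmlButton, "terminal": TerminalButton}
    try:
        return factories[ui_kind]()
    except KeyError:
        raise ValueError(f"unknown UI kind: {ui_kind}")


if __name__ == "__main__":
    for kind in ("html", "terminal"):
        print(button_factory(kind).render())
```

The payoff is the same as in prose: the story (“render a button”) stays fixed while the telling (HTML or terminal) can vary without the rest of the code noticing.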

I do all of the above when it comes to writing code or novels.…

One Shot, Few Shot, Radical Shot

Exunoplura is back up after a sad excursion through the challenges of hosting providers. To be blunt, they mostly suck. Between systems that just don’t work right (SSL certificate provisioning, in this case) and bad-to-counterproductive support experiences, it’s enough to make one want to host the site oneself. Hosting, though, is mostly, as they say of war, long boring periods punctuated by moments of terror as things go frustratingly sideways. In any case, we are back up again after two hosting-provider side trips!

Honestly, I’d like to see an AI agent effectively navigate these technological challenges. Where even human performance is fleeting and imperfect, the notion that an AI could learn to deal with the uncertain corners of the process strikes me as currently unthinkable. But there are some interesting recent developments worth noting and discussing on the journey towards what is called “general AI”: a framework that is as flexible as people are, rather than narrowly tied to a specific task like visually inspecting welds or answering a few questions about weather, music, and so forth.

First, there is the work by the OpenAI folks on testing massive language models against one-shot or few-shot learning problems. In these problems, the model sees only one or a handful of examples of a new task, rather than being “fine-tuned” on huge numbers of exemplars. What is a language model? It varies across approaches, but typically it is a set of weighted contexts of words of varying length, with the weights reflecting the probabilities of words appearing in those contexts across a massive collection of text corpora. For the OpenAI model, GPT-3, the total number of parameters (the weights attached to those words and contexts) is an astonishing 175 billion, trained on some 45 TB of text.…
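To give the “weighted context” idea some flesh, here is a minimal sketch in Python of a bigram model: count which word follows which one-word context in a tiny corpus, then turn the counts into probabilities. The corpus and variable names are invented for illustration, and GPT-3 itself uses a neural network with learned parameters rather than raw counts, so this only illustrates the underlying statistical intuition.

```python
from collections import Counter, defaultdict

# A toy corpus; a real model trains on terabytes of text.
corpus = "the cat sat on the mat and the cat slept".split()

# Count how often each word follows each one-word context.
counts: dict[str, Counter] = defaultdict(Counter)
for context, word in zip(corpus, corpus[1:]):
    counts[context][word] += 1

# Normalize counts into conditional probabilities P(word | context).
model = {
    context: {w: c / sum(follow.values()) for w, c in follow.items()}
    for context, follow in counts.items()
}

print(model["the"])  # e.g. {'cat': 0.666..., 'mat': 0.333...}
```

Scale the contexts from one word to long passages and the count table to billions of learned weights, and you have the flavor of what GPT-3 is doing when it predicts the next word.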