Sylvia Plath has some thoughts about pregnancy in “Metaphors”:
I’m a riddle in nine syllables,
An elephant, a ponderous house,
A melon strolling on two tendrils.
O red fruit, ivory, fine timbers!
This loaf’s big with its yeasty rising.
Money’s new-minted in this fat purse.
I’m a means, a stage, a cow in calf.
I’ve eaten a bag of green apples,
Boarded the train there’s no getting off.
It seems, at first blush, that metaphors have some creative substitutive similarity to the concept they are replacing. We can imagine Plath laboring over this child in nine lines, fitting the pieces together, cutting out syllables like dangling umbilicals, finding each part in a bulging conception, until it was finally born, alive and kicking.
OK, sorry, I’ve gone too far, fallen over a cliff, tumbled down through a ravine, dropped into the foaming sea, where I now bob, like your uncle. Stop that!
Let’s assume that much human creativity occurs through a process of metaphor or analogy making. This certainly seems to be the case in aspects of physics when dealing with difficult-to-understand new realms of micro- and macroscopic phenomena, as I’ve noted here. Some creative fields claim a similar basis for their work, with poetry being explicit about the hidden or secret meaning of poems. I will also suppose that a similar system operates in creating the networks of semantics by which we understand the meaning of words and their relationships to phenomena external to us, as well as to our own ideas. In other words, semantics are a creative puzzle for us.
What do we know about this system and how can we create abstract machines that implement aspects of it?
There is a common theme in much of machine learning that the semantics of words and phrases can be captured by looking at their co-occurrences with other words and phrases. This is certainly the manner in which language models operate, including recent massive-scale models like GPT-3, but it also underlies Latent Semantic Analysis, Random Indexing, a whole raft of probabilistic mixture models, and, especially, methods that use artificial neural networks to associate words and goals by gleaning statistically significant co-occurrence patterns from training sets and then encoding them as weights between artificial neurons.
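As a toy illustration of the co-occurrence idea (my own minimal sketch, not any particular system’s implementation; the corpus, window size, and function names are made up for the example), we can count which words appear within a small window of one another and then compare words by the cosine similarity of their count vectors:

```python
import math
from collections import defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Count how often each word appears within `window` positions of another."""
    counts = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if i != j:
                    counts[w][tokens[j]] += 1
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count dictionaries."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = [
    "the cow grazed in the field".split(),
    "the calf grazed beside the cow".split(),
    "money filled the fat purse".split(),
]
vecs = cooccurrence_vectors(corpus)
print(cosine(vecs["cow"], vecs["calf"]))   # higher: shared contexts
print(cosine(vecs["cow"], vecs["purse"]))  # lower: little shared context
```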
I’ll note that this is only one way of approaching the problem. There have also been efforts to encode the meaning of words using knowledge formalisms: cow IS-A mammal IS-A animal, etc. While this has often proved fairly brittle for many applications, it remains useful in areas of highly specialized language, like biology, where knowledge bases and ontologies are used for manual and automated document coding. An obvious and difficult aspect of this approach is finding agreement among the ontologists who work to build the system.
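For comparison, here is a minimal sketch of that formal approach (the IS-A entries are hypothetical toy data, not drawn from any real ontology): store IS-A links and walk them transitively, so cow IS-A animal follows from cow IS-A mammal and mammal IS-A animal.

```python
# Toy IS-A hierarchy (hypothetical entries, not any real ontology).
IS_A = {
    "cow": "mammal",
    "elephant": "mammal",
    "mammal": "animal",
    "melon": "fruit",
    "fruit": "plant",
}

def ancestors(term):
    """Walk IS-A links upward from a term to collect its categories."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain

def is_a(term, category):
    return category in ancestors(term)

print(ancestors("cow"))         # ['mammal', 'animal']
print(is_a("cow", "animal"))    # True
print(is_a("melon", "animal"))  # False
```

The brittleness shows up as soon as a term needs more than one parent or the curators disagree about where it belongs.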
In both approaches, similarity has something to do with the distances between words/phrases or their clusters, with a metric distance characterizing the first approach and a network span the second. But metaphorical similarity requires some kind of patching to this system. Nowhere would a “melon strolling on two tendrils” find similarity to pregnancy. That activation must be part of another system overlaying any semantics based on co-occurrence or ontological similarity.
Some psychological research is also helpful here. Most of these ideas are metaphorical themselves, borrowing the idea of evolutionary variation and retention, or of reducing a collection of options by winnowing out poorly fitting solutions. Liane Gabora at the University of British Columbia has been investigating this topic for several decades and has settled on a “honing” theory of creative problem solving. She distinguishes this honing approach from evolutionary metaphors by noting that the potential solutions all pre-exist in the mental inventory and are then thought through by being exposed to different mental trials. She does concede that the rawer notion of variation in evolutionary theory has been replaced by a much more complex series of mechanisms, specifically citing autocatalytic sets, but she is, I assume, also open to ideas like jumping genes or reservoirs of genetic material if pressed on how far to distance herself from evolution as a metaphor for the mental process.
But what might an architecture for this honing system look like if we use current approaches to machine learning to flesh it out? I am curious about this because I am experimenting with some variants of the Random Indexing methodologies. In RI, words, documents, or sequences of words (contexts) are represented by sparse vectors. There is some elegant math showing that summing up sparse vectors among shared word contexts gives a very similar result to other approaches (but is super simple by comparison). Words are known by the company they keep, and as the sparse vectors of neighboring words get added to a word’s own vector, that associative semantic context vector rotates around in high-dimensional space and distinguishes usages among the different meanings of the term.
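Here is a minimal sketch of that mechanism (the dimensionality, sparsity, window size, and names like `index_vector` and `train` are arbitrary choices for illustration, not my actual experimental code): each word gets a fixed sparse random “index” vector, and its context vector is the running sum of the index vectors of its neighbors.

```python
import numpy as np

DIM, NONZERO, WINDOW = 1000, 10, 2   # arbitrary illustrative settings
rng = np.random.default_rng(0)
index_vectors, context_vectors = {}, {}

def index_vector(word):
    """Fixed sparse ternary random vector: a few +1s and -1s, the rest zeros."""
    if word not in index_vectors:
        v = np.zeros(DIM)
        positions = rng.choice(DIM, size=NONZERO, replace=False)
        v[positions] = rng.choice([-1.0, 1.0], size=NONZERO)
        index_vectors[word] = v
    return index_vectors[word]

def train(sentences):
    """Add each neighbor's index vector into the word's context vector."""
    for tokens in sentences:
        for i, w in enumerate(tokens):
            ctx = context_vectors.setdefault(w, np.zeros(DIM))
            for j in range(max(0, i - WINDOW), min(len(tokens), i + WINDOW + 1)):
                if i != j:
                    ctx += index_vector(tokens[j])

def similarity(a, b):
    """Cosine similarity between two words' context vectors."""
    va, vb = context_vectors[a], context_vectors[b]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))
```

After training on a corpus, words that keep similar company end up with context vectors that point in similar directions, which is the distance-based similarity the honing step below works against.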
My experiments have been trying to use alternative kernels to capture the effects of word order, recursion, or sequences in data. But to get honing into the picture, we need to gather the nearest neighbors in this space and eliminate neighbors based on criteria that extend beyond their distance from the probe concept: criteria that actively work to achieve a result like text generation. So far, so good.
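To make that concrete, here is one hedged way to sketch the honing step on top of the context vectors above (the coherence predicate is my own illustrative stand-in, not Gabora’s mechanism and not a finished method): pull the nearest neighbors of a probe vector, then cull candidates with a task-level criterion rather than keeping them by distance alone.

```python
import numpy as np

def cos(a, b):
    """Cosine similarity between two dense vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def nearest_neighbors(probe, vectors, k=20):
    """Rank the words in a {word: vector} dict by similarity to the probe."""
    return sorted(vectors, key=lambda w: cos(probe, vectors[w]), reverse=True)[:k]

def hone(probe, vectors, keep, k=20):
    """Gather nearest neighbors, then winnow them with a criterion that goes
    beyond raw distance to the probe (e.g. coherence with text generated so far)."""
    return [w for w in nearest_neighbors(probe, vectors, k) if keep(w)]

# Hypothetical usage: keep only candidates whose context vectors also sit
# near a running vector for the text generated so far.
# survivors = hone(probe_vec, context_vectors,
#                  keep=lambda w: cos(context_vectors[w], generated_vec) > 0.2)
```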
Now if we can just bring this child to term.
Very interesting. I wish I could keep up with the technical sophistication, but I can’t, so please excuse my ignorance on the topic. I similarly think of language as “associative models” that are built up over time simply as a function of experience (perception).
If I hear the word “postmodernism” and it is associated with certain people, phrases, personal experiences, then that shapes the word in my head. This “associative model” is necessarily different for other people. As a result, no two people can ever _truly_ talk about the same thing because their associative models never completely overlap. However, at a certain level of abstraction, they overlap “enough” and so we are able to communicate. The degree to which they don’t overlap is a great source of miscommunication in the world. This is why philosophers fixate so ferociously on definitions and ontologies – they want to know they are in fact talking about the same “thing” (within a reasonable level of resolution).
Because associative models are constantly evolving as a function of frequency and salience, they can in fact be gamed. I call this “semantic gerrymandering,” and the advertising industry in particular excels here. Can I change the way you think of “recycling” or “police” based on certain ontological entities I yoke them to? Alternatively, this may be described as “framing” or “Russell conjugations” or the “is-ought alloy” (https://everythingstudies.com/2018/02/12/wordy-weapons-of-is-ought-alloy/).
There are certain abstract “primitives” from which we build everything else. We associate higher-level concepts (e.g. democracy, consensus) with low-level concepts (e.g. veridicality, perception, etc.) as a function of our experiences with them (e.g. what we are told, what we logically deduce, and so on). But all of these models (“metaphors”) – like sand – are constantly shifting as a function of our experiences. Some ground is fortunately more solid than other ground, and perhaps when it is not, we call that a “midlife crisis”.
Metaphorical thinking is how we make sense of the world – how our internal models of reality conform to our observations of it. It is the great human achievement that ties together artists and scientists and businesspeople and philosophers.
No disagreement with any of that, Alex. Framing has been used as a technical approach in AI, as well. I use the metaphor of “effective procedures” from cognitive science as a way of trying to ground how we might think about these semantic issues, but the problems clearly stretch through all our human activities.