How does AI write poetry? And a warning

Poetry and Artificial Intelligence are tricky to define

November 06, 2018
“We believe that poetry transmits an emotional message from one person to another which bears a distinctive token of humanity, something that only a human soul can convey and share with another human soul... [Now] new tools provide an infinite potential for researching language and probing our relationship with words, and I would recommend that everyone consider adding these tools to their poetic tool belt. I think, however, that we should be careful when using or even thinking of the future of technology. The assessment of technological evolution should take into account who gains power and who loses it.” – Eran Hadas
As most readers of this text are familiar with poetic discourse, it will focus on AI, especially systems that try to automatically generate poetry or poetic texts. From a future intelligent computer’s perspective, both “artificial” and “intelligence” are biased words in the human/computer dichotomy, hinting that humans are “natural” and “originally” intelligent, whereas computers merely imitate humans with no independence or originality. Alan Turing’s 1950 paper, Computing Machinery and Intelligence, posed the question “Can machines think?”, and from our perspective today, this is what AI is all about: machines programmed to “think”, act and make decisions the way a person does.

Writing poetry, on the other hand, seems one of the least mechanical activities a human is capable of. Most humans consider it detached from practical function, as poems have no set consequence in the material world. Poetry is located within the mind but often has no physicality; it is open to a wide range of interpretations, and in many cases it is unplanned and unexpected, an expression of the inexpressible. All of this leads us to believe that poetry transmits an emotional message from one person to another, one that carries a distinctive token of humanity, something that only a human soul can carry and share with another human soul.

Some poets have tried to minimize the gap between humans and machines (even before computers existed) by fulfilling the human passion to write like a machine. Gertrude Stein’s Three Lives (1909) and Tender Buttons (1914) followed on a study she conducted about writing like a machine, documented in the paper Normal Motor Automatism (1896). Many poets agree with William Carlos Williams: "A poem is a small (or large) machine made of words." Other poets do not refer to a machine, but offer strict rules whose output is a poem. A well-known example is Tristan Tzara’s How to Make a Dadaist Poem (1918).

Modern computers became part of our lives after the Second World War, in 1950, when the first commercial computer was sold. Within two years there were already computer-generated texts. The best-known such project is Love Letters by Christopher Strachey, later reimplemented by digital poet Nick Montfort, who re-coded an entire library of early computer text generators for his project Memory Slam.

2. Generating Poems

The mechanism behind Love Letters is straightforward. The letter consists of a template that is populated from word lists. For example, the first line has two parts: an opening adjective and a subject. The adjective is one of DARLING, DEAR, HONEY and JEWEL; the subject is one of DUCK, LOVE, MOPPET and SWEETHEART. Each time a letter is generated, the two parts are randomly selected, giving 4 × 4 = 16 possible combinations: HONEY MOPPET, DARLING DUCK and so on.
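To make the mechanism concrete, here is a minimal Python sketch of the template-plus-word-list idea. It is not Strachey’s original code (which ran on the Ferranti Mark 1); only the two word lists for the opening line are taken from the description above.

```python
import random

# Word lists for the opening line, as described above.
ADJECTIVES = ["DARLING", "DEAR", "HONEY", "JEWEL"]
SUBJECTS = ["DUCK", "LOVE", "MOPPET", "SWEETHEART"]

def opening_line():
    """Fill the two-slot template with one random word from each list."""
    return f"{random.choice(ADJECTIVES)} {random.choice(SUBJECTS)}"

if __name__ == "__main__":
    # Each run picks one of the 4 x 4 = 16 possible combinations.
    for _ in range(3):
        print(opening_line())
```

A full letter generator simply chains several such slotted templates, one per sentence, drawing from longer word lists.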

Such projects demonstrate that computers are able to generate poetry, though it is questionable how artificial and/or intelligent these poems are. To be fair, the same applies to human poets. And yet, text generators similar to this one are a valid method for creating poetry that is interesting to readers, depending on the content, rules and concepts being used.

Together with Israeli poet Merhav Yeshorun, I built The Shape of a Man, an automatic poetry generator following Maimonides, also known as the Rambam, an influential Torah scholar and Jewish philosopher. In his twelfth-century Guide for the Perplexed he combined Greek philosophy with Judaism, claiming that almost everything in existence could be expressed using a bank of about 40 words - the basic building blocks, or rather the linguistic DNA, of the world. This became a prompt for us to build an automatic “Jewish Zen” poem generator adhering to Maimonides’ linguistic DNA rules.

Each time a generator is run, it forms a new word combination, for better or worse. What happens, however, if a poet wishes to turn an automatic generator into a printed book? There are two competing approaches: The first is called cherry-picking, in which only the “better” outputs are printed, whereas the least “poetic” ones are left out of the book. The second one is called the exhaustive approach, in which the book is comprised of all possible outcomes of the generator. While the former is aimed at pleasing the readers, the latter can be seen as trying to serve the larger concept of the book.

My book Code (2014) takes the exhaustive approach, revealing all the haiku poems in the Pentateuch, or Torah, in Hebrew. These are the five books of Moses, which are the foundation of Jewish holy law and its behavioral code of conduct. The program scans the entire text, testing each word as a potential beginning of a valid 5-7-5 syllable structure. Whenever one is found, the haiku is recorded in the book, regardless of whether it makes sense or is syntactically correct (a sketch of this scanning approach appears after the examples below). The re-rendering of the original text ignores line (and book) endings, so the generated fragments often carry a different meaning than the original text. Following are the first five poems, extracted from the beginning of Genesis:

a-1
Abyss and spirit
God she is floating upon
The face of waters

a-2
Floating upon the
Face of the waters and so
Said God Let there be

a-3
Let there be a sky
Inside the waters and let
It divide waters

a-4
It that is the sky
Shall separate the waters
That are underneath

a-5
Those are the waters
That are underneath the sky
And between waters         


                                       [Tr. from the Hebrew: Keren Katz]

While the story is fragmented, it still resonates with the original, but it also articulates the difference between the original biblical voice and its modern echo.
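As promised above, here is a rough Python sketch of the scanning idea. The original program works on the Hebrew text of the Torah; here a naive English vowel-group counter stands in for real syllable counting, and the sample string is only an illustration, so this is a sketch of the approach rather than a reimplementation.

```python
import re

def count_syllables(word):
    """Very rough English syllable estimate: count vowel groups.
    (The original project counts Hebrew syllables; this is only a stand-in.)"""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def take_line(words, start, target):
    """Take words from `start` until exactly `target` syllables are reached;
    return the end index, or None if the count overshoots or runs out."""
    total, i = 0, start
    while i < len(words) and total < target:
        total += count_syllables(words[i])
        i += 1
    return i if total == target else None

def find_haikus(text):
    """Scan the text, testing each word as a potential start of a 5-7-5 haiku."""
    words = text.split()
    haikus = []
    for start in range(len(words)):
        ends, pos = [], start
        for target in (5, 7, 5):
            pos = take_line(words, pos, target)
            if pos is None:
                break
            ends.append(pos)
        if pos is not None:
            a, b, c = ends
            haikus.append((" ".join(words[start:a]),
                           " ".join(words[a:b]),
                           " ".join(words[b:c])))
    return haikus

# Example: scan an English rendering of the opening of Genesis.
sample = ("In the beginning God created the heaven and the earth "
          "and the earth was without form and void")
for haiku in find_haikus(sample):
    print("\n".join(haiku), "\n")
```

The exhaustive book simply records every haiku this scan yields, in order of appearance, with no curation.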

3. Machine Learning

Rule-based generation of texts was a prominent phenomenon in the 1960s. It lived in harmony with Noam Chomsky’s generative grammar, according to which humans have a language acquisition device not unlike a computer, so the way we learn to speak resembles a poetry generator. Joseph Weizenbaum created the first chatbot, Eliza, using similar methods: a person would type a sentence, and Eliza would answer with a corresponding sentence.
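As a toy illustration of that rule-based call-and-response (not Weizenbaum’s actual DOCTOR script, which was far richer), a few pattern rules are enough to produce an Eliza-flavored exchange:

```python
import re

# A handful of pattern -> response rules in the spirit of Eliza.
RULES = [
    (r"\bI am (.+)", "Why do you say you are {}?"),
    (r"\bI feel (.+)", "Tell me more about feeling {}."),
    (r"\bmy (\w+)\b", "Why does your {} matter to you?"),
]

def respond(sentence):
    """Return the response of the first matching rule, or a generic fallback."""
    for pattern, template in RULES:
        match = re.search(pattern, sentence, re.IGNORECASE)
        if match:
            return template.format(match.group(1))
    return "Please go on."

print(respond("I am writing a poem about my mother"))
# -> Why do you say you are writing a poem about my mother?
```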

However, the rules used by those methods are not made up by the computer. Instead, humans feed the machines with their own rules; the “intelligence” here is to some extent human. Many AI researchers have been looking for methods in which the computer would make up the rules by itself. In 2018, the most popular family of methods for achieving this goal is called machine learning: instead of being given rules, computer systems learn from data, identify patterns and make decisions with minimal human intervention.

It is important to note that machine learning systems are built with human code and utilize human-designed algorithms, so they are not rule-free. However, the rules that form these systems exist so that the computer can learn by itself; none of them deal with the domain of the problem, only with the computer’s capability to learn. In order to learn without rules, computers need data, which comes in the form of statistics or examples. One example of text generation using statistics is the automatic search suggestion given by the Google search engine. If I begin typing my name, “Eran”, Google will automatically suggest the completion “Morad”, after a famous Sacha Baron Cohen TV character (whose first name is actually spelled “Erran”). This is because, statistically, when people start a search by typing Eran, they are more likely to be referring to him than to me.

I used a statistical method called the Hidden Markov Model to enhance the capabilities of rule-based generators in my project Self-De-Perecator. This poem pays homage to French author Georges Perec, a member of the constraint-based literary group Oulipo, pioneers of computational literature who discussed machine methods of composition in 1961 (and probably before). The project uses the text of Perec’s 1978 Life: A User’s Manual and generates a lament about the loss of memories, using only words from his book. The users can edit the texts.

In order to build templates that may be filled in, the idea is to identify the part of speech of each word and then build simple, syntactically correct sentences. For instance, in the sentence “They refuse to permit us to obtain the refuse permit”, the software has to predict (or guess, or understand) that the first occurrences of refuse and permit are verbs, whereas the second ones are nouns. Since this method relies on statistics, it may make mistakes, especially when it encounters words it does not know and has not seen in the data it was trained on.
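My own project used a Hidden Markov Model tagger; a quick way to see part-of-speech prediction at work is NLTK’s off-the-shelf tagger (a perceptron model, not an HMM), shown here only to illustrate the idea. The resource names passed to the downloader may vary slightly between NLTK versions.

```python
import nltk

# One-time downloads of the tokenizer and tagger models.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentence = "They refuse to permit us to obtain the refuse permit"
tags = nltk.pos_tag(nltk.word_tokenize(sentence))
print(tags)
# Expected: the first "refuse"/"permit" tagged as verbs (VBP/VB),
# the second pair as nouns (NN), based on statistics learned from training data.
```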


4. Deep Learning

In recent years, the most common machine learning approach has been a model named Artificial Neural Networks, also branded as Deep Learning. A neural network is comprised of several layers of data units, where each unit transmits a value to some of the units in the next layer. While part of the motivation for neural networks is to simulate the structure of the human brain, this is a purely mathematical structure: when the model is trained and ready to work, it is fed a set of input numbers and produces a set of output numbers, so it is up to the users of the network to encode their problem as numbers and decode the numbers back into an answer.

In order to work, the network has to be trained. This means that examples are introduced to the model, the model processes them, and it then adjusts itself to handle them better. For example, take a model that is given an image and must decide whether it shows a dog or a cat. The basic way to train this model is called supervised training. In each step, the model is presented with an image (of a cat or a dog), and for each image there is an expected result (cat or dog). If the network is wrong, it changes its inner state in order to better fit the image next time. After seeing a large number of examples, the hope is that the network will adjust not just to handle the examples it has seen, but also to generalize and make correct predictions on examples it has not seen before.
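A minimal PyTorch sketch of that supervised loop is given below; random tensors stand in for the cat/dog images and labels, so it is purely illustrative rather than a real classifier.

```python
import torch
from torch import nn

# Toy stand-ins: 64 "images" of 3x32x32 random pixels, labels 0 = cat, 1 = dog.
images = torch.rand(64, 3 * 32 * 32)
labels = torch.randint(0, 2, (64,)).float()

# A tiny network: input pixels -> hidden layer -> probability of "dog".
model = nn.Sequential(
    nn.Linear(3 * 32 * 32, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
    nn.Sigmoid(),
)
loss_fn = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(10):
    predictions = model(images).squeeze(1)   # forward pass
    loss = loss_fn(predictions, labels)      # how wrong was the network?
    optimizer.zero_grad()
    loss.backward()                          # compute how to adjust the weights
    optimizer.step()                         # adjust the network's inner state
    print(epoch, loss.item())
```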

Recurrent Neural Networks (RNNs) are models that learn sequences from examples. Some of their popularity came after a viral 2015 blog post by Andrej Karpathy, The Unreasonable Effectiveness of Recurrent Neural Networks, in which he showed how, by feeding all the works of Shakespeare into the model, it generated new “Shakespearean” texts.

The network looks at a text as a sequence of characters and, given some opening characters, predicts the next character. Soon afterwards, poetry projects following similar methods were created, such as Sam Ballas’ PoetRNN, along with online manuals on how to build such generators. Compared to older statistical methods (a maximum likelihood language model, see example), the results I got using RNNs were not as good; however, the approach enabled many exciting projects, for example Ross Goodwin’s Automatic On The Road, in which an automatic car generates its own version of Jack Kerouac's literary road trip.
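For comparison, the “older statistical method” mentioned above can be sketched in a few lines of Python: a character-level maximum-likelihood model that counts which character follows each short context and samples the next character accordingly. This is a toy illustration, not the code behind any of the projects named here, and the corpus file name is only a placeholder for any large text file.

```python
import random
from collections import Counter, defaultdict

def train(text, order=4):
    """Count, for every `order`-character context, which character follows it."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    return counts

def generate(counts, seed, length=200):
    """Starting from `seed`, repeatedly sample the next character
    in proportion to how often it followed the current context."""
    order = len(seed)
    out = seed
    for _ in range(length):
        options = counts.get(out[-order:])
        if not options:
            break
        chars, weights = zip(*options.items())
        out += random.choices(chars, weights=weights)[0]
    return out

corpus = open("shakespeare.txt", encoding="utf-8").read()  # any large text file
model = train(corpus)
print(generate(model, seed=corpus[:4]))
```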

Another popular neural network based method is called word embedding, which basically means mapping words into vectors - mathematical entities that have size and direction in space. The first popular word embedding algorithm was Google’s Word2Vec (2013), which became popular thanks to its surprising results. The Google team, led by Tomas Mikolov, took a large collection of articles, called Google News, and used the network to map each word to a vector. Then they started playing with arithmetic. If we denote the vector a word is mapped to as V(word), they discovered that V(Italy) - V(Rome) + V(Paris) = V(France). In other words, without knowing anything about the words, just by taking into consideration their places inside many documents, the network managed to understand something about the meaning of the words, in particular analogies.
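That arithmetic is easy to try with the gensim library and the pretrained Google News vectors (a large download; the file name below is the standard one and is assumed to be present locally):

```python
from gensim.models import KeyedVectors

# Pretrained Google News vectors (~3.5 GB), the kind of model described above.
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# V(Italy) - V(Rome) + V(Paris) should land near V(France).
print(vectors.most_similar(positive=["Italy", "Paris"], negative=["Rome"], topn=3))
```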

Together with Eyal Gruss I built Word2Dream, which uses Word2Vec to step out of a given text and traverse a free-association space hallucinated by the network. The users give the network an initial text, and the network dreams of things the text reminds it of. At first it analyzes the important words inside the text, but soon it starts drifting away on the wings of artificial imagination.

Perhaps the two most promising Deep Learning models are Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs). CNNs are good at processing images, and they have great potential for poetry because they enable a path from image to text. CNNs are good at identifying objects in an image, but there are also many projects (for example, Karpathy’s) that turn images into sentences, in a way that sometimes resembles Imagist poetry. GANs are basically two networks competing with each other, playing the roles of a counterfeiter trying to trick the police and a police detective trying to distinguish real from fake; this enables a path from text to image. Using a GAN and a CNN, Eyal Gruss and I created the Electronic Curator, which turns pictures of human faces into portraits made of vegetables, and then, according to the vegetables, generates a curatorial statement.



5. Epilogue

While many of the new features and achievements of AI seem like science fiction, some of them are already available to the public. Every poet can play with sentences by running them through the Deep Learning powered Google Translate, back and forth between languages. We can also use automatic completion suggestions, and the many automated smart features that are gradually being added to online email clients and text editors. These new tools provide an infinite potential for researching language and probing our relationship with words, and I would recommend that everyone consider adding them to their poetic tool belt.

However, I think that we should be careful when using, or even thinking about the future of, technology. Technology is neither a bad thing nor a good thing; it is a scene that enables people, corporations and governments to gain power. As such, I take a cyber-skeptic approach. Where power is at hand, things that seem naive may be used by some to gain power at the expense of others. The assessment of technological evolution should take into account who gains power and who loses it. While large corporations give us free access to services such as translation, editing, social capabilities and so on, there is a price we pay as end users. We give them, for free, large amounts of data that has significant economic value. We give up our privacy and some of our independence, and most of all we do all these things feeling fascinated, and often getting addicted to them. Poetry should not turn against AI, because keeping up with this new literacy is required to remain relevant. All aspects of our culture have to engage with this arena of digital text, where language has consequence and is, in fact, the infrastructure of our lives. However, in my humble opinion, poetry has to make a stand to keep this arena on the side of us individuals, rather than of large corporations.


A November 27 event in London with Abol Froushan (in person) discusses AI poetry, with Yasuhiro Yotsumoto and Eran Hadas joining via video.
© Eran Hadas