Year
of·the
Meteor

This is Robin Sloan’s workspace for 2019. The newsletter is the best way to follow along; it goes out every Sunday.

Notes from the quest factory

Recently, I used an AI trained on fantasy novels to generate custom stories for about a thousand readers. The stories were appealingly strange, they came with maps (MAPS!), and they looked like this:

Here, I want to share some notes that might be useful to other people doing similar projects, and/or people who imagine they might.

I’ll add that I am probably going to produce another AI-adjacent offering like this, and the best (only?) way to hear about such a thing is to sign up for my newsletter, which you can do at the top or bottom of this page.

Okay—first I’ll do philosophy, then technology. You can skip ahead if you like!

“I see what you did there”

Honestly, I think the key to this project wasn’t the AI but the paper.

I’m very happy to have discovered Lob, a service that allows you to print and mail things using code. How are these things printed? From where are they mailed? I have no idea, which is mildly disconcerting, but also mildly magical. I mean, this function—

response = lob.letters.create({
  description: "Letter for #{purchase["email"]}",
  to: {
    name: purchase["ship_to_name"]
    # etc
  },
  from: {
    name: "Year of the Meteor"
    # etc
  },
  file: pdf_name,
  double_sided: true,
  mail_type: "usps_first_class"
  },
  {"Idempotency-Key" => purchase["email"]}
)

—sends a letter in the mail! For about a dollar! That’s wild!

Why bother printing and mailing these AI-generated stories, though? I could have built a quest generator on the web, accessible for free. A series of prompts; a map; a squirt of AI.

But… then what?

People might have found their way to it and laughed for a moment at what emerged. Snapped screenshots, posted them. And then, on to the next bauble! There’s no shortage. Perhaps you’ve crafted some of these baubles yourself. You might know this feeling.

Another day, another “I see what you did there.”

But, because these stories were delivered physically, I have photographs of letters in people’s front yards. In their houses. WITH THEIR DOGS.

I was attracted to AI language models in the first place because they showed me sentences that had a strange and ineffable flavor. It’s like English as a second language, except the first isn’t Spanish or Swedish but rather, I don’t know, Martian. For someone who enjoys words, who likes it when sentences are weird and/or beautiful and/or unexpected, that’s obviously appealing.

But, if that’s the appeal, then the challenge is to get people to actually READ THE SENTENCES. Not just appreciate the framing; not just nod at the technology.

Upon encountering these quests, did readers’ souls quiver? Did their eyes film with tears, blurring the page? Er, no. But some of them really did spend some time with the text. For me, that’s crucial; non-negotiable. “I see what you did there” is weak gruel. I am in this to have people read things.

Okay, enough aesthetic hand-wringing. Now for the nerdy stuff!

The skeleton

Here, I’ll outline the process I used to generate these quests.

You can see the invitation to participate archived here. I think it was probably too cryptic; I’m not sure I mind. Ultimately, it enticed about a thousand people to pay a few dollars and fill out a Google Form, specifying things like the name of their quest’s leader, the kind of artifact their questers sought, the species of creature encountered on the road—you know, quest essentials!

Even more essential to a quest, perhaps, is a map.

AHHH I love it

Using Ryan Guy’s terrific Fantasy Map Generator code, I churned out a few thousand maps, each different, but/and also very similar to the one above. (And, let’s be real… these maps are the stars of the show. You can stop reading now.)

The place names all came from a tiny neural network trained on a selection of real place names from world history. Reviewing the input file now, I see that I used lists of towns in England, Italy, France, Denmark, Japan, and ancient Rome. Neural networks can work as blenders, mixing up structures and phonemes in an appealing way. They are really, really good at names!
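The blender idea doesn't strictly require a neural network; its flavor can be sketched with something much simpler. Below is a character-level Markov chain, not the tiny RNN the project actually used, mixing phonemes from a handful of real town names. The town list and helper names are mine, purely for illustration.

```python
# A sketch of the name "blender" using a character-level Markov chain
# (a stand-in for the tiny neural network the project actually used).
import random

def build_model(names, order=2):
    """Map each `order`-character context to the characters that follow it."""
    model = {}
    for name in names:
        padded = "^" * order + name.lower() + "$"  # ^ marks start, $ marks end
        for i in range(len(padded) - order):
            context = padded[i:i + order]
            model.setdefault(context, []).append(padded[i + order])
    return model

def generate(model, order=2, max_len=20, rng=random):
    """Walk the chain from the start marker until an end marker (or cap)."""
    context = "^" * order
    out = []
    while len(out) < max_len:
        ch = rng.choice(model[context])
        if ch == "$":
            break
        out.append(ch)
        context = context[1:] + ch
    return "".join(out).capitalize()

towns = ["burford", "lavenham", "orvieto", "ribe", "obidos", "takayama"]
model = build_model(towns)
print(generate(model, rng=random.Random(7)))
```

With enough input names (and a higher `order`, or an actual neural network), the output lands in that appealing zone: plausible phonemes, novel combinations.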

Next, I downloaded the quest design form responses. Using a Ruby script, I assigned each reader a map and combined the place names on that map with their responses to produce a “story skeleton” that I could feed into the AI text generator.

I need to pause here for a bit of background. The text generator I used was GPT-2, a powerful language model developed by San Francisco’s OpenAI. GPT-2 was initially trained on many gigabytes of text from the web. I continued that training—“fine-tuning” the model—on several hundred megabytes of fantasy novels. My personal GPT-2 now very strongly believes that most sentences ought to be about shadowy keeps and road-weary rangers. (I do not disagree.)

GPT-2’s code gives you the option to provide “context.” Before you ask the model to generate text, you can feed in a sequence of characters to establish, basically, what’s going on in the story. If you do so, GPT-2 will dutifully refer back to the names, the places, and, to a degree, the situations included in that context. It doesn’t stay perfectly consistent—any human writer could do better—but this is a capability that has, until now, eluded AI language models entirely.

This notion of context was key to the quest generation process. I would alternate between getting text out of GPT-2 and feeding prompts in from the story skeleton—in effect, guiding GPT-2 along a particular path.

The Ruby code to produce one story skeleton from a single reader’s map and form looked like this:

  prompt "#{format_for_start(survey[:group])} began \
          their quest in #{city1}, a city known for", 3

  prompt "This quest to defeat the Dark Lord \
          was led by #{survey[:leader]}, who", 2

  prompt "The questers sought #{survey[:seek]}, which", 3

  prompt "They intended to travel #{survey[:travel]}, \
          but, unfortunately,", 1

  prompt "Then, on the road toward #{city2}, \
          they encountered #{survey[:encounter]}. It", 3

  prompt "The questers crossed into the \
          country called #{country1}, known for", 2

  prompt "There, in #{country1}, the Dark Lord found them. He", 2

  prompt "The Dark Lord cruelly", 2

  prompt "Did their quest fail because the questers \
          desired only #{survey[:desire]}? Or was it", 2

  prompt "#{survey[:leader]}'s last thoughts were", 1

  prompt "The world was quiet.", 1, ""

If there’s any part of my process that’s even a little bit novel or interesting, this is it, so I want to pause and point out a few things.

First: I can specify how many sentences I want with the number that follows the prompt text. This is a crucial artistic control! GPT-2 generates a sequence of fixed length; you can’t ask it for “just two sentences, please.” But you can take the fixed-length sequence, break it into sentences yourself (simply splitting it on periods works great), and then only use as many as you want.
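That truncation step fits in a few lines of Python. This is the split-on-periods approach described above; the helper name is my own.

```python
# Take a fixed-length blast of generated text and keep only the
# first n sentences, splitting naively on periods.
def first_sentences(text, n):
    # Split on periods, drop empty fragments, re-attach the periods.
    parts = [p.strip() for p in text.split(".") if p.strip()]
    return " ".join(p + "." for p in parts[:n])

blast = "The keep was dark. The ranger waited. Rain fell. It fell for days."
print(first_sentences(blast, 2))
# -> The keep was dark. The ranger waited.
```

(A real pipeline might want smarter sentence segmentation, to survive abbreviations and ellipses, but for fantasy prose the naive split mostly works.)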

Second: notice the words I use at the ends of the prompts. I am hardly an AI whisperer, but I do think I’ve learned a bit about nudging a language model towards interestingness. These systems are, in general, very content to just… hang out. They love to describe a scene, then re-describe it, and describe it again, rather than advance the plot with a twist or a turn. (In their defense: they don’t know what a plot is, or a twist, or a turn.) Notice, in the fourth prompt above, the “but, unfortunately,” which produced reliably fun results. You can see that almost all of my prompts “set up” GPT-2 in this way. (And, by contrast, a different version of this template without those guiding words produced stories with palpably less “going on.”)

Third: look closely at the final prompt. Notice the empty string at the end:

  prompt "The world was quiet.", 1, ""

As I was fiddling with these prompts, my friend Dan proposed an idea: what if the text that GPT-2 received and the text the reader read were sometimes different? In the case above, what’s happening is that GPT-2 is seeing the line “the world was quiet,” which will influence the text it generates; however, “the world was quiet” is not being shown to the reader. The reader is instead seeing… nothing. An empty string. So the reader sees only GPT-2’s response to “the world was quiet,” which in practice goes something like

No fires burned, and no lamps were lit.

or

Every so often, a breeze would rustle the trees and make them shimmer.

or

For a few moments, he thought he heard the distant sound of an ancient love song.

I think that’s really lovely! There’s no need to preface those lines with “the world was quiet”; they communicate that on their own. This technique of showing text to GPT-2 that you conceal from the reader is a sneaky way of telling the system what you want. It’s the hidden agenda, the moon behind the clouds. I think it’s potentially very powerful, but/and I’ve only scratched the surface here.
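The hidden-agenda trick reduces to one small decision per step: what the model sees versus what the reader sees. Here is a minimal sketch of that decision; the function name and interface are hypothetical, not the project's actual code.

```python
# A sketch of the hidden-prompt idea: the model is steered by one string,
# but the reader may be shown a different string (possibly nothing at all).
def render_step(model_text, generated, display=None):
    """Return what the reader sees for one step of the story.

    model_text -- the prompt fed to the language model
    generated  -- the model's continuation
    display    -- optional override; "" hides the prompt entirely
    """
    shown = model_text if display is None else display
    return (shown + " " + generated).strip()

# Normal prompt: reader sees both the prompt and the continuation.
print(render_step("The Dark Lord cruelly", "laughed."))
# -> The Dark Lord cruelly laughed.

# Hidden prompt: the model saw "The world was quiet." but the reader
# sees only the continuation it inspired.
print(render_step("The world was quiet.", "No fires burned.", display=""))
# -> No fires burned.
```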

The output of the code above was a text file that looked like this:

A pair of thieves began their quest in Easy, a city known for|3
This quest to defeat the Dark Lord was led by Fenris Tusk, who|2
The questers sought a lost grimoire, which|3
They intended to travel quickly, but, unfortunately,|1
Then, on the road toward Lod Herley, they encountered an elk. It|3
The questers crossed into the country called Hagerobonou, known for|2
There, in Hagerobonou, the Dark Lord found them. He|2
The Dark Lord cruelly|2
Did their quest fail because the questers desired only peace? Or was it|2
Fenris Tusk's last thoughts were|1
The world was quiet.|1|
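Judging from that listing, each line is pipe-separated: the prompt text, a sentence count, and an optional third field overriding what the reader sees (a trailing empty field means "show nothing"). A hedged sketch of a parser for that format; the names are my guesses, not the project's actual code:

```python
# Parse one line of a story-skeleton file in the format
#   prompt text|sentence count[|display override]
def parse_skeleton_line(line):
    fields = line.rstrip("\n").split("|")
    text, count = fields[0], int(fields[1])
    # None means "show the prompt itself"; "" means "show nothing".
    display = fields[2] if len(fields) > 2 else None
    return text, count, display

print(parse_skeleton_line("The Dark Lord cruelly|2"))
# -> ('The Dark Lord cruelly', 2, None)
print(parse_skeleton_line("The world was quiet.|1|"))
# -> ('The world was quiet.', 1, '')
```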

After I’d generated one of those files for each reader, how did I use it?

A Python script fed the file’s first prompt into GPT-2 as context, then asked it for a blast of text. Next, it filtered that text heavily: truncating to a desired number of sentences, as discussed above; rejecting if wonky (for example, if it included the strings “www” or “http”); and, importantly, checking for words I would never use in my own writing. (For this, I relied on Darius Kazemi’s wordfilter, bulked up with additional words and phrases of my choosing. If you’re using a language model to generate text that will be shown to humans other than you, you must include a step like this. For me, it was crucial to generate a bunch of stories, scout them for scenes or even just ~implications~ I found skeezy or upsetting, and then add filters to reject that kind of content. The stock wordfilter wouldn’t have caught it all, and I wouldn’t have imagined it all, just sitting and speculating. I had to survey the output.)
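The rejection step might be sketched like this. `BLOCKLIST` here is a tiny stand-in for the real setup (Kazemi's wordfilter plus hand-added words and phrases); the entries and function name are illustrative only.

```python
# Reject a generated blast of text if it contains web debris or any
# entry on a blocklist. BLOCKLIST is a hypothetical stand-in for
# wordfilter plus a hand-grown list of words and phrases.
BLOCKLIST = ["www", "http", "chapter"]  # illustrative entries only

def acceptable(text):
    lowered = text.lower()
    return not any(bad in lowered for bad in BLOCKLIST)

print(acceptable("The ranger crossed the ford."))     # -> True
print(acceptable("Read more at http://example.com"))  # -> False
```

A script using this would simply keep asking the model for new blasts until one passes, then move on to truncation.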

When the Python script had text in hand that passed all those tests, it fed (1) the original prompt, (2) the text generated in response, and (3) the next prompt back into GPT-2, all concatenated. In this way, the context grew and grew, always a mixture of reader-provided prompts and GPT-2’s own “imagination,” so both could influence the story as it unfolded.
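That growing-context loop can be sketched as follows. The `generate` function here is a placeholder standing in for the actual GPT-2 call, which would take the accumulated context and return a blast of text to be truncated and filtered as described above.

```python
# A sketch of the alternating loop: each prompt plus everything generated
# so far is fed back in as context, so the story accumulates.
def generate(context):
    # Placeholder: a real language model would continue `context` here.
    return "And so it was."

def build_story(prompts):
    context = ""
    story_parts = []
    for prompt_text in prompts:
        context = (context + " " + prompt_text).strip()
        continuation = generate(context)      # would be truncated + filtered
        context = context + " " + continuation  # the context grows and grows
        story_parts.append(prompt_text + " " + continuation)
    return "\n".join(story_parts)

print(build_story([
    "The questers sought a lost grimoire, which",
    "The Dark Lord cruelly",
]))
```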

The finished quest was deposited into a plain text file, which another Ruby script transformed into a PDF, which yet another Ruby script sent to Lob for printing and mailing.

You can see an example of a finished quest PDF here.

I’ll close with one more reflection. Let’s imagine it’s ten years from now, and the super-powerful language model called GPT-2000 can produce an entire fantasy novel all on its own. It does a very competent job, too! The plot is pretty cool, the characters are fun, and every so often, there’s a truly beautiful sentence.

So what?

There’s no shortage of fantasy novels that meet those requirements. In fact, there are already more than (almost) any person can read. They’re available very cheaply or even, if you have access to a public library, for free. So, the potential of this technology isn’t, like, “At last! Something to read!”

What is it, then?

It’s odd to sit and look at this directory of quest stories I generated. There are more than a thousand; I’ll never read them all. When I want to read just one, how do I choose? Randomly, of course. How else?

Now, let’s say the directory wasn’t just stories but full-blown GPT-2000 fantasy novels, a thousand of them, each totally new, never before read by anyone! As I consider that possibility, I ask myself: is the feeling one of great bounty—like a well-stocked fantasy aisle at a library—or is it… something else? I think maybe the directory feels overwhelming, or numbing, or even horrifying.

Let’s say I want to read one of the GPT-2000 novels. Do I just choose a file randomly, as before? I’d be the only one to read that novel, ever. If it was great, there would be no one I could talk about it with. If it was great, the novel just below it might be even better, but I’d never know.

Reading the torrent of text generated by a language model, realizing how much of it is, in fact, great—not whole novels worth, of course, or even whole stories, but sentences and paragraphs, definitely; they’re cool and knotty and delightful—and then seeing that text disappear, scrolled away into oblivion, replaced by more text that’s marbled just as richly with greatness, you realize: there’s no shortage of great language. But great language isn’t what makes a story great. It isn’t what makes a story at all.

In the snippet below, the AI-generated text is quite good—

—but it’s clear that the best thing on the page, the thing that makes it glow, is the part supplied by a person.

For as capable as GPT-2 and its offshoots become, the thing that will make their output worthy of our attention is UNHELPFUL PUMPKINS.

June 2019, Oakland

This website uses the typeface Albertus Nova, an update by Toshi Omagari of Berthold Wolpe’s classic, and GT America, designed by Noël Leu with Seb McLauchlan.

Have a question? Spot a typo? Your story didn’t arrive? Email robin@robinsloan.com

hony soyt qui mal pence