Leading Edge / Summer 2012
Words, by the Numbers

What can a computer reveal about a work of fiction? Plenty, it seems


Illustration by Isabel Foo

People have been analyzing written texts for over a thousand years, dissecting sentences to reveal the hidden truths beneath. But while human readers are terrific at uncovering complex and subtle meanings, they read slowly and have relatively short attention spans. Not so computers, which are increasingly being used to find interesting new patterns in texts undetected by mere mortals.

By counting recurring word and phrase types, algorithms such as the “Gender Genie” purport to determine whether a given text was written by a male or female author. Computers have helped historians sort out questions of authorship in the Bible. And the work of U of T English professor Ian Lancashire and computer scientist Graeme Hirst suggests that mystery writer Agatha Christie may have suffered from an undiagnosed case of Alzheimer’s disease. By comparing her earlier volumes with later ones, their research showed that the author’s vocabulary decreased significantly in her last two books.
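For readers curious what "counting" looks like in practice, here is a minimal Python sketch of one such measure: the number of distinct words in an equal-sized sample from each text. It is an illustration only, not the method Lancashire and Hirst used; the function name `vocabulary_size`, the `sample_words` parameter and the two toy sentences are all invented for this example.

```python
import re

def vocabulary_size(text, sample_words=50000):
    """Count distinct words in the first `sample_words` words of `text`.

    Using a fixed-size sample keeps texts of different lengths comparable,
    since longer texts naturally contain more distinct words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words[:sample_words]))

# Toy stand-ins for an "early" and a "late" text (hypothetical, not real prose).
early = "the cat sat on the mat and the dog sat by the door"
late = "the thing sat on the thing and the thing sat by the thing"

print(vocabulary_size(early, 13), vocabulary_size(late, 13))  # prints: 9 6
```

A lower count on the later sample would be consistent with a shrinking vocabulary, though a real study would need far larger samples and more careful controls.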

Now, in a course called “The Digital Text,” English instructor Adam Hammond is using computers to undertake literary analysis; his work differs from the aforementioned projects in that he’s more interested in what a text is saying than in determining attributes of the author.

Last fall, Hammond began using computers to tease out the mystery behind T.S. Eliot’s The Waste Land. The poem is surely among the 20th century’s most famous – it’s the one that begins by telling us that “April is the cruellest month.” But with its welter of disconnected voices, obscure allusions and steep plunges into Latin, German and Sanskrit, many readers find the poem itself to be cruel.

“Eliot wrote The Waste Land in 1922,” says Hammond. “From that point on he was actively trying to become a playwright; he actually did not want to be a poet anymore.” Consequently, Hammond believes the poem’s arbitrarily shifting lines make more sense if you think of them as spoken by cast members in a play – albeit one lacking in stage directions or character names.

To help the reader determine where one “character” ends and another begins, Hammond and his students spent last fall going through the 434-line work and manually tagging significant features of the text (such as parts of speech, alliteration and foreign phrases). Hammond then delivered the heavily annotated result to U of T computer scientist Julian Brooke, who has developed an algorithm that will help future readers find the vocal switches within seconds.
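The article does not describe how Brooke's algorithm works, so the Python sketch below is only a guess at the general idea: measure a few stylistic features for each line and flag the places where the style changes abruptly. The two features, the `FUNCTION_WORDS` list, the threshold and the sample lines are all assumptions made for illustration.

```python
import re

# A tiny, hypothetical list of common "function" words.
FUNCTION_WORDS = {"the", "a", "and", "of", "to", "in", "i", "you", "it", "is"}

def line_features(line):
    """Return (mean word length, share of function words) for one line."""
    words = re.findall(r"[^\W\d_]+", line.lower())
    if not words:
        return (0.0, 0.0)
    mean_len = sum(len(w) for w in words) / len(words)
    func_share = sum(w in FUNCTION_WORDS for w in words) / len(words)
    return (mean_len, func_share)

def candidate_switches(lines, threshold=1.5):
    """Return indices i where line i differs sharply in style from line i - 1."""
    feats = [line_features(line) for line in lines]
    switches = []
    for i in range(1, len(feats)):
        (len_a, func_a), (len_b, func_b) = feats[i - 1], feats[i]
        # Sum of absolute feature differences between neighbouring lines.
        distance = abs(len_a - len_b) + abs(func_a - func_b)
        if distance > threshold:
            switches.append(i)
    return switches

# Invented sample lines: two plain, conversational lines, then two ornate ones.
sample = [
    "I go to it in a bit",
    "you and I sit in it",
    "Magnificent cathedrals overshadow everything",
    "Tremendous mountains surround civilization",
]
print(candidate_switches(sample))  # prints: [2]
```

Here the only flagged boundary is at the third line (index 2), where the short, plain words give way to long, ornate ones. A serious tool would draw on many more features, such as the parts of speech and foreign phrases the students tagged by hand.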

Human scholars have hit on this “vocal switching” theory before (in fact, you can hear Eliot himself “acting” out the poem on a new app for The Waste Land). The computer identifies the voices much faster, though it is not any more accurate, says Hammond. But then, literary analysis has always been an imperfect science. “It’s exciting to see how a computer reader differs from a human reader, but that’s not to say the computer is ‘right.’ Works of literature are so human and none has a single ‘correct’ interpretation.”

Even though he’s a newly minted PhD specializing in the modernist movement of a century ago (in which Eliot was a key figure), Hammond sees clear links between the 100-year-old literature he loves and the challenging, technologically inspired work of today. “Modernist literature is relevant today because modernists faced so many of the same issues we’re facing. Just like us, they were trying to come to grips with a world reshaped by rapid technological change. And just like us, they were bravely attempting to use new technologies – the same ones that had made the world so unfamiliar – to understand their new reality.”

