By Jessica Lewis
From hepcat to slacks, from right on to whassup, words and phrases have helped novelists and filmmakers evoke a particular time or place. Now, researchers at the University of Toronto have developed software that can reliably determine the dates of medieval British documents based on the appearance of popular words or phrases.
U of T’s Centre for Medieval Studies and the Documents of Early England Data Set (DEEDS) Project enlisted the help of Gelila Tilahun to develop software that would decipher their database of about 10,000 British charter and property documents. The documents date from approximately 1066 to the 1400s, with the majority written between 1100 and 1300.
Tilahun, who was doing her PhD with the Department of Statistical Sciences, created the software that uses the DEEDS database as a source. When her software is given an undated text, it aggregates the probability of occurrence of words and phrases of the text at each time period, and then estimates the date of the text to be the time value that maximizes the aggregated probabilities.
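The dating step described above can be illustrated with a minimal Python sketch. It is not Tilahun's implementation: the article does not say how the probabilities are estimated or combined, so the Gaussian kernel weighting, the additive smoothing, the sum of log-probabilities and the names `ngrams`, `estimate_date` and `TOY_CHARTERS` are all assumptions made for illustration, and the four toy texts are invented rather than drawn from the DEEDS database.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (year, text) pairs; not taken from the DEEDS database.
TOY_CHARTERS = [
    (1150, "sciant francis et anglicis quod ego dedi terram"),
    (1160, "notum sit francis et anglicis quod ego concessi"),
    (1250, "sciant presentes et futuri quod ego dedi terram"),
    (1260, "sciant presentes et futuri quod ego concessi terram"),
]


def ngrams(text, n_max=2):
    """Return all word sequences of length 1..n_max found in the text."""
    words = text.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(words) - n + 1)
    ]


def estimate_date(undated_text, dated_docs, candidate_years,
                  bandwidth=25.0, alpha=0.01):
    """Pick the candidate year that maximizes the summed log-probabilities."""
    counted = [(year, Counter(ngrams(text))) for year, text in dated_docs]
    query = ngrams(undated_text)
    vocab = set(query)
    for _, counts in counted:
        vocab.update(counts)

    best_year, best_score = None, -math.inf
    for t in candidate_years:
        # Gaussian kernel: documents dated near t count more (an assumed choice).
        weights = [math.exp(-0.5 * ((t - year) / bandwidth) ** 2)
                   for year, _ in counted]
        total = sum(w * sum(c.values()) for w, (_, c) in zip(weights, counted))
        score = 0.0
        for gram in query:
            hits = sum(w * c[gram] for w, (_, c) in zip(weights, counted))
            # Additive smoothing keeps unseen n-grams from producing log(0).
            score += math.log((hits + alpha) / (total + alpha * len(vocab)))
        if score > best_score:
            best_year, best_score = t, score
    return best_year
```

Calling `estimate_date("francis et anglicis quod ego", TOY_CHARTERS, range(1100, 1301, 10))` scores every tenth year between 1100 and 1300 and returns the best one. On this toy corpus, a text containing "francis et anglicis" lands in the 1100s and one containing "presentes et futuri" in the 1200s, because each phrase occurs only in the charters dated to that century.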
Being able to estimate the probability of occurrence of a word or phrase means the evolution of the usage of popular terms can be examined. For example, the form of address “Francis et Anglicis” (French and English) was commonly used by French and English barons to address their workers or soldiers from the mid-1100s. When Normandy was lost to France in 1204 and the English no longer held lands there, the form of address gradually fell out of use.
“The idea is that language evolves through time. Some words and phrases eventually die and others continue on,” she says. “These words have their own life. It’s amazing how we can decipher the date of a document based on the evolution of word usage.”
Dating these particular types of documents has proved to be a challenge. “The British never dated their documents. There are over a million documents in existence and nobody knows when they were written,” says Tilahun. “You can’t use writing styles or seals because a lot of these documents are not on their original parchment. These documents would go through different monasteries – people would come with a contract that links them to their property – but everything eventually deteriorated so scribes would actually continually handwrite new copies. Now all we have are the words.”
The goal is to get the software online so that anyone can submit a document to be dated. Tilahun says they are also working on modifying the algorithm to determine where the document is from.
Tilahun’s paper on the project, “Dating Medieval English Charters,” co-authored by U of T’s Andrey Feuerverger and Michael Gervers, was published in a recent issue of The Annals of Applied Statistics.
Source: University of Toronto