
Literature is today more than ever an important component of culture, so one might wonder what would happen if we applied Digital Humanities techniques to contemporary literary works. Could we gain more insight into an author, a style or a particular novel by scanning and analyzing its pages? Can we learn something relevant about literature by analyzing social media?
Some researchers have tried to answer these questions, and the results appear promising. We’ll briefly summarize three abstracts presented at the annual Digital Humanities conference in 2015.

Abstract [1] focuses on Ulysses, the masterpiece of Irish writer James Joyce. The novel follows Stephen Dedalus and Leopold Bloom through Dublin over a single day, 16 June 1904.
Although very intricate, the path followed by the protagonists is well defined in both space and time, so it is possible to map the characters’ positions throughout that day. Readers retrace this map on Bloomsday, the annual celebration of the book held on 16 June every year: some wear Edwardian clothes, others visit the places the protagonists pass through in the novel and, especially in recent years, many post messages, videos and tweets on social media.
The study monitored YouTube, Flickr and Twitter on 16 June 2014 and observed that people in Dublin tended to gather around the novel’s locations at various times of day, not only at the times at which the corresponding episodes are set. Byrne’s pub, for instance, drew more visitors than the National Library even during the time slot that the novel assigns to the library.

The second abstract [2] analyzes 270 French crime fiction novels, written by 14 authors and published between 1858 and 2012, to explore possible relations between authors and their texts: is a given novelist more closely associated with a certain topic or subgenre than the others? Do topics remain constant over time?
To answer these questions, the study used topic modeling: each topic is modeled as a probability distribution over words, and each text is characterized by a probability distribution over topics.
Of the 60 topics found, some were common to all literary genres, while only 9 were related to crime fiction, which makes the genre less specific than first supposed. Furthermore, some authors (but not all) turned out to have a distinctive topic, one that scores higher than both the same author’s other topics and the same topic in other authors’ work. Some topics also gained or lost importance over time: the score of “twilight”, for example, decreased over time in the traditional detective novel but remained constant in the “roman noir” subgenre.
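The notion of a distinctive topic can be illustrated with a minimal Python sketch. It is not the method used in [2]: the author names, topic labels and scores below are all invented, and the helper functions `mean_scores_by_author` and `distinctive_topic` are hypothetical. The sketch only assumes that each text has already been assigned a distribution over topics, and then checks whether an author's highest-scoring topic also beats every other author's score for that same topic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical document-topic distributions: each text is described by a
# distribution over topics (each row sums to 1). All values are invented.
doc_topics = [
    ("Author A", {"investigation": 0.6, "twilight": 0.3, "family": 0.1}),
    ("Author A", {"investigation": 0.5, "twilight": 0.3, "family": 0.2}),
    ("Author B", {"investigation": 0.4, "twilight": 0.3, "family": 0.3}),
    ("Author B", {"investigation": 0.4, "twilight": 0.3, "family": 0.3}),
]


def mean_scores_by_author(docs):
    """Average each topic's score over all texts by the same author."""
    grouped = defaultdict(lambda: defaultdict(list))
    for author, dist in docs:
        for topic, score in dist.items():
            grouped[author][topic].append(score)
    return {
        author: {topic: mean(scores) for topic, scores in topics.items()}
        for author, topics in grouped.items()
    }


def distinctive_topic(author, means):
    """Return the author's top topic if it also beats every other author's
    mean score for that topic; otherwise return None."""
    own = means[author]
    top = max(own, key=own.get)  # highest-scoring topic for this author
    others = [m.get(top, 0.0) for a, m in means.items() if a != author]
    if all(own[top] > score for score in others):
        return top
    return None


means = mean_scores_by_author(doc_topics)
for author in means:
    print(author, distinctive_topic(author, means))
```

With these invented numbers only Author A ends up with a distinctive topic, which mirrors the "some authors, but not all" finding. A real study would obtain the distributions from a topic modeling tool rather than writing them by hand.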

Abstract [3] presents AustLit, a virtual research environment for Australian literary culture. The database contains thousands of texts related to Australia, ranging from the 19th to the 20th century. The most recent project, undertaken in 2014-15, expands the database with literary texts published in an archive of digitized Australian newspapers, since newspapers were the main vehicle of literary culture in the period the archive covers.
To find literary works in this immense volume of text, an automated approach was used: the newspaper articles were scanned for the words used in available works of poetry. As a test, the authors ran the poetry identification algorithm on a set of poems already recorded in AustLit, and over 80% were correctly identified. A manual examination of the rejected articles revealed that many contained only a small amount of poetry mixed with a lot of prose, or were texts with poor OCR in which only a few words had been recognized correctly.
The declared aim of the project is to collect information about Australians’ reading habits before the 20th century and ultimately to create a literary map of Australia’s colonial period.
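The summary above does not spell out how the poetry identification algorithm works, so the following Python sketch only illustrates the general idea of matching articles against the vocabulary of known poems. Everything in it is an assumption made for illustration: the function names, the sample texts and the 0.6 threshold are hypothetical, and the real system is certainly more sophisticated.

```python
import re


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(poems):
    """Collect every word that occurs in the known poems."""
    vocab = set()
    for poem in poems:
        vocab.update(tokenize(poem))
    return vocab


def poetry_score(article, vocab):
    """Fraction of the article's tokens found in the poetry vocabulary."""
    tokens = tokenize(article)
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if token in vocab) / len(tokens)


def looks_like_poetry(article, vocab, threshold=0.6):
    """Flag an article when enough of its words are poetry words.
    The threshold value is an arbitrary assumption."""
    return poetry_score(article, vocab) >= threshold


# Invented sample data, standing in for known poems and newspaper articles.
known_poems = [
    "The moon hangs low above the silent sea",
    "O weary heart, the night is long and deep",
]
vocab = build_vocabulary(known_poems)

candidate_poem = "The silent night is deep above the weary sea"
shipping_notice = "Cargo prices rose sharply at the harbour market yesterday"
mixed = shipping_notice + " and the night is long"

for article in (candidate_poem, shipping_notice, mixed):
    print(round(poetry_score(article, vocab), 2), looks_like_poetry(article, vocab))
```

In this toy setting the mixed article is rejected because its few poetic words are diluted by prose, which is the same kind of failure the manual examination found. Badly recognized OCR text would fail in a similar way, since misspelled tokens do not match the vocabulary.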

The three abstracts take very different approaches to digital humanities and literature. Although the first is built around a specific novel, it needs only general information about the novel’s content; its focus is on creating a social map from people’s behavior on social media on the day the book is commemorated.
The second, by contrast, analyzes the full text of every novel in the corpus to obtain a list of topics and relate them to authors, periods and subgenres.
The last develops a tool for recognizing relevant literary works in a sea of data, with the aim of building a picture of people’s cultural patterns and ultimately of better understanding Australia’s history and its colonists.


[1] Charles Bartlett Travis, A Digital Humanities GIS Ontology: Tweetflickertubing James Joyce’s ‘Ulysses’ (1922), 2015.

[2] Christof Schöch, Topic Modeling French Crime Fiction, 2015.

[3] Kerry Kilner, Discovering and Rediscovering Full Text: Unearthing and Refactoring, 2015.