
As the digitization of large newspaper archives becomes more commonplace, scholars in the Digital Humanities have started taking advantage of this data. Historically, working with newspaper archives has been laborious, because the volume of material was too large to search or reference the entire archive effectively. The three papers reviewed in this blog show different uses of digital archives and help highlight a current trend in Digital Humanities.

Bode [3] uses the National Library of Australia’s Trove database. This digital archive holds over 4 million pages of Australian newspapers from 1803 to the 1950s. The focus of Bode’s research was serial fiction that appeared in newspapers. By searching Trove for terms usually associated with serial fiction, such as “Chapter”, “Story”, and “Fiction”, she was able to identify potentially relevant records in the archive. From these records it was then possible to extract bibliographic data about the works of fiction.
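The keyword step described above can be illustrated with a small sketch. This is not Bode’s actual pipeline or the Trove API: the record structure, the newspaper names, and the `find_candidates` helper are all hypothetical, and only the three search terms come from the paper as quoted here. It simply shows how matching a handful of terms narrows a large collection down to candidate records.

```python
import re

# Search terms quoted in the post; everything else here is a hypothetical illustration.
SERIAL_FICTION_TERMS = ("chapter", "story", "fiction")


def find_candidates(records, terms=SERIAL_FICTION_TERMS):
    """Return the records whose text contains at least one term as a whole word."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [r for r in records if pattern.search(r["text"])]


# Made-up sample records standing in for digitized newspaper pages.
records = [
    {"id": 1, "newspaper": "Example Regional Gazette", "text": "CHAPTER XII. The Return."},
    {"id": 2, "newspaper": "Example City Herald", "text": "Shipping news and wool prices."},
    {"id": 3, "newspaper": "Example City Herald", "text": "A Story of the Bush, by an anonymous author."},
]

candidates = find_candidates(records)
candidate_ids = [r["id"] for r in candidates]
print(candidate_ids)  # [1, 3]
```

Matching on whole words keeps a term like “story” from also hitting “history”. A filter like this only yields candidates, which is why the post describes the matched records as possibly relevant rather than confirmed fiction.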

Initial data analysis has already allowed her to challenge an existing perception of Australian literary culture: the idea that metropolitan newspapers were the primary vehicle for publishing serial fiction. The study demonstrated that regional newspapers also played a strong part in publishing this fiction.

Huistra et al. [1] used the Dutch national library’s digital archive for their study. Their focus was how the identity of at-risk groups is formed in public discourse, in this case people with a Body Mass Index (BMI) of 25 or greater. Using resources developed in other contexts (like Texcavator and Delpher), they were able to identify articles about “overweight” people and examine how the authors of those articles described them, giving the researchers a sense of how this group was viewed culturally at the time.

Eisenstein et al. [2] used the digital archive of the “Anti-slavery Bugle” (ASB), published in New Lisbon, Ohio. Their study differs from the previous two in the kind of questions it set out to answer. They were not interested in doing keyword searches, as their questions were more open-ended, for example “Did the women’s rights movement borrow language from the nation’s contemporaneous anti-slavery campaign?”. The method they used is called Exploratory Thematic Analysis (ETA), and it helped them identify the different “topics” present in the Anti-slavery Bugle. They also proposed an innovative visualization tool for this data, as can be seen in the following image.

While their results cannot be generalized and may rest on some overly simplistic assumptions, the main objective of their study was to propose new algorithms and visualizations for this new area of research, so that people can better understand and make use of digital newspaper archives.

With these three papers in mind, we can see that the digitization of newspapers has allowed researchers either to pursue traditional methods of research with greater ease and scope, as in the first two papers, or to use innovative methods to research new questions, as in the third.


  1. Using digitized newspaper archives to investigate identity formation in long-term public discourse (http://dharchive.org/paper/DH2014/Paper-329.xml)
  2. Exploratory Thematic Analysis for Historical Newspaper Archives (http://dharchive.org/paper/DH2014/Paper-921.xml)
  3. Mining a ‘Trove’: Modelling a Transnational Literary Culture (http://dharchive.org/paper/DH2014/Paper-63.xml)