
One can describe folklore as the information and practices accumulated over generations, passed on in writing or by word of mouth. Here I will focus on literature. When we read the classical novels, fables, and myths of a region, we notice that they contain many references to the places, events, and beliefs of that region. One book might mention a legendary mountain and the creature that lives there. Two hundred years later, another might tell the real story of a shepherd who lived there. After another two hundred years, today, what if we could collect these pieces of information from collections of (old) books and show them all at once, at accurate positions on a map? There are already great contributions toward the goal of digitizing literature, such as ‘HathiTrust Digital Library’, ‘Google Books’, and ‘Archive.org’, and we have tools for digital map making, such as Google Maps and OpenStreetMap.

A great example of this kind of work is the “Palimpsest: Improving Assisted Curation of Loco-specific Literature” project [1]. The team prepared a workflow that combines crowdsourced curation with location-specific gazetteers to build a rich map of Edinburgh. They started with books containing Edinburgh-specific literature from digital archives (HathiTrust, British Library, etc.). After initial trials and feedback from annotators, they filtered the books down to literary documents that contain a reference to Edinburgh or a variant of the name, which hugely improved progress. According to them, “… Scholars and the public can both geographically explore their fictional city” in the end, after processing approximately 380,000 documents.

Another attempt to link places with digital maps is “ElfYelp: Geolocated Topic Models for Pattern Discovery in a Large Folklore Corpus” [2]. This project focuses mainly on Denmark and its legends, and it offers complex queries over both the map and the dataset. The team improved on their preliminary research, “Geo-semantic browsing of Danish folklore” (Broadwell and Tangherlini, 2014), by using a version of ‘Latent Geographical Topic Analysis’, and built co-occurrence matrices linking places to topics, labels, and keywords after processing more than 20,000 Danish legends. In the end, they were able to ask questions such as ‘Where are the elves?’ and ‘What can I find here?’ (referring to a bounding box drawn on the map).
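To make the two kinds of questions concrete, here is a minimal sketch in Python. It is not the ElfYelp implementation and it does not perform Latent Geographical Topic Analysis; it only counts how often keywords co-occur with places and then queries those counts. The toy legends, their approximate coordinates, and the function names (`build_cooccurrence`, `where_is`, `what_is_here`) are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: (place, latitude, longitude, keywords of one legend).
# Coordinates are approximate and only serve the example.
LEGENDS = [
    ("Aarhus", 56.16, 10.20, ["elf", "mound"]),
    ("Aarhus", 56.16, 10.20, ["elf", "dance"]),
    ("Odense", 55.40, 10.39, ["witch", "church"]),
    ("Aalborg", 57.05, 9.92, ["elf", "treasure"]),
]


def build_cooccurrence(legends):
    """Return place coordinates and a place -> keyword count table."""
    coords = {}
    counts = defaultdict(Counter)
    for place, lat, lon, keywords in legends:
        coords[place] = (lat, lon)
        counts[place].update(keywords)
    return coords, counts


def where_is(keyword, counts):
    """'Where are the elves?': places mentioning the keyword, most frequent first."""
    hits = [(place, c[keyword]) for place, c in counts.items() if c[keyword] > 0]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


def what_is_here(bbox, coords, counts):
    """'What can I find here?': keyword totals inside (south, west, north, east)."""
    south, west, north, east = bbox
    total = Counter()
    for place, (lat, lon) in coords.items():
        if south <= lat <= north and west <= lon <= east:
            total.update(counts[place])
    return total


coords, counts = build_cooccurrence(LEGENDS)
print(where_is("elf", counts))
print(what_is_here((55.0, 10.0, 56.5, 11.0), coords, counts).most_common(3))
```

A real system would replace the raw keyword counts with topic weights learned from the corpus, but the shape of the two queries stays the same: one goes from a topic to places, the other from a region to topics.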

A different, large-scale project [3] mapped the relationship between patterns of literary attention and demographic and economic factors in the United States between 1800 and 2010. Even though folklore was not its main focus, the project gives hope for similar approaches at this scale. For example, named-entity recognition algorithms were run on 12 million volumes to extract places, and Google’s Geocoding API was used to determine their coordinates. The gathered data was used to build city-level and country-level maps with specified filters.
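The extract-then-geocode idea can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the project's pipeline: a small hand-written gazetteer replaces both the named-entity recognizer and the geocoding service, and the place names, approximate coordinates, sample sentences, and function names are assumptions for illustration only.

```python
import re
from collections import Counter

# Hypothetical mini-gazetteer: place name -> (latitude, longitude), approximate.
# It stands in for a real NER model plus a geocoding service.
GAZETTEER = {
    "Boston": (42.36, -71.06),
    "Chicago": (41.88, -87.63),
    "New Orleans": (29.95, -90.07),
}


def extract_places(text, gazetteer):
    """Return one entry per whole-word mention of a gazetteer place in the text."""
    found = []
    for name in gazetteer:
        pattern = r"\b" + re.escape(name) + r"\b"
        found.extend([name] * len(re.findall(pattern, text)))
    return found


def map_points(volumes, gazetteer):
    """Aggregate mentions over all volumes into (name, lat, lon, count) tuples."""
    mentions = Counter()
    for text in volumes:
        mentions.update(extract_places(text, gazetteer))
    return [(name, *gazetteer[name], n) for name, n in mentions.most_common()]


# Made-up sample texts standing in for digitized volumes.
volumes = [
    "She left Boston for Chicago.",
    "Chicago was loud; New Orleans was louder.",
    "Nothing to see here.",
]
for point in map_points(volumes, GAZETTEER):
    print(point)
```

Exact string matching misses spelling variants and cannot tell a city from a person with the same name, which is why large projects rely on trained recognizers and a geocoder instead of a fixed list.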

The first two articles focus on specific locations in greater detail, while the last project succeeds by processing huge amounts of data, and there are surely other interesting attempts in this field. We already have many kinds of location-based map applications that use the most recent information, such as metro timetables, best-sellers in our city, or tweets from our neighborhood. Building map and visualization tools based on literature, however, might help us understand folklore itself better by showing how tales migrated from one place to another, or how the topics of popular stories changed during a specific period of history.


[1] Palimpsest: Improving Assisted Curation of Loco-specific Literature

[2] ElfYelp: Geolocated Topic Models for Pattern Discovery in a Large Folklore Corpus

[3] Mapping and Modeling Centuries of Literary Geography across Millions of Books