One of the goals of digital humanities is to apply techniques from the digital world to the humanities in order to make their exploration more tractable and efficient. Achieving this goal requires a systematic and objective approach, so that myriads of documents can be automatically explored and analysed by computers. This in turn requires a format into which the information can be universally translated. Such a universal format is very difficult to establish for literary texts, and several issues arise. One obvious and major obstacle is the language in which the documents are written. But even if we restrict our attention to a single language, the style chosen by the author and the period of writing pose huge challenges for the automatic extraction of the underlying information. Consequently, this information needs to be re-expressed in a common framework, free from cultural and linguistic artefacts. Maps might provide such a framework. Indeed, maps are rooted in a non-subjective environment common to all human beings: the geographical world. Moreover, when building a map, one constantly strives for as much objectivity as possible: the quality of a map is judged by its accuracy and its ability to represent geographical information. Of course, rules may differ from one map to another, but these rules are always stated explicitly within the map (in the form of a legend, for example). Furthermore, to make map building as homogeneous as possible, a map language known as semiology exists, and map authors are encouraged to use it: it guards the author against one of the main pitfalls of information visualization, namely drowning the reader in a flood of poorly presented information.

Once these mistakes are avoided and the internal rules of the map are specified, one navigates in an objective environment that is very well suited to computers and digital techniques, which can then be used to extract and analyse the information displayed on the map. Finding, digitizing, processing and building maps are therefore tasks of crucial interest in the field of digital humanities.

In this blog post, we first investigate how history can be explored through maps, and which tools can help us find such historical documents on the web. We then take another avenue and explore ways to extract the information carried by a text and represent it on a map. Finally, we conclude our exploration by looking at examples of less standard maps, demonstrating that mapping techniques can be easily and efficiently extended to non-geographical contexts.

A first approach to exploiting the rigorous framework provided by maps in the field of digital humanities is to give old historical maps a new life in the digital world by means of digitization. This is a crucial step in the process of information extraction: once translated into bits and made understandable by a machine, these maps can be processed and analysed automatically at a very large scale. Surprisingly, however, this step is not the most difficult one: in order to digitize a map, one first needs to be able to find it! Indeed, in the age of communication and massive databases, it is paradoxically very difficult to find such documents. Classical search engines such as Google or Yahoo fail to index these maps properly, because the information they contain is not organized like that of classic web pages. To capture the logic with which information is displayed on maps and to index them efficiently, one needs a tool specifically tailored to this task. This is typically the kind of tool that Petr Pridal and other researchers are attempting to create through the project Old Maps Online (www.oldmapsonline.org) [1]. This web-based tool is a catalogue of historical maps that the user can search and download in different digital formats. The documents are organized on a virtual world map, through which the user is invited to navigate in a very user-friendly manner. The search interface allows the user to search by typing a place name or by clicking in the map window, and to narrow the search by a date range. The search results provide a direct link to the map image on the website of the host institution (http://project.oldmapsonline.org/about). Moreover, in order to expand this database as much as possible, map libraries and collectors are also invited to participate.
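To make the idea of searching by map window and date range more concrete, here is a minimal sketch in Python. It is not the Old Maps Online implementation or API: the `MapRecord` type, the `search` function, the toy catalogue and the `example.org` links are all hypothetical, and we simply assume that each map is described by a year and a rectangular geographical extent.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A bounding box in degrees: (west, south, east, north).
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class MapRecord:
    """Hypothetical catalogue entry for one digitized map."""
    title: str
    year: int
    bbox: BBox
    url: str  # link to the image at the host institution (placeholder here)


def boxes_overlap(a: BBox, b: BBox) -> bool:
    """Return True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalogue: List[MapRecord], view: BBox,
           year_from: int, year_to: int) -> List[MapRecord]:
    """Keep the maps that intersect the viewed area and fall in the date range."""
    return [
        record for record in catalogue
        if boxes_overlap(record.bbox, view)
        and year_from <= record.year <= year_to
    ]


# Toy catalogue with made-up records, for illustration only.
CATALOGUE = [
    MapRecord("Venice, 18th century", 1752, (12.29, 45.41, 12.38, 45.46),
              "https://example.org/maps/1"),
    MapRecord("Venice, 20th century", 1900, (12.29, 45.41, 12.38, 45.46),
              "https://example.org/maps/2"),
    MapRecord("Paris, 18th century", 1789, (2.25, 48.81, 2.42, 48.90),
              "https://example.org/maps/3"),
]

# The user looks at the northern Adriatic and restricts the dates to 1700-1800.
results = search(CATALOGUE, (12.0, 45.0, 13.0, 46.0), 1700, 1800)
for record in results:
    print(record.year, record.title, record.url)
```

A real catalogue would hold far too many records for a linear scan like this one and would more likely rely on a spatial index, but the filtering logic stays the same: a geographical extent and a date, rather than keywords on a web page.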

Figure 1 is an example of a map that can be found on this website. It is a map of Venice from 1752, representing various kinds of information about the city at that time: canals, churches, bridges, islands, neighbourhoods… The information is displayed in the form of drawings and annotations, but also colours and polygons that help delimit the neighbourhoods, and which could be efficiently detected and analysed by an algorithm trained for this task.

Figure 1: Map of Venice extracted from Old Maps Online.

Another avenue consists in building maps directly from historical documents. This is a slightly more involved process, as it requires a complete interpretation and reshaping of the information carried by the written medium. In fact, free-form text and its inherent variability in style and form represent a real challenge when it comes to automatically extracting meaningful information (such as names, places and dates). In the Mapping Colonial Americas Publishing Project [2], Jean Bauer et al. perform such a task by constructing maps that aim to visualize what kinds of works were published in the Americas and how these printing patterns changed over time, using a dataset of 140,000 records from a Brown University library catalog. More specifically, the project as described in [2] proposes two major end products: a gazetteer of normalized and geolocated places of publication in the Americas before 1800, and a series of data visualizations designed to help scholars and students explore printed material on the axes of genre, place of publication, date of publication, and format [2].

An example of such a map is provided in figure 2. This map shows the location of printing presses in North America whose products appear in the catalog of the American Antiquarian Society (AAS) during the period from 1639 to 1807. The size of each blue dot is proportional to the number of materials held at the AAS. The map gives a visual idea of the sites of publishing over this period. In our context, it is interesting to see how intuitive the reading of this map is. Indeed, one can easily guess that the size of the circles is proportional to the number of publications and that some regions (cities) appear to have published much more than others. Thus, the reader of these maps can easily observe the regional differences in reading and publishing habits (however, not all of the available information is accessible in this screenshot, as the maps react interactively to the user's mouse: http://cds.library.brown.edu/mapping-genres/). This observation nicely illustrates the universality of this medium: one does not need to be a specialist in the subject to be able to understand and analyze the information contained in this map.

Figure 2: Location of printing presses in North America from 1639 to 1807.
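As a side note on how such proportional circles can be drawn, here is a minimal sketch in Python. We do not know how the project actually scales its dots; the sketch assumes the common cartographic convention of making the area (rather than the radius) of each circle proportional to the count, which means the radius grows with the square root of the count. The function name `symbol_radius`, the maximum radius and the counts for "Town A", "Town B" and "Town C" are all made up for illustration.

```python
import math


def symbol_radius(count: float, max_count: float, max_radius: float = 30.0) -> float:
    """Radius of a circle whose area is proportional to `count`.

    The largest count in the dataset is drawn with radius `max_radius`
    (in pixels, an arbitrary choice made for this sketch).
    """
    if count < 0 or max_count <= 0:
        raise ValueError("count must be >= 0 and max_count must be > 0")
    return max_radius * math.sqrt(count / max_count)


# Hypothetical numbers of holdings per printing location.
holdings = {"Town A": 400, "Town B": 100, "Town C": 25}
largest = max(holdings.values())

radii = {place: symbol_radius(n, largest) for place, n in holdings.items()}
for place, radius in radii.items():
    print(f"{place}: radius {radius:.1f} px")
```

With this choice, a town with four times as many holdings gets a circle with four times the area, but only twice the radius. Scaling the radius directly would visually exaggerate the largest publishing centres.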

While [2] is mainly interested in tracking historical documents in space and time, such an approach can also, perhaps surprisingly, be extended to apparently more "exotic" fields, such as literature. This idea is discussed in a paper by John Lynch, Wendy Kurtz and Michael Rocchio [3], who are mainly interested in the fictional spaces depicted in literature and in ways to visualize them via interactive maps (with tools such as Google Maps, GIS software, …). They claim that such tools can help students and scholars in their study of books by putting their perception of the places described into perspective, confronting their own abstract representation of the fictional space with the objectivity and rigour of the maps built with the tool.

In conclusion, it appears that maps are very convenient tools for scholars in digital humanities, at every step of a research project. First, they make it possible to perform exploratory data analysis on a huge quantity of documents, efficiently summarized on a single medium. Such an analysis can be a source of various hypotheses and inferences that can lead to fruitful investigations. But maps are not only an exploratory tool: once the information is expressed in a suitable map environment, one can apply statistical tools to the data in order to try to answer the questions raised by the exploratory step. Finally, maps are also a powerful tool for presenting the results of a research study in a visually appealing fashion.

Nevertheless, although maps can prove to be an attractive tool with numerous advantages, they have to be carefully designed in order to be exploited to their full potential. Indeed, a map is a powerful medium but also a very dangerous one if it is badly constructed, for instance when too much information is presented or when poor semiological choices are made. If it is done correctly, a wide variety of things can be represented. Thus, one understands that making a map is not only a geographical exercise: it requires real reflection on the visual representation in order to make it as intelligible as possible. In this context, one can even free oneself from geographical space when the application justifies it. Usual mapping techniques may be extended to less restrictive spaces, making it possible to represent any kind of information. As an example, we can cite once more the Mapping Colonial Americas Publishing Project, with the map provided in figure 3. This map (also called a bubble chart) presents the relative prominence of different genres published in North America from the earliest British colonial settlements to 1807.

Figure 3: Genres in North American Publishing, 1639-1807

We see here that the interesting information lies in the number of publications in a given genre, and not so much in their place of publication. Spreading the information over a geographical space would have considerably reduced the readability of the map, as the reader would have had to mentally group publications of the same genre but different places of publication.
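The grouping that the reader would otherwise have to do mentally is exactly what a bubble chart computes beforehand. Here is a minimal sketch in Python, assuming that each catalogue record carries a genre and a place of publication. The records, the genre labels and the place names below are invented for illustration and are not taken from the project's data.

```python
from collections import Counter

# Hypothetical catalogue records: only the two fields we need.
RECORDS = [
    {"genre": "Sermons", "place": "Town A"},
    {"genre": "Sermons", "place": "Town B"},
    {"genre": "Almanacs", "place": "Town A"},
    {"genre": "Sermons", "place": "Town C"},
    {"genre": "Laws", "place": "Town B"},
    {"genre": "Almanacs", "place": "Town C"},
]


def count_by_genre(records):
    """Count publications per genre, ignoring where they were printed."""
    return Counter(record["genre"] for record in records)


genre_counts = count_by_genre(RECORDS)

# One bubble per genre, listed from the most to the least prominent.
for genre, count in genre_counts.most_common():
    print(genre, count)
```

Each resulting count then becomes the size of one bubble, and the place of publication simply drops out of the picture.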


