
People have always been interested not only in the events of the past but also in the lifestyles of those who lived before us. Understanding the ideas (political, economic, scientific, cultural) and the relationships of people who once faced problems similar, and very often identical, to the ones we face today helps us better understand ourselves and the world around us.

Before computers became part of our lives, scholars analysed written documents (testaments and work papers, but also letters and diaries) in order to understand how people interacted and how they influenced each other. However, researchers could not get a clear picture, since these documents often reflected the subjective view of the person who wrote them. The development of computers made the digitization of data a new trend, and their rapid evolution over the last few decades has made them capable of running very complex algorithms. With plenty of digitized information available, scholars realized that they could use computer algorithms to carry out a more objective analysis of how people lived in the past.

The analysis is usually done on images, emails, books and similar material, and the algorithm used for it is divided into three phases: aggregation, conversion and analysis. During the first phase, the algorithm identifies and annotates recognized names, special words and other resources within the text or image. During the conversion phase, the annotated objects from the first phase are linked to an authoritative information source database, which provides scholarly identifiers for them. These identifiers are saved only in a local database, so that data cannot easily be added to the authoritative source, which needs to remain controlled. During the final phase, the algorithm creates graphs of the network formed by the assigned annotations, and scholars are usually allowed to inspect and tune the result of this phase. The algorithm is used in research across many different disciplines. Some of its applications are described in the abstracts listed in the references at the end of this text.
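As a rough illustration of the aggregation and conversion phases described above, here is a minimal Python sketch. It is not the implementation used in the cited research: the names, the identifiers in AUTHORITY and the functions aggregate and convert are all hypothetical, and real systems would rely on proper named-entity recognition and an external authority service rather than exact string matching against a small dictionary.

```python
import re

# Hypothetical stand-in for an authoritative source: name -> identifier.
# In practice this would be an external, controlled database (assumption).
AUTHORITY = {
    "Anna Byrne": "auth:0001",
    "Brian Kelly": "auth:0002",
    "Ciara Doyle": "auth:0003",
}


def aggregate(text, known_names):
    """Aggregation: annotate every known name found in the text,
    returned in order of appearance."""
    found = []
    for name in known_names:
        for match in re.finditer(re.escape(name), text):
            found.append((match.start(), name))
    return [name for _, name in sorted(found)]


def convert(annotations, authority, local_db):
    """Conversion: look up an identifier for each annotation and record
    the link in the local database only; the authority is never modified."""
    identifiers = []
    for name in annotations:
        local_db[name] = authority[name]
        identifiers.append(authority[name])
    return identifiers


# Made-up documents, for illustration only.
docs = [
    "Anna Byrne wrote to Brian Kelly.",
    "Brian Kelly met Ciara Doyle and Anna Byrne.",
]
local_db = {}
linked = [convert(aggregate(doc, AUTHORITY), AUTHORITY, local_db) for doc in docs]
print(linked)
```

The lists of identifiers per document are what the final phase would turn into a graph. Keeping local_db separate from AUTHORITY mirrors the point that links are stored locally while the authoritative source stays read-only.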


In the research described in “Research to clarify the interrelationships between family members through the analysis of family photographs”[1], this algorithm is used to discover interrelationships between family members during the nineteenth century and the beginning of the twentieth century. In those years, aristocratic families had a habit of taking pictures at their regular ceremonies, so they left many official photographs. These photographs were digitized and used as input to the foregoing algorithm. The result was a graph in which each node represented a member of the family and edges connected members who appeared together in the photographs. From the structure of the graph and the number of edges between two members, researchers could infer a good deal about the relationship between them.

Graph showing the interrelationships between the Japanese family members
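The kind of co-appearance graph described for the family photographs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the photographs list and the role labels are made up, the function co_appearance_graph is hypothetical, and the step of recognizing who appears in each photograph is assumed to have been done already.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: each photograph is the set of people identified in it.
photographs = [
    {"father", "mother", "eldest_son"},
    {"father", "eldest_son"},
    {"mother", "daughter"},
    {"father", "mother", "daughter", "eldest_son"},
]


def co_appearance_graph(photos):
    """Return edge weights: how many photographs each pair shares.
    Pairs are stored as alphabetically ordered tuples."""
    weights = Counter()
    for people in photos:
        for a, b in combinations(sorted(people), 2):
            weights[(a, b)] += 1
    return weights


graph = co_appearance_graph(photographs)

# The pair that appears together most often.
strongest = max(graph, key=graph.get)
print(strongest, graph[strongest])
```

Here the edge weight plays the role of "the number of edges between two members": a higher count means two people were photographed together more often, which a researcher might read as a hint of a closer relationship.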

A similar application of the algorithm can be found in “Networking the Belfast Group through the Automated Semantic Enhancement of Existing Digital Content”[2], where the input was literary works by various famous poets from Ireland. The output of the algorithm was a graph in which nodes represented poets and edges represented the similarity between their works. This similarity was found by applying a correlation function to keywords or n-grams (contiguous sequences of n words) from these literary works. Scholars then used the graph to discover what influence the poets had on each other. A closely related approach is described in “Uncovering Reprinting Networks in Nineteenth-Century American Newspapers”[3]. In this research, scholars used the described algorithm to analyse American newspapers and other available texts from the nineteenth century, in order to find out which ideas went viral and circulated in the public sphere at the time.

Graph showing the poetic activity in Ireland in the middle of the twentieth century
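To make the idea of n-gram similarity between two texts more concrete, here is a small Python sketch. The text above does not say which correlation function was used, so cosine similarity over word n-gram counts is used here purely as a stand-in; the function names ngrams and cosine_similarity and the sample lines are hypothetical.

```python
import math
import re
from collections import Counter


def ngrams(text, n=2):
    """Count contiguous sequences of n words in the text (lower-cased)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 to 1.0)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up lines, for illustration only.
first = ngrams("the quiet rain falls")
second = ngrams("the quiet sea rises")
third = ngrams("stone walls stand")

print(cosine_similarity(first, second))  # one shared bigram out of three
print(cosine_similarity(first, third))   # no shared bigrams
```

In a graph like the one described, each pair of authors (or newspapers) would get an edge whose weight is a score of this kind, possibly kept only when it exceeds some threshold chosen by the researcher.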


The catalogues and holdings of museums, galleries, archives and libraries contain traces of the hidden parts of history. With the spread of computers, scholars have found a way to use these traces to discover many interesting facts about the past, and the algorithm described in this text has become an increasingly common way of doing so.


  1. Togiya, Norio (2013). Research to clarify the interrelationships between family members through the analysis of family photographs. July 18, 2013. (http://dh2013.unl.edu/abstracts/ab-332.html).
  2. Koeser, Rebecca Sutton; Croxall, Brian (2013). Networking the Belfast Group through the Automated Semantic Enhancement of Existing Digital Content. July 17, 2013. (http://dh2013.unl.edu/abstracts/ab-357.html).
  3. Cordell, Ryan; Maddock Dillon, Elizabeth; Smith, David (2013). Uncovering Reprinting Networks in Nineteenth-Century American Newspapers. July 17, 2013. (http://dh2013.unl.edu/abstracts/ab-150.html).