
The digital images produced by digitizing cultural and heritage materials can be analyzed and classified with image processing techniques to reveal the content, features, or salient patterns of the material. With advanced processing tools and algorithms, the large volume of heritage data held in visual form can be automatically sorted, manipulated, and managed, which may significantly aid our understanding and analysis of cultural objects.

A huge amount of humanities knowledge has been recorded and handed down in books, periodicals, and manuscripts, and today these can be scanned into millions of digital images. A computational tool that automatically extracts text from such images, so that its content can be easily stored and manipulated, would therefore be of enormous benefit. One such tool is TILT (Text to Image Linking Tool), which is designed in particular to handle curved and slanted text. In practice, TILT can be used for both word and line recognition in manuscripts. For word recognition, when a user clicks on part of the page, the surrounding area is examined one small square at a time (Figure 1, left). A square is accepted if its average darkness is greater than the average darkness of the whole document; otherwise, it is divided into four sub-squares, which are examined in the same way. The squares adjacent to accepted ones are then assessed in turn, and finally the word boundary (Figure 1, right) can be rapidly determined as a bounding box (solid outline) or even a polygon (dotted outline).

Figure 1: Illustration of word-boundary determination
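The word-detection procedure described above can be sketched roughly as follows. This is a minimal illustration, not TILT's actual code: it assumes the page is a list of rows of darkness values between 0 (white) and 1 (black), that the starting square size `cell` is a power of two, and that growth proceeds through the four edge-adjacent squares. The function names and parameters are all hypothetical.

```python
from collections import deque


def mean_darkness(img, x, y, size):
    """Average darkness of the size-by-size square with top-left corner (x, y)."""
    values = [v for row in img[y:y + size] for v in row[x:x + size]]
    return sum(values) / len(values) if values else 0.0


def accepted_squares(img, x, y, size, threshold, min_size=1):
    """Accept a square darker than the threshold; otherwise split it into four."""
    if mean_darkness(img, x, y, size) > threshold:
        return [(x, y, size)]
    if size <= min_size:
        return []  # too small to split further, so the square is rejected
    half = size // 2
    found = []
    for dy in (0, half):
        for dx in (0, half):
            found += accepted_squares(img, x + dx, y + dy, half, threshold, min_size)
    return found


def word_bounding_box(img, click_x, click_y, cell=4):
    """Grow a region of accepted squares outward from the clicked square.

    Returns (left, top, right, bottom) with right and bottom exclusive,
    or None if the clicked square contains nothing dark enough.
    """
    height, width = len(img), len(img[0])
    threshold = sum(map(sum, img)) / (width * height)  # whole-page average
    start = (click_x // cell, click_y // cell)
    queue, seen, kept = deque([start]), {start}, []
    while queue:
        cx, cy = queue.popleft()
        squares = accepted_squares(img, cx * cell, cy * cell, cell, threshold)
        if not squares:
            continue  # a blank square stops the growth in this direction
        kept += squares
        for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            inside = 0 <= nx * cell < width and 0 <= ny * cell < height
            if inside and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append((nx, ny))
    if not kept:
        return None
    left = min(x for x, _, _ in kept)
    top = min(y for _, y, _ in kept)
    right = min(max(x + s for x, _, s in kept), width)
    bottom = min(max(y + s for _, y, s in kept), height)
    return left, top, right, bottom
```

In this sketch two words are separated only if at least one completely blank square lies between them, so the choice of `cell` relative to the inter-word spacing matters. A polygon outline, as in the dotted line of Figure 1, would require tracing the edge of the accepted squares rather than taking their extremes.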

Line recognition works by the same mechanism: the baseline of the text can be detected by taking the average darkness along each row of the image. A row is considered part of a text line if its average darkness exceeds that of the background as a whole. Note that in practice, to increase the accuracy of word-shape detection, the page is often divided into lines first, and the words are then detected and interpreted locally within each line.
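The row-based test can be illustrated with a short sketch. It assumes the same representation of a page as a list of rows of darkness values between 0 and 1, and it uses the whole-page average as a stand-in for the background level, which is a simplification; the function name `detect_text_lines` is hypothetical.

```python
def detect_text_lines(img):
    """Return (first_row, last_row) pairs for runs of rows darker than the page average."""
    width = len(img[0])
    row_means = [sum(row) / width for row in img]
    page_mean = sum(row_means) / len(row_means)  # assumed background level
    lines, start = [], None
    for y, row_mean in enumerate(row_means):
        if row_mean > page_mean and start is None:
            start = y  # a new text line begins on this row
        elif row_mean <= page_mean and start is not None:
            lines.append((start, y - 1))  # the line ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(img) - 1))  # the last line runs to the page edge
    return lines
```

Each returned band could then be cropped and passed to a word-level routine, in keeping with the lines-first, words-second order described above. Curved or slanted lines would need a less rigid approach than straight horizontal rows.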

Figure 2: A Victorian poem displayed across two pages

Beyond the level of words and lines, image processing techniques can also extract the structure of the page to reveal the visual signals of Victorian poetry (see Figure 2). Working at a higher level than TILT, a software tool named VisualPage can discover and recognize the distinctive pattern of printed poems, which are typically surrounded by white space and have distinct line endings. In Victorian books of poetry, for example, a poem's form, such as equally indented lines, can indicate its genre. This reflects a psychological fact, as suggested by Johanna Drucker:

‘we see before we read and the recognition thus produced predisposes us to reading according to specific graphic codes before we engage with the language of the text’. – Johanna Drucker

The operation of VisualPage can be summarized in three steps:

  • The feature extraction step analyzes the structure of the page, such as typeface size, margin size, and text lines. The software should be able to detect these structures in the context of bibliographic categories such as chapters, volumes, or pages. Once the layout of the document is well understood, we can easily answer questions such as ‘how much variability is there in the length of lines in poems from two different publishers?’ or ‘how does the visual density of a page change for this publisher over time?’.
  • The pattern recognition step answers pattern-related queries such as ‘find all poems that use dropped capital letters’ or ‘find poems whose line length is in the bottom 25% of poems from this publisher’. By grouping similar patterns into clusters, VisualPage can also be extended with more advanced machine learning algorithms, which allow the software to learn from users’ needs.
  • The analysis step presents the information extracted from the structure and the recognized patterns. It also compares documents and traces the relationships between them.
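Two of the example questions above can be made concrete with a small sketch. It assumes that feature extraction has already produced one record per poem containing the measured width of each printed line; the record layout, the sample values, and the function names are invented for illustration and do not describe VisualPage's actual data model.

```python
from statistics import mean, pstdev, quantiles

# Hypothetical output of a feature-extraction step: line widths in millimetres.
poems = [
    {"title": "Poem A", "publisher": "Publisher X", "line_widths": [38, 42]},
    {"title": "Poem B", "publisher": "Publisher X", "line_widths": [50, 50]},
    {"title": "Poem C", "publisher": "Publisher X", "line_widths": [58, 62]},
    {"title": "Poem D", "publisher": "Publisher X", "line_widths": [70, 70]},
    {"title": "Poem E", "publisher": "Publisher Y", "line_widths": [80, 80, 80]},
]


def line_length_variability(records, publisher):
    """Standard deviation of line width across all poems from one publisher."""
    widths = [w for p in records if p["publisher"] == publisher
              for w in p["line_widths"]]
    return pstdev(widths)


def shortest_quarter(records, publisher):
    """Titles of poems whose average line width is in the publisher's bottom 25%."""
    averages = {p["title"]: mean(p["line_widths"])
                for p in records if p["publisher"] == publisher}
    # quantiles() needs at least two poems from the publisher
    cutoff = quantiles(averages.values(), n=4)[0]  # first quartile
    return [title for title, avg in averages.items() if avg <= cutoff]
```

Comparing `line_length_variability` for two publishers addresses the first question, and `shortest_quarter` addresses the bottom 25% query. Queries about dropped capitals or visual density would need additional extracted features that this sketch does not model.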

Together, these three interrelated tasks allow large-scale analysis and understanding of heritage books and documents without relying solely on limited human capacity. Only by examining thousands of books at a time can we complete tasks such as revealing significant design trends and styles in Victorian books, or identifying historical changes associated with a specific publisher or author.

We have looked at two applications that automatically analyse and interpret heritage and cultural documents, but image processing techniques can also help digital humanists manage and evaluate digitized paintings. The question is how, once paintings have been digitized and circulated on the Internet, we can still evaluate and analyze their usage and impact. Researchers from University College London have explored a tool named Reverse Image Lookup (RIL) to track the circulation of digitized paintings online.

In more detail, two sample paintings from the National Gallery were chosen, together with a random sample of six paintings by different artists, to be tracked and analysed. Besides RIL, two other image search tools, TinEye and Google, were also used. The investigation of online heritage paintings shows that the most popular painting (measured by number of accesses) is also the one most commonly reused elsewhere. Although this result is not exactly a breakthrough, it can motivate the development of well-tailored tools for tracking, analyzing, and evaluating the digital reuse of heritage content, rather than relying on RIL, TinEye, or Google image search, which are not specifically designed for this purpose.

To sum up, the three applications above illustrate the promising results of applying image processing techniques to the understanding and analysis of cultural and heritage content. Such techniques are becoming indispensable given the growing demand for large-scale analysis of millions of pages, pictures, and books, which is almost impossible with limited human capacity.


[1] Text to Image Linking Tool (TILT) http://dh2013.unl.edu/abstracts/ab-112.html

[2] Reading the Visual Page of Victorian Poetry http://dh2013.unl.edu/abstracts/ab-274.html

[3] Reverse Image Lookup http://dh2013.unl.edu/abstracts/ab-243.html

[4] http://www.springfieldlibrary.org/gutenberg/art.html