In the era of WYSIWYG (What You See Is What You Get), visual data has never played a more important role in the study of the humanities than it does today. Humanities researchers not only collect born-digital data such as television programs, films and photographs, but also digitize physical sources on a massive scale, such as handwritten manuscripts, woodblock prints and paintings, bringing the study of the humanities into the digital age. However, without an efficient search engine to help researchers find what they seek, this huge amount of digital visual data would be of little use. There are two major types of image retrieval methods: one is based on the metadata associated with the images, the other on the content of the images themselves, namely the features of the objects they depict [1]. To improve retrieval accuracy, the two types of methods should be used to complement each other.

Two themes were discussed in the workshop proposed by Roeland Ordelman et al. [2], one of which concerns automatically generating metadata for better image search. The metadata of an image may include its size, resolution, creation date and some manually added indices (or tags) [1]. In this case, however, rich and professionally curated metadata are required to obtain a desirable search result. Given that visual data archives are usually enormous, attributing such metadata to images manually seems impossible. A system that can generate accurate metadata automatically would solve this problem. Carl Stahmer [3] has developed such an application, written in C++ and called Arch-V, which indexes images automatically based on their content. First, Arch-V extracts features from a training image set and uses them to create a visual dictionary. Then every image in the archive outside the training set goes through the same feature-extraction process, only this time the features are represented as a normalized histogram; this histogram is used to index the image by matching it against the visual dictionary.
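The dictionary-then-histogram pipeline described above is the classic bag-of-visual-words approach. The following is a minimal sketch of that idea, not Arch-V's actual implementation: local feature descriptors (here just arrays of numbers) are clustered into "visual words" with plain k-means, and each image's descriptors are then quantized into a normalized word-count histogram. Function names and parameters are illustrative assumptions.

```python
import numpy as np

def build_visual_dictionary(descriptors, k, n_iter=20, seed=0):
    """Cluster local feature descriptors into k 'visual words'
    using a plain k-means loop (illustrative, not Arch-V's code)."""
    descriptors = np.asarray(descriptors, dtype=float)
    rng = np.random.default_rng(seed)
    # initialize cluster centers from randomly chosen descriptors
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)].copy()
    for _ in range(n_iter):
        # assign each descriptor to its nearest center
        dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its assigned descriptors
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def bovw_histogram(descriptors, dictionary):
    """Quantize an image's descriptors against the visual dictionary
    and return an L1-normalized histogram of visual-word counts."""
    descriptors = np.asarray(descriptors, dtype=float)
    dists = np.linalg.norm(descriptors[:, None] - dictionary[None], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(dictionary)).astype(float)
    return hist / hist.sum()
```

In a real system the descriptors would come from a detector such as SIFT or ORB; the resulting histograms can then be compared directly, which is what makes this representation usable as an index.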

The larger a digital archive is, the harder it becomes to find a particular image in it. In a way, several local archives are better than one global archive in terms of both search efficiency and storage capacity; similarly, several categorized archives are better than one generalized archive. Various archives may therefore be built by different research institutions under the same category of the same subject. However, the metadata of the same image may be incongruous across archives. Take woodblock prints as an example: one institution's analysis may conclude that a print is about 1,000 years old, while another concludes it is around 500 years old; similar disagreements may arise over the content of the prints. When two institutions attribute different metadata to the same image in their own archives, one cannot find the print in both archives using the same metadata query. Doug Reside [4] points out that the one piece of information free of such disagreement is the image itself: searching two different archives with the image of a print will still return detailed information about the same print. Likewise, when researchers have little or no knowledge of an image's metadata, the best way to find information about it in an archive is to search based on the image's content. An example involving theater photographs is also presented in [4].
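Searching an archive by image content, as opposed to metadata, amounts to ranking the archive's images by their similarity to the query image. One hedged sketch of this step, assuming each image has already been reduced to an L1-normalized feature histogram, uses histogram intersection as the similarity measure (the function name and entry names are hypothetical):

```python
import numpy as np

def rank_by_content(query_hist, archive_hists):
    """Rank archive entries by histogram intersection with the query
    (1.0 = identical histograms, 0.0 = no overlap).
    Histograms are assumed to be L1-normalized."""
    scores = {name: float(np.minimum(query_hist, hist).sum())
              for name, hist in archive_hists.items()}
    # highest-scoring (most similar) entries first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Because the comparison needs only the query image itself, the same ranking can be run against two archives with conflicting metadata and still surface the same print in both.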

We cannot really say that metadata-based image retrieval is better than content-based image search, or the other way around. By combining both methods, an image search engine lets researchers explore an archive more flexibly: a researcher who has an image in hand can search with that image; otherwise, he or she needs to know some of the image's metadata. The best retrieval results come from searching with both images and their associated metadata.


[2] Ordelman, R., Kemman, M., Kleppe, M. & de Jong, F. (2014) Sound and (Moving) Images in Focus – How to Integrate Audiovisual Material in Digital Humanities Research. In Proceedings of the Digital Humanities Conference. Lausanne, Switzerland.

[3] Stahmer, C. (2014) Arch-V: A Platform for Image-Based Search and Retrieval of Digital Archives. In Proceedings of the Digital Humanities Conference. Lausanne, Switzerland.

[4] Reside, D. (2014) Using Computer Vision to Improve Image Metadata. In Proceedings of the Digital Humanities Conference. Lausanne, Switzerland.