

The synergy between Information Visualization and Digital Humanities can yield deeper analysis and a more profound understanding of the resulting data, especially with large datasets. Visualization techniques and their respective results differ both in form and in effectiveness, so the most appropriate technique must be sought for each implementation, depending on the dataset at hand. What follows is a brief presentation of three abstracts that deal with Information Visualization and its application to Digital Humanities in the big data era.

1. On the Use of Visualization for the Digital Humanities

The focus here is on the development, impact and key role of several visualization and interaction techniques in digital humanities applications. Research activity in this domain is highlighted by statistically analyzing the use of visualization in the DH2014 contributions. Non-interactive models, whether 1-dimensional (e.g. word clouds and histograms), 2-dimensional (e.g. scatter plots and timelines) or multi-dimensional (e.g. 3D maps and 3D building renderings), are mainly used for data visualization. In addition, relational models (e.g. uni- or bidirectional relations using graphs, or hierarchical relations using trees) appear in 39.6% of the contributions, whereas 5.1% use animations to facilitate the understanding of complex data. The most common interaction is abstracting/elaborating; zooming operations belong to this category and allow the user to see more or less detail. Such interactions are called details-on-demand operations and are used by 60.3% of the contributions. With exploring operations, users can move to another area of a map, a graph or a tree. Filter interactions are the third most common interaction technique. Evaluation of visualization and interaction techniques by the digital humanities research community is of key importance for quantifying their usefulness and usability; however, only 6.8% of the DH2014 visualization applications have been evaluated so far.
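The abstracting/elaborating interaction described above can be illustrated with a minimal sketch: at low zoom levels the data are aggregated into coarse bins, and zooming in reveals finer detail on demand. The dataset and binning scheme here are hypothetical, chosen only to show the idea.

```python
def details_on_demand(values, zoom_level):
    """Aggregate a 1-D series into bins whose width shrinks as the
    user zooms in, so more detail appears at higher zoom levels."""
    bin_width = max(1, len(values) // (2 ** zoom_level))
    bins = []
    for start in range(0, len(values), bin_width):
        chunk = values[start:start + bin_width]
        bins.append(sum(chunk) / len(chunk))  # one summary value per bin
    return bins

word_counts = [3, 7, 2, 9, 4, 6, 1, 8]    # e.g. word frequencies per page
print(details_on_demand(word_counts, 0))  # fully abstracted: one bin
print(details_on_demand(word_counts, 2))  # elaborated: four finer bins
```

A real visualization would re-render a chart at each zoom step; the point is that the same underlying data serve every level of abstraction.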

2. Exploring Large Datasets with Topic Model Visualizations

Visualization based on topic models shows promising potential as a tool for exploring large datasets and is typically constructed in two stages. The first stage uses a command-line tool, such as MALLET, that produces exchangeable text files. The second stage involves either general-purpose visualization tools or topic-model-specific visualization tools. However, neither type of tool adequately serves large dataset exploration. Topic model visualizations fall mainly into traditional charts (bar charts, scatter plots, etc.), network graphs, zoomable tools and 2D matrices, but these tools are application dependent. Non-graph-based tools suited to exploring the underlying data and revealing connections exist, but they are more or less overwhelmed by large datasets. A first experiment with MALLET pre-processing and chart-based visualization failed to produce satisfying results, as the charts could not reveal all the underlying connections (e.g. connections between topics and relations between journals). Hence the need arose for a graph-based tool that explores large datasets adequately. Focusing on visualizations that improve the understanding and analysis of datasets while extracting new information, the work turned to 3D displays that would prevent information loss as much as possible. The new exploration tool is built on the JavaScript framework Famo.us, which is easier to implement and, among other features, helps provide an interactive 3D environment of better processing quality. Moreover, by implementing the notion of “knowledge discovery”, filtering out nonessential data results in a finer exploration experience that is easily manipulated by the user. By mixing alternative visualization tool types and by fine-tuning, the user can customize the visualization output.
Finally, a combination of visualization techniques is offered through a Zoomable User Interface that provides a high-level summary and enables gradual interaction with the data.


Figure 1: Zoomable operations on datasets enable revealing of detailed information [2].
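The two-stage pipeline described above can be sketched as follows: a MALLET-style document-topics file (stage one) is parsed into per-document topic weights that a visualization front end (stage two) can consume. The sample lines and the exact column layout are illustrative assumptions, not real MALLET output, whose format varies by version.

```python
def parse_doc_topics(lines):
    """Parse tab-separated rows of <doc-id> <name> <weight per topic>
    into {name: [weights]} for a chart- or graph-based visualizer."""
    docs = {}
    for line in lines:
        if line.startswith('#'):          # skip the header comment
            continue
        fields = line.strip().split('\t')
        name, weights = fields[1], [float(w) for w in fields[2:]]
        docs[name] = weights
    return docs

# Hypothetical stage-one output for two journal documents and three topics.
sample = [
    "#doc\tname\ttopic proportions...",
    "0\tjournal_a.txt\t0.60\t0.25\t0.15",
    "1\tjournal_b.txt\t0.10\t0.70\t0.20",
]
topics = parse_doc_topics(sample)
print(topics["journal_a.txt"])   # [0.6, 0.25, 0.15]
```

Keeping the two stages decoupled through a plain text file is what lets the same topic model feed either a chart-based or a graph-based visualizer.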

3. Interactive Visual Analysis of German Poetics

Multiple visualization techniques are becoming increasingly popular in the big data era as means of analyzing and understanding large datasets. Moreover, interaction can overcome static depiction and the errors that are inevitable with automatic processing. Interaction also enables the user to navigate through the data and check their accuracy and validity, facilitating distant and close reading; a useful feature in analyzing text documents. Such interaction is achieved by improving the analytic process and by developing more advanced visual representation and analysis. The project that pursues this goal for text documents is called ePoetics; it is built on a multi-layered interactive visualization method, enabling both overview and detail as well as analysis and comparison between different text sources.


Figure 2: Multi-layered data depiction: The deeper the user goes, the more information is revealed [3]

The core of the project is the visualized and analyzed depiction of a number of German poetics, i.e. literary commentaries digitized through double keying. The hierarchical and navigational design of this visualization concept helps the researcher look through specific documents by form or by query, also enabling distant and close reading on multiple abstraction levels. This abstraction is achieved through a variety of pictograms and charts that provide an appropriate foundation for lengthier analysis and the development of ideas. Simultaneously, the addition of facsimiles provides the analyst with non-textual information, such as parts of manuscripts or images. This approach offers more flexible mixing of visual abstractions, able to summarize the most important data. For example, a user may choose a search keyword and enable pictograms for navigating relevant passages and identifying citations. A second approach, however, emphasizes the analysis of multiple texts at once. This focus+context method offers a considerably easier way to switch between distant and close reading and to compare texts from different authors or periods. Lastly, positive feedback encourages pursuing a comprehensive analysis approach, encompassing automatic processing that merges text with other data, in the hope of providing a beneficial visualization technique for literary works.
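The keyword-driven switch between distant and close reading mentioned above can be illustrated with a small sketch: distant reading marks which passages contain the keyword (the role the pictograms play), while close reading returns the full text of one passage. The passages below are invented for illustration and are not from the ePoetics corpus.

```python
passages = [
    "Tragedy is an imitation of an action that is serious.",
    "Comedy is an imitation of characters of a lower type.",
    "Every tragedy falls into two parts, complication and unravelling.",
]

def distant_reading(keyword):
    """Return the indices of passages mentioning the keyword
    (an overview marker, like a pictogram next to a page)."""
    return [i for i, p in enumerate(passages) if keyword in p.lower()]

def close_reading(index):
    """Return the full text of one passage for detailed study."""
    return passages[index]

hits = distant_reading("tragedy")
print(hits)                    # [0, 2]
print(close_reading(hits[0]))  # full text of the first matching passage
```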

The Texts in Comparison: A New Trend Arising

The first abstract is a general introduction to the use of visualization techniques in Digital Humanities. The last two abstracts focus on the use of information visualization for large datasets in the age of big data. Comparing the three abstracts, conclusions arise on two levels.

  1. Similarities: All three abstracts stress the importance of data filtering in visualization techniques. They also converge on the idea of developing more interactive user interfaces and creating an information visualization environment that is more user friendly and easily manipulated. Accordingly, the structure of the information visualization has to be multi-layered and support zooming operations, i.e. enable the display of more or less information. Moreover, the use of 3D models is highlighted as an easier and more comprehensible way of visualizing information.
  2. Distinctive points: The second abstract is unique in stressing the importance of preventing information loss, whereas the techniques presented in the third abstract mainly concern text mining and analysis.

A new trend in visualizing data is arising as the volume of information increases, and it has two main features. The first, user interaction, helps overcome static depiction and detect errors not visible through automated processing. The second is a multi-layered architecture for data visualization: the upper layers give the user an abstract representation of the available data by hiding unnecessary and potentially confusing information, whereas the deeper layers provide detailed information about specific data according to the user's needs. The latter is mainly implemented through filtering and zooming, i.e. a details-on-demand operation. Moreover, all of the above techniques should meet the need to prevent information loss. Independent of the chosen visualization technique, the new trend dictates that a combination of distant and close reading must be supported to serve the needs of Digital Humanities research in exploiting large datasets. Last but not least, the emerging trend includes the evaluation of visualization techniques in terms of effectiveness, which is of utmost importance if Digital Humanities is to benefit, enhancing human cognition in data processing and analysis.
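The multi-layered architecture described above can be sketched as a per-layer filter: upper layers show only the most relevant items, and each deeper layer lowers the threshold so more detail becomes visible. The corpus, relevance scores, and layer-to-threshold mapping are all hypothetical.

```python
def layer_view(items, layer):
    """Return the items visible at a given layer: layer 0 keeps only
    highly relevant items; each deeper layer lowers the cutoff."""
    threshold = max(0.0, 0.9 - 0.3 * layer)  # assumed per-layer cutoff
    return [name for name, score in items if score >= threshold]

# Hypothetical items from a text corpus with relevance scores.
corpus = [("Aristotle", 0.95), ("Poetics", 0.9),
          ("meter", 0.5), ("folio", 0.2)]
print(layer_view(corpus, 0))   # overview layer: strongest items only
print(layer_view(corpus, 3))   # detail layer: everything is visible
```

The same principle, with filtering applied per zoom step rather than per layer index, underlies the details-on-demand operations all three abstracts converge on.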

All in all, the ever-growing data volume of Digital Humanities requires up-to-date and capable technology in order to unleash its full potential. Dynamic, interactive depiction of information and the creation of a friendlier user environment are sure to benefit the analyst, opening new horizons for information analysis and the development of new ideas.