With growing computational capacity and new mathematical models, machines can now uncover intrinsic patterns in literature that human readers could not observe, classify different types of artworks without intellectually understanding their context, and simulate historical phenomena so that researchers can identify interesting properties from a new angle. Here, I select three abstracts from Digital Humanities 2013 to illustrate how machines (computers) can extend our understanding of humanistic subjects.
Learning from Data – Literary Allusion Detection
Thanks to recent efforts to digitize numerous literary works, many emerging studies in Digital Humanities derive new or better interpretations from data, i.e., through data and text mining. The first abstract, “Modeling the Interpretation of Literary Allusion with Machine Learning Techniques,” uses Machine Learning/Data Mining techniques to improve the automatic detection of “literary allusions” (cases where one text makes a reference to a passage of another text). The benchmark dataset consists of 3,400 pairs of sentences extracted from classical literature. The main task is to rank these pairs by their significance as allusions, i.e., to give the most significant allusive pairs a score of 5 and the least significant a score of 1. Previous work shows moderate results using simple sentence features. In this work, the author instead chooses a more complex feature set that includes bi-gram frequency, frequency of individual words, and edit distance. Using Machine Learning algorithms such as SVM and random forest, the author reports a promising AUC score of about 82% for allusion detection, which suggests that allusions have quantifiable patterns and that computers are able to recognize most of them in literature.
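To make the feature set more concrete, here is a minimal Python sketch of how a single sentence pair might be turned into features of the kind listed above. It is an illustration only, not the author's code: the whitespace tokenizer, the overlap counts standing in for "frequency" features, the word-level edit distance, and all function names are my own assumptions. The resulting feature dictionaries would then be passed to a classifier such as an SVM or a random forest.

```python
from collections import Counter


def tokens(sentence):
    """Lowercase the sentence and split on whitespace (a deliberately crude tokenizer)."""
    return sentence.lower().split()


def bigrams(words):
    """Return the list of adjacent word pairs."""
    return list(zip(words, words[1:]))


def edit_distance(a, b):
    """Levenshtein distance between two sequences, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def shared_count(items_a, items_b):
    """Count items that occur in both lists, respecting multiplicity."""
    return sum((Counter(items_a) & Counter(items_b)).values())


def pair_features(sent_a, sent_b):
    """Hypothetical feature vector for one candidate allusion pair."""
    words_a, words_b = tokens(sent_a), tokens(sent_b)
    return {
        "shared_words": shared_count(words_a, words_b),
        "shared_bigrams": shared_count(bigrams(words_a), bigrams(words_b)),
        "word_edit_distance": edit_distance(words_a, words_b),
    }


if __name__ == "__main__":
    print(pair_features("arms and the man I sing", "of arms and the man"))
```

Pairs that share many words and bi-grams and lie a small edit distance apart would be natural candidates for a high allusion score, although the real system presumably weighs these signals in a more refined way.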
Learning from Data – Play/Movie Property Classification
The second abstract is also about mining data with Machine Learning techniques, but at a much larger scale. “Extraction and Analysis of Character Interaction Networks from Plays and Movies,” by three researchers at Stanford University, tackles a large-scale problem: teaching computers to discover various characteristics of plays and movies across genres and over time. Their strategy is to build character interaction networks, extract network properties as features, and then train various classifiers to recognize the corresponding aspects of the media, such as Play vs. Movie or Pre-1800 Play vs. Post-1800 Play. Their result is shown below. The most prominent contribution of this project is that it provides valuable insights into the properties of plays and movies on the basis of machine intelligence rather than the judgment of human scholars. For example, examining the exceptionally accurate classifier that distinguishes plays from movies, the researchers find that it works because “plays tend to have one distinct main character with several supporting characters that interact primarily with the main role, such as Hamlet,” while movie characters usually do not interact in the same way.
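As a rough illustration of the "build a network, then extract its properties" step (and not the researchers' own code or results), the Python sketch below links characters that share a scene and computes two simple network features. Treating scene co-occurrence as an "interaction", the choice of density and Freeman degree centralization as features, and the toy scene lists are all assumptions of mine.

```python
from itertools import combinations


def build_network(scenes):
    """Link every pair of characters that appear together in a scene."""
    neighbors = {}
    for cast in scenes:
        for name in cast:
            neighbors.setdefault(name, set())
        for a, b in combinations(set(cast), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def network_features(neighbors):
    """Compute simple whole-network features for a classifier to consume."""
    n = len(neighbors)
    if n < 3:
        raise ValueError("need at least three characters")
    degrees = [len(linked) for linked in neighbors.values()]
    edges = sum(degrees) // 2  # each edge is counted once per endpoint
    max_degree = max(degrees)
    return {
        "characters": n,
        # Fraction of all possible character pairs that actually interact.
        "density": edges / (n * (n - 1) / 2),
        # Freeman degree centralization: 1.0 for a star, 0.0 for a complete graph.
        "degree_centralization":
            sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2)),
    }


if __name__ == "__main__":
    # Toy data: one lead who meets everyone else separately.
    star_like = [["Hamlet", "Horatio"], ["Hamlet", "Ophelia"], ["Hamlet", "Gertrude"]]
    # Toy data: an ensemble in which everyone shares a scene.
    ensemble = [["A", "B", "C", "D"]]
    print(network_features(build_network(star_like)))
    print(network_features(build_network(ensemble)))
```

A network dominated by one main character scores high on degree centralization, which is the kind of signal the quoted Hamlet observation points to; the features actually used in the abstract may well differ.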
An Alternative to Data Analysis: Agent-Based Modeling
Rather than making machines learn from data, as in the two previous studies, a professor from the University of South Carolina proposes an alternative approach to humanities research: Agent-Based Modeling (ABM). Instead of analyzing text corpora, ABM creates a simulated environment, deploys several agents in it, and then measures their interactions within that environment. In other words, for some long-standing humanistic questions, such as how literary genres change over centuries, ABM might provide viable answers by “showing how the behaviors of individual entities collectively alter large emergent phenomena.” The author is currently undertaking a project that applies ABM to simulate English print culture of the 17th century. Its outcome may change how we view questions such as “how do readers decide to purchase books?” or “how does censorship affect controversial political writing?”
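To give a feel for what an agent-based model looks like in code, here is a toy Python sketch in which simulated readers decide whether to buy a book by looking at a few randomly chosen acquaintances. It is purely hypothetical and far simpler than any real model of print culture: the threshold rule, the parameter values, and the names `Reader` and `simulate` are my own assumptions, not the author's design.

```python
import random


class Reader:
    """A hypothetical agent that buys once enough sampled acquaintances own the book."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.owns_book = False


def simulate(num_readers=200, steps=20, sample_size=5, initial_buyers=10, seed=0):
    """Run the toy model and return the number of owners after each step."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    readers = [Reader(rng.uniform(0.0, 0.6)) for _ in range(num_readers)]
    for reader in rng.sample(readers, initial_buyers):
        reader.owns_book = True

    history = [sum(r.owns_book for r in readers)]
    for _ in range(steps):
        # Collect decisions first so every agent sees the same state in a step.
        new_buyers = []
        for reader in readers:
            if reader.owns_book:
                continue
            peers = rng.sample(readers, sample_size)
            share = sum(p.owns_book for p in peers) / sample_size
            if share > reader.threshold:
                new_buyers.append(reader)
        for reader in new_buyers:
            reader.owns_book = True
        history.append(sum(r.owns_book for r in readers))
    return history


if __name__ == "__main__":
    print(simulate())
```

No single agent knows how widely the book has spread, yet the ownership curve for the whole population emerges from many local decisions, which is the sense of "emergent phenomena" in the quotation above.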
The three abstracts demonstrate the high potential of analyzing humanistic subjects with machine intelligence. They suggest that computers are capable of recognizing literary allusions, summarizing characteristics of plays and movies across time and genres, and offering alternative explanations of historical phenomena through simulation. Machines are getting smarter and more powerful. New technology always stems from the unsatisfied needs of human beings, and so it should serve the humanities well.
 Modelling the Interpretation of Literary Allusion with Machine Learning Techniques (http://dh2013.unl.edu/abstracts/ab-177.html)
 Extraction and Analysis of Character Interaction Networks From Plays and Movies (http://dh2013.unl.edu/abstracts/ab-251.html)
 Agent-Based Modeling and Historical Simulation (http://dh2013.unl.edu/abstracts/ab-114.html)