
Crowdsourcing is a rather broadly defined term. First coined in 2006 as a portmanteau of the words “crowd” and “outsourcing”, it refers to “the practice of obtaining needed services, ideas or content by soliciting contributions from a large group of people” [1]. Crowdsourcing has been used successfully in fields as diverse as funding (Kickstarter), gathering and summarizing information (Wikipedia), and digitizing and archiving books (Project Gutenberg). While there is evidence of crowdsourcing as far back as the 1700s [2], it has rapidly gained popularity in the digital era, as the internet has brought together millions of potential contributors from all over the world. Crowdsourcing has the potential to be a very powerful tool in Digital Humanities research. Here I consider three projects in the field that used crowdsourcing, and compare and analyze them.

The first project abstract I examined was “From Crowdsourcing to Knowledge Communities: Creating Meaningful Scholarship Through Digital Collaboration” [3]. It describes a research project on the potential of crowdsourced information for humanities research. The project's objective was to see whether crowdsourcing can be useful to humanities researchers in their own work. It did not set out to answer more basic questions, such as whether crowdsourcing makes the process faster or cheaper. To obtain a broader range of findings, the project comprised three sub-projects, each using a different type of crowd: a paid community, an expert community, and a group of people interested in the topic. However, the results from each of these crowds were only partially successful in meeting the project goals. The abstract also discusses so-called “knowledge communities”: people with very specific knowledge about a topic who can be brought together to work towards a common research goal. Working with such a community is a variant of crowdsourcing known as community-sourcing. The study concluded that researchers should work with knowledge communities, as they are more focused on the needs of the researcher and can lead to original research.

The second project abstract, “What Do You Do With A Million Readers?” [4], provides an example of how crowdsourcing was used in practice. Over the last decade, millions of books have been digitized and made accessible online. At the same time, millions of readers have been posting comments containing reviews and summaries of these books. The project discussed in this abstract attempted to determine what kind of information about a particular book could be extracted from the crowd-generated data about it. The author created a crawler that text-mines the reviews of a book to identify its main characters and items, and to determine the relationships between them. The author then evaluated the results by comparing them to a gold standard, reporting the Precision and False Detection Rate (0.6 and 0.13 respectively for The Hobbit). The results were then visualized as a directed graph, as shown below.


Character/Object Relation Map determined through crowdsourcing

The author suggested that the user-generated metadata for a book could be used not just to obtain information about the content of the book, but also to provide insight into how people interact with what they read.
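To make the evaluation step concrete, the following is a minimal sketch of how precision against a gold standard can be computed for extracted relations. It is not the author's implementation: the `precision` function, the triple format and the sample relations are all assumptions invented for illustration. The abstract's False Detection Rate is not reproduced here because its exact definition is not given above.

```python
def precision(extracted, gold):
    """Fraction of extracted relations that also appear in the gold standard."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(gold)) / len(extracted)


# Hypothetical (source, relation, target) triples, invented for illustration.
# They are not taken from the project's data.
gold = {
    ("Bilbo", "finds", "ring"),
    ("Smaug", "guards", "treasure"),
    ("Gandalf", "visits", "Bilbo"),
}
extracted = {
    ("Bilbo", "finds", "ring"),
    ("Smaug", "guards", "treasure"),
    ("Gandalf", "visits", "Bilbo"),
    ("Gollum", "guards", "treasure"),  # not in the gold standard
}

print(f"precision = {precision(extracted, gold):.2f}")
```

In this toy example three of the four extracted relations match the gold standard, so one spurious relation is enough to lower the score noticeably.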

The third project abstract I examined was “Mapping the Emotions of London in Fiction, 1700-1900: A Crowdsourcing Experiment” [5]. While this project was not directly focused on crowdsourcing, it used crowdsourcing to achieve its goals. The author created a geographic map of London, marking distinct areas and places in the city based on the number of times each place is mentioned in literary texts. The author then went one step further and attempted to map the emotions attached to these places in the literature. For this the author used crowdsourcing: people were asked to read passages that mentioned a certain place and determine whether fear or happiness was associated with that place. The results were used to develop maps of the emotion associated with each location in the time period of the literary text. The author could then draw conclusions and identify patterns linking certain features of the city with certain emotions.

Emotional polarity of London 1800-1850. Green = Happy. Red = Fear. Grey = Neutral
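The aggregation behind such a map can be sketched as follows. This is only an illustration under stated assumptions: the abstract summary above does not say how individual responses were combined, so the majority-vote rule, the `min_share` threshold, the `aggregate_labels` function and the sample annotations are all hypothetical.

```python
from collections import Counter, defaultdict


def aggregate_labels(annotations, min_share=0.5):
    """Combine per-reader labels into one label per place.

    A place gets the most common label only if more than `min_share`
    of its annotations agree; otherwise it is marked "neutral".
    (Assumed rule, not the project's documented method.)
    """
    votes = defaultdict(Counter)
    for place, label in annotations:
        votes[place][label] += 1

    consensus = {}
    for place, counts in votes.items():
        label, top = counts.most_common(1)[0]
        share = top / sum(counts.values())
        consensus[place] = label if share > min_share else "neutral"
    return consensus


# Hypothetical crowd responses as (place, emotion) pairs, invented for illustration.
annotations = [
    ("Place A", "happiness"), ("Place A", "happiness"), ("Place A", "fear"),
    ("Place B", "fear"), ("Place B", "fear"),
    ("Place C", "fear"), ("Place C", "happiness"),
]

consensus = aggregate_labels(annotations)
for place, label in sorted(consensus.items()):
    print(place, label)
```

Requiring a clear majority means that places on which readers disagree fall into the neutral category rather than being assigned an emotion by a single vote.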

Crowdsourcing is a powerful tool because it can deliver a large set of data for analyzing a problem faster than any one person could alone. The abstracts I examined demonstrate both successful uses of crowdsourcing in humanities research and some of the difficulties encountered, and all three showed that crowdsourcing can be used in the digital humanities in different forms. In abstract [3], the researchers were rather unsatisfied with the results when they attempted to solve a non-trivial problem using crowdsourcing. However, they felt that community-sourcing offers a collaborative approach that is focused on the needs of the researcher. Abstract [4] gave an example of how the large volumes of data available from a crowd can be used to extract meaningful information. Admittedly, there is room for improvement in the precision of the results, but given the amount of “noise” in crowdsourced data, they are fairly accurate. Finally, abstract [5] provided an example of how crowdsourcing can be used by humanities researchers to obtain more subjective data. As different people respond emotionally to a given stimulus to different degrees, it was important to use crowdsourced data so that the author could obtain an average response for each region. Each of these three cases used crowdsourcing in a different way, demonstrating its vast scope.

Crowdsourcing has the potential to be a powerful tool for Digital Humanities research. However, this potential depends on whether the appropriate technique is applied to the problem being tackled.


[1] “Crowdsourcing.” Merriam-Webster.com. Merriam-Webster, n.d. Web.

[2] “A Brief History of Crowdsourcing [Infographic].” Crowdsourcing.org. 18 Mar. 2012. Web.

[3] Voss, Jon, Gabriel Wolfenstein, Zephyr Frank, Ryan Heuser, Kerri Young, and Nick Stanhope. “From Crowdsourcing to Knowledge Communities: Creating Meaningful Scholarship Through Digital Collaboration.” Dh2015.org. 1 July 2015. Web.

[4] Bandari, Roja, Timothy Tangherlini, and Vwani Roychowdhury. “What Do You Do With A Million Readers?” Dh2015.org. 3 July 2015. Web.

[5] Heuser, Ryan, Mark Algee-Hewitt, Van Tran, Annalise Lockhart, and Erik Steiner. “Mapping the Emotions of London in Fiction, 1700-1900: A Crowdsourcing Experiment.” Dh2015.org. 3 July 2015. Web.