Natural language processing (NLP) is a field concerned with the interactions between computers and human languages. It draws not only on computer science but also on artificial intelligence and linguistics. Enabling computers to derive meaning from human (natural) language input is one of the central challenges in NLP, and it is becoming increasingly useful in many applications.
Natural language processing is involved in many research activities that are applied directly in the real world, and it can also serve as a sub-task that helps to solve larger tasks. In this paper we introduce how NLP is used in information extraction from medieval charters, in citation studies in the humanities, and in a program named CULTURA, which supports professional humanities research.
As a first example, one important issue for cultural heritage across Europe and worldwide is the interrogation of growing digital humanities collections. Once these heritage treasures have been digitized, it is still difficult to realize their full value because the collections are hard to navigate. CULTURA is a program that aims to provide researchers with personalized mechanisms for collaborative research. As the paper puts it, “a central aspect of the CULTURA environment is its use of rich metadata”: with a single operation in CULTURA, a researcher can have words and combinations of words interpreted in order to identify entities in the corpus such as people, places and dates. This process, which saves the researcher a great deal of time, is implemented using NLP.
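To make the idea of identifying people, places and dates more concrete, the following is a minimal sketch and not CULTURA's actual implementation. It assumes a tiny hand-made name list (the sets PEOPLE and PLACES, the YEAR pattern and the function tag_entities are all hypothetical); a real system would rely on much larger resources or a trained model.

```python
import re

# Hypothetical gazetteers, invented for illustration only.
PEOPLE = {"Thomas", "William"}
PLACES = {"York", "Dublin"}
# Assumption: a "date" is simply a four-digit year starting with 1.
YEAR = re.compile(r"1[0-9]{3}")


def tag_entities(text):
    """Return (surface form, label) pairs in order of appearance."""
    entities = []
    # Split the text into runs of letters or runs of digits.
    for match in re.finditer(r"[A-Za-z]+|\d+", text):
        token = match.group()
        if token in PEOPLE:
            entities.append((token, "PERSON"))
        elif token in PLACES:
            entities.append((token, "PLACE"))
        elif YEAR.fullmatch(token):
            entities.append((token, "DATE"))
    return entities


sample = "Thomas granted land in York to William in 1641."
print(tag_entities(sample))
```

A lookup of this kind only recognizes names it already knows, which is one reason why real projects combine it with context-sensitive NLP techniques.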
Moreover, the ChartEx (Charter Excavator) project aims to develop an innovative way for researchers to make use of medieval charters by combining natural language processing and data mining techniques. The NLP component can determine which words refer to the same objects in the context of a given sentence or passage of text; for this project, “the NLP can generate automatic markup of entities in a larger corpus of charters”. Using the output of the NLP component, data mining can then extract the relationships between these entities.
Figure 1. George’s charter from England
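The step from entity markup to relationships can be illustrated with a very simple stand-in technique: counting which entities occur together in the same sentence. This is only a sketch under stated assumptions and not the data mining method used by ChartEx; the function cooccurring_pairs and the sample markup are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(marked_sentences):
    """Count how often two marked-up entities share a sentence."""
    counts = Counter()
    for entities in marked_sentences:
        # Sort the distinct entities so each pair has one canonical order.
        unique = sorted(set(entities))
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts


# Hypothetical NLP output: one list of (entity, label) pairs per sentence.
marked = [
    [("Thomas", "PERSON"), ("York", "PLACE")],
    [("Thomas", "PERSON"), ("William", "PERSON"), ("York", "PLACE")],
]
pairs = cooccurring_pairs(marked)
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs that co-occur often are only candidates for a relationship; deciding what kind of relationship holds (for example, who granted land to whom) needs further analysis.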
Another application of NLP is in citation studies in the humanities. The tool uses natural language processing techniques to organize citations by frequency, location-in-document and polarity. In a scatter plot of the results for the three categories, the disciplinary differences in frequency and polarity are easy to see. In the future, the tool will refine the resolution of its polarity scores, examine the power of crowdsourced classifications and provide new document layouts.
Figure 2. Citation location-in-document and polarity
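The three measures named above (frequency, location-in-document and polarity) can be sketched as follows. This is an illustrative approximation and not the tool described in the paper: the citation pattern "(Author Year)", the word lists POSITIVE and NEGATIVE, and the function citation_features are all assumptions made for the example.

```python
import re

# Hypothetical polarity lexicons, invented for illustration only.
POSITIVE = {"convincingly", "rightly", "influential"}
NEGATIVE = {"fails", "overlooks", "mistaken"}
# Assumption: citations look like "(Smith 1990)".
CITATION = re.compile(r"\(([A-Z][a-z]+) (\d{4})\)")


def citation_features(sentences):
    """Collect frequency, relative locations and polarity per cited work."""
    features = {}
    total = len(sentences)
    for index, sentence in enumerate(sentences):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        # Sentence polarity: positive cue words minus negative cue words.
        polarity = len(words & POSITIVE) - len(words & NEGATIVE)
        for author, year in CITATION.findall(sentence):
            entry = features.setdefault(
                f"{author} {year}",
                {"frequency": 0, "locations": [], "polarity": 0},
            )
            entry["frequency"] += 1
            # Location as a fraction of the document (0.0 = first sentence).
            entry["locations"].append(index / total)
            entry["polarity"] += polarity
    return features


document = [
    "Smith convincingly dates the charter (Smith 1990).",
    "This reading overlooks the witness list (Jones 2001).",
    "Later work builds on it (Smith 1990).",
    "No citation here.",
]
features = citation_features(document)
ranked = sorted(features, key=lambda k: features[k]["frequency"], reverse=True)
print(ranked)
```

Plotting the location values against the polarity values for each cited work would give a scatter plot in the spirit of Figure 2, although a word-list score is far cruder than a trained polarity classifier.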
These three papers introduce only a few specific uses of NLP. In the future, it is possible that human-level NLP will make computers as intelligent as people: they would learn from the information available on the internet and apply what they find to solve real-world problems.
- ChartEx: a project to extract information from the content of medieval charters and create a virtual workbench for historians to work with this information. (http://dh2013.unl.edu/abstracts/ab-431.html)
- CULTURA: Supporting Professional Humanities Researchers. (http://dh2013.unl.edu/abstracts/ab-225.html)
- Citation studies in the humanities. (http://dh2013.unl.edu/abstracts/ab-353.html)
- Natural language processing. (http://en.wikipedia.org/wiki/Natural_language_processing)