At the Digital Humanities 2012 conference, Annelen Brunner from the “Institut für deutsche Sprache” presented the idea of automatically recognizing speech, thought and writing representation (ST&WR) in German narrative text. Narratology distinguishes several types of such representation. Direct representation quotes a character's words or thoughts, as in “I am hungry”. Free indirect representation renders them in the narrator’s voice, as in “Where would we get something to eat now?”. Indirect representation involves expressions like “He said he was hungry.”, and reported representation includes sentences like “They talked about lunch”. The type of ST&WR used varies not only between authors and genres but also with the time at which a text was written.

An automated annotation would therefore be very useful for analyzing large numbers of texts. It would ease the narratologist's work, since differences or regularities between two or more texts could be found much faster. Moreover, a narratologist could compare hundreds of texts at once and make statements about different authors, genres or time periods. Krestel et al. developed a tool that recognizes direct and indirect speech in order to detect second-hand information in newspapers. That method, however, operates on the vocabulary level and does not draw on narrative theory.

The automatic ST&WR recognizer is implemented in the GATE framework (General Architecture for Text Engineering) and is intended to be freely available to everyone. The project is still ongoing.

Another interesting paper, by Marco Büchler from Leipzig University, deals with ancient texts that are only partially preserved. Until now, such texts have had to be reconstructed manually with the help of dictionaries. A scholar needs extensive knowledge of the historical and cultural background in order to reconstruct damaged text successfully.

To overcome this issue, machines can be used to predict missing words based on spell-checking and text-mining algorithms. The process can be divided into two tasks: first, incorrect or incomplete words have to be identified; second, an appropriate correction has to be suggested. Such a correction can be found with the help of several methods. For example, the string-based approach compares words by their similarity at the letter level.
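To make the two tasks more concrete, here is a minimal Python sketch of a string-based approach. It is only an illustration, not the method from the paper: it assumes that Levenshtein edit distance is the letter-level similarity measure, and the function names (`edit_distance`, `find_damaged`, `suggest`) and the tiny English lexicon are made up for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: the number of single-letter insertions,
    deletions and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def find_damaged(tokens, lexicon):
    """Task 1: flag every token that is not a known word."""
    known = set(lexicon)
    return [token for token in tokens if token not in known]


def suggest(damaged, lexicon, limit=3):
    """Task 2: rank known words by letter-level distance to the damaged
    token (ties are broken alphabetically) and return the closest ones."""
    ranked = sorted(lexicon, key=lambda word: (edit_distance(damaged, word), word))
    return ranked[:limit]


if __name__ == "__main__":
    # Hypothetical lexicon and input, chosen only for illustration.
    lexicon = ["the", "of", "history", "story", "mystery", "histories"]
    tokens = ["the", "hstory", "of"]
    for token in find_damaged(tokens, lexicon):
        print(token, "->", suggest(token, lexicon, limit=2))
```

A real system would work with a lexicon of the ancient language in question and would probably combine letter-level similarity with context from text mining, since very short or heavily damaged fragments often have many equally close candidates.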

A demonstration of text reconstruction can be seen here.

In a paper by Richard Littauer from the “Universität des Saarlandes”, the authors discuss the advantages and disadvantages of blogging in relation to academic research. Through blogging, a writer reaches a wide audience and gets feedback rapidly. Compared with peer-reviewed research, blogging promotes interdisciplinary research more strongly because blogs are public. New ideas can be presented and discussed more easily, whereas in conventional peer-reviewed research it takes much longer to reach interested researchers. However, blogging also has drawbacks. Because blogged research lets everybody participate, the question of reliability and seriousness arises. Improper handling of research tasks may dilute the relevance of the research.

The challenging questions are: What are the risks of research that is accessible to the public? And how reliable is blogged research compared to peer-reviewed research?

Have a look at Replicatedtypo, a blog run by several scientists with interesting research blog entries.