“Ich bin ein Berliner”: this famous statement by US President John F. Kennedy is often quoted because the phrase is said to carry a double meaning in German. Some languages are commonly regarded as structurally more complex and therefore harder to learn. Political speeches, at least, are well recorded, analyzed and archived.
In that spirit, a German institution called the “Archiv für Gesprochenes Deutsch” (AGD) http://agd.ids-mannheim.de/index.shtml is working on preserving historical audio recordings. These recordings are stored in a collection called the “Datenbank für Gesprochenes Deutsch” (DGD) http://dgd.ids-mannheim.de, which is accessible over the internet. The main goal of this archive is to keep a historical record of language variation between different countries, and also between the states or regions of a single country. These dialects are meaningful in the cultural history of their regions because they matter a great deal to the people who live there.
But how can these audio libraries be made compatible with our (internet) search engines? How is it possible to find and explore such data without wasting an enormous amount of time? An audio recording isn’t like a book that you can skim through to get an idea of its content.
Since the German language is very sensitive to pronunciation, I went looking for an answer and discovered EXMARaLDA, a project currently in the German DH news that is working on exactly this question.
EXMARaLDA is an acronym for “Extensible Markup Language for Discourse Annotation”. It is a system of concepts, data formats and tools for the computer-assisted transcription and annotation of spoken language, and for the construction and analysis of spoken language corpora.
The idea is to attach a digital structure to an audio file in order to describe its content better and to reproduce it more faithfully. It also means that the audio file can easily be quoted, translated and classified, together with all the observations needed for language statistics on words, sentences and other features, which are widely used in politics.
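To make the idea of a "digital structure" more concrete, here is a minimal sketch in Python of what a time-aligned transcript could look like and how it could be read. The XML layout, the element and attribute names, the file name speech.wav, the speaker label and the timings are all my own assumptions for illustration; this is not the actual EXMARaLDA file format.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified transcript. The element names, the audio file
# name and the timings are invented; this is NOT the real EXMARaLDA schema.
TRANSCRIPT = """
<transcript audio="speech.wav">
  <timeline>
    <point id="T0" time="0.0"/>
    <point id="T1" time="1.8"/>
    <point id="T2" time="3.1"/>
  </timeline>
  <tier speaker="SPK0" category="words">
    <event start="T0" end="T1">Ich bin</event>
    <event start="T1" end="T2">ein Berliner</event>
  </tier>
  <tier speaker="SPK0" category="translation">
    <event start="T0" end="T2">I am a Berliner</event>
  </tier>
</transcript>
"""


def load_events(xml_text):
    """Return every annotated event with its start and end time in seconds."""
    root = ET.fromstring(xml_text.strip())
    # Map timeline point ids (e.g. "T1") to positions in the audio file.
    times = {p.get("id"): float(p.get("time")) for p in root.iter("point")}
    events = []
    for tier in root.iter("tier"):
        for ev in tier.iter("event"):
            events.append({
                "speaker": tier.get("speaker"),
                "category": tier.get("category"),
                "start": times[ev.get("start")],
                "end": times[ev.get("end")],
                "text": ev.text.strip(),
            })
    return events


events = load_events(TRANSCRIPT)
for e in events:
    print(f'{e["start"]:5.1f}-{e["end"]:5.1f}s [{e["category"]}] {e["text"]}')
```

Because every piece of text points back to positions on a shared timeline, a quotation or a translation can always be traced to the exact moment in the recording.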
Another goal of EXMARaLDA is to make it easier to correct or verify a transcription by offering search queries during the transcription process itself.
EXMARaLDA supports a corpus-based, dynamic workflow by giving the user the possibility to run search queries while still transcribing, to check the accuracy of existing transcriptions during analysis (e.g. by jumping from a specific query result directly into the editable transcript) and, if necessary, to correct or supplement them.
Indeed, with a structure like this, browsing and querying the recordings become much easier. It also gives language statisticians the possibility to explore the data easily, as they would in a database. Finally, it could be used to “enhance” the content of these audio archives and give them some “Google-ability”.
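As a rough illustration of why structured transcripts are easier to query, here is a small Python sketch that searches time-aligned segments for a word and counts word frequencies. The segment data, the timings and the function names are invented for this example; the real EXMARaLDA tools provide their own search facilities, which this does not reproduce.

```python
import re
from collections import Counter

# Hypothetical time-aligned segments: (start in s, end in s, text).
# Texts and timings are made up for illustration.
SEGMENTS = [
    (0.0, 1.8, "Guten Morgen"),
    (1.8, 3.1, "Ich bin ein Berliner"),
    (3.1, 5.0, "Ich bin müde"),
]


def tokenize(text):
    """Split a segment into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def search(segments, word):
    """Return the segments containing the word, with their timestamps."""
    word = word.lower()
    return [seg for seg in segments if word in tokenize(seg[2])]


def word_frequencies(segments):
    """Count how often each word occurs across all segments."""
    counts = Counter()
    for _, _, text in segments:
        counts.update(tokenize(text))
    return counts


hits = search(SEGMENTS, "Berliner")
freq = word_frequencies(SEGMENTS)
for start, end, text in hits:
    print(f"{start:.1f}-{end:.1f}s: {text}")
print(freq.most_common(2))
```

Each hit comes back with its timestamps, so a listener could jump straight to the right place in the audio instead of playing the whole recording.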
The German Digital Humanities community is very active. A lot of projects linking history, language and technology are described on their website. The last big event was the international conference of the Alliance of Digital Humanities Organizations, which took place at the University of Hamburg in July 2012. http://www.dh2012.uni-hamburg.de/