The desire to amass knowledge is probably the single most important trait that distinguishes human beings from any other creature on the planet. To that end, building a pool of infinite knowledge that would comprise the entire human history is a very tempting idea. It seems that steps are already being taken in this direction, and a trend toward building a visual and easily accessible archive of information can be discerned. Three abstracts that lead to this conclusion are presented in this article.
But first things first. Before building a knowledge base with our known history, we should take into account the large number of date representations used in ancient and more recent documents, so as to be able to correctly time stamp them and arrange them in chronological order.
This abstract presents the problem of encoding historical dates within a project called Map of Early Modern London (MoEML). The problem, of course, arises from the multiple date conventions used across history and across different geographical regions. One example is the Gregorian versus the Julian calendar: in this period the two differ by ten days and, starting from the 12th century, they are also paired with different dates for New Year’s Day.
This particular project spans the time of maximum calendar confusion for England. The dates encoded in the project need to be as accurate as possible, as the project also aims to plot event sequences on timelines and compare data with other projects. The difficulty of encoding these dates comes in part from the different date representations (Gregorian, Julian, regnal, Anno Mundi), but also from the way in which writers decided to treat the issue. The terms “Old Style” and “New Style” have been used loosely to refer to the date on which the year begins, to the leap-year convention that separates the Julian and Gregorian calendars, or to both. As expected, this makes it very difficult to determine the exact date of an event.
The encoding process used in the project does not alter the original dates, as this would add unnecessary overhead to the encoders. The standard makes use of the original date together with information that can be used to compute the accurate date or date-range in the Gregorian calendar.
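To make the idea of "original date plus conversion information" concrete, here is a minimal sketch in Python. It is not MoEML's actual encoding or code: the function names and the `year_starts_march_25` flag are assumptions made for illustration. It keeps the date as written and derives the Gregorian equivalent from a Julian-calendar date, optionally treating the year as beginning on 25 March (the old English civil reckoning).

```python
def julian_to_jdn(year, month, day):
    """Julian Day Number for a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_gregorian(jdn):
    """Gregorian (year, month, day) for a Julian Day Number."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097
    c = a - (146097 * b) // 4
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


def to_gregorian(year, month, day, year_starts_march_25=False):
    """Convert a Julian date, as written in the source, to Gregorian.

    year_starts_march_25 is a hypothetical flag: when set, dates before
    25 March are taken to belong to the following 1 January-based year.
    """
    if year_starts_march_25 and (month, day) < (3, 25):
        year += 1
    return jdn_to_gregorian(julian_to_jdn(year, month, day))


# The original date is left untouched; the conversion is computed on demand.
print(to_gregorian(1602, 3, 24, year_starts_march_25=True))
```

The example date is the death of Elizabeth I, recorded as 24 March 1602 under the old reckoning. Regnal and Anno Mundi dates would need their own lookup tables, which this sketch does not attempt.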
So it seems that steps are being taken in the direction of storing historical dates and converting them to current dates.
With the large amount of information already recorded and digitized, a good tool would be one that could spot irregularities or errors in the datasets. This abstract now comes up with a solution for this, using a temporal GIS (http://en.wikipedia.org/wiki/Geographic_information_system) viewer to find irregularities in historical GIS databases.
The project presented is an effort to improve spatio-temporal GIS by changing the database management system from the popular RDBMS (http://en.wikipedia.org/wiki/Relational_database_management_system) to a linked-list structure using pointers, called ILE (Intentionally-Linked Entities). This approach allows for quick and effortless searches within the database. Relational databases are poorly suited for temporal applications, as the temporal data is represented by time-stamps, and optimizing searches over them requires extra effort.
In testing this new approach, CHGIS was used (China Historical GIS – http://www.fas.harvard.edu/~chgis/) because of the abundance of data and very good temporal resolution (1 year). After testing two GIS viewers, Google Earth and ESRI ArcGIS Explorer, the authors decided to write their own HGIS viewer, Clearview. Clearview managed to expose several irregularities in the database and, with the help of some additional software developed by the team, detected 124 out of 544 instances where the provincial capitals lay outside of their province’s boundary.
Irregularities may or may not be resolved, as the information sources may no longer be available, but discovering such inconsistencies is an important first step.
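The capital-outside-its-province check can be illustrated with a standard point-in-polygon test. The sketch below is not Clearview or the team's additional software, and it does not use the ILE structure; it is a plain ray-casting test in Python over hypothetical records (the field names `province`, `year`, `capital` and `boundary` are assumptions), with boundaries simplified to a single ring of (x, y) vertices.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right crosses."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Only edges that straddle the horizontal line through y can cross.
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def find_misplaced_capitals(records):
    """Return (province, year) pairs whose capital lies outside the boundary."""
    flagged = []
    for rec in records:
        x, y = rec["capital"]
        if not point_in_polygon(x, y, rec["boundary"]):
            flagged.append((rec["province"], rec["year"]))
    return flagged


# Hypothetical records, one per province and year.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
records = [
    {"province": "A", "year": 1500, "capital": (5, 5), "boundary": square},
    {"province": "B", "year": 1500, "capital": (15, 5), "boundary": square},
]
print(find_misplaced_capitals(records))
```

With one record per year, running the same check across the whole time range is what turns a spatial test into a temporal consistency check. Real boundaries with holes or multiple parts would need a proper geometry library.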
Finally, the moment you have been waiting for. The third abstract describes the contributions that Big Data could make to thoroughly documenting our past. The Collaborative for Historical Information and Analysis (CHIA – http://chia.pitt.edu) will not only store demographic, social, political, economic, health and other data, but also display the patterns that emerge from analyzing the interactions between these levels and provide visual representations of the results. It is an exciting project with the end goal of providing the common user with an interface to explore and learn from this vast intercultural historical database.
The project is ongoing and is in the process of collecting as much historical data as possible from a variety of sources and creating an infrastructure for gathering it (with techniques such as crowd-sourcing – http://en.wikipedia.org/wiki/Crowdsourcing), storing it and eventually analyzing it. The data is also to be “colonized” by other projects, such as CLIO at Boston University, and made available for collaboration with scientists all over the world. The data resources are expected to grow over time to several terabytes. As such, the project is well suited for the field of Big Data (http://en.wikipedia.org/wiki/Big_data).
One of the key elements of the project is developing a platform from which this enormous archive can be accessed via a “faceted search”. This will allow users to immerse themselves fully in the vast amount of information and take full advantage of the world-historical analytic system. The search system will let users filter the desired data by space, time and topic. Once the search criteria are entered, the results are presented to the user on a map, with the corresponding studies made available upon click.
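A faceted search over space, time and topic can be pictured as a set of independent, optional filters. The following Python sketch only illustrates that idea; it is not CHIA's platform or API, and the `Study` class, its fields and the `faceted_search` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    title: str
    topic: str
    year: int
    lat: float
    lon: float


def faceted_search(studies, bbox=None, years=None, topic=None):
    """Filter studies by any combination of facets; None means 'no filter'.

    bbox  -- (south, west, north, east) in degrees
    years -- (first_year, last_year), inclusive
    topic -- exact topic label
    """
    results = []
    for s in studies:
        if topic is not None and s.topic != topic:
            continue
        if years is not None and not (years[0] <= s.year <= years[1]):
            continue
        if bbox is not None:
            south, west, north, east = bbox
            if not (south <= s.lat <= north and west <= s.lon <= east):
                continue
        results.append(s)
    return results


# Hypothetical sample data.
studies = [
    Study("Plague mortality in London", "health", 1603, 51.5, -0.1),
    Study("Grain prices in Beijing", "economy", 1750, 39.9, 116.4),
    Study("Parish registers in London", "demographics", 1650, 51.5, -0.1),
]
hits = faceted_search(studies, bbox=(50.0, -1.0, 52.0, 1.0), years=(1600, 1625))
print([s.title for s in hits])
```

Each hit carries its own coordinates, so plotting the results on a map and linking each marker to its study is a natural next step. A simple bounding box like this one does not handle regions that cross the 180° meridian.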
We have presented the problem of dates in history, the problem of irregularities in the data sets, the visual representation of temporal data on maps and the big project that aims to bring together all the known historical archives into a single platform. There are tools, methods and standards being developed for overcoming the foreseeable problems that may occur in the endeavor of creating a Digitized World History.
We can conclude that there is a trend set in this direction, and we have the choice of either waiting for the product of this collaborative effort to be made available or joining in on the action!