
« Art is long and life is short and it’s not getting better. » A definitely true description of life, isn’t it?
With the successful digitization of all kinds of content, such as books, reports and many others, we are entitled to ask what we can and should do with all these books. Look at the huge Google Books collection and the libraries of all the digital publishers: obviously there is no time to read them all. A nice and simple answer would then be: « Don’t read them! ».

Instead of reading thousands of books and wasting a lot of time, you can analyse them and extract their main content by using « distant reading ».
The term comes from Franco Moretti, an Italian literary scholar and the founder of the Stanford Literary Lab. He proposes that, instead of reading individual books, we grasp their content by analysing a huge number of them at once. Computers can already sort and classify books by genre, setting or even plot, and suggest what to read next if you really liked a specific book. Moretti’s point is that even by reading a lot of books, let’s say a hundred from a specific period, we still can’t get an overall view of the literature of that period, its essence and how it evolved, because maybe 10,000 books were published at that time. We would then know barely 1% of the whole, and we, as humans, can’t get the full picture by ourselves. His conclusion: we must stop reading books.

This is an interesting point of view and a hot debate today, since distant reading is still more of a theoretical field than a developed and proven method. Many speakers discussed the topic at the Digital Humanities conference in Hamburg. You could, for instance, attend an introduction to the field or a full talk on it: « The Myth of the New: Mass Digitization, Distant Reading and the Future of the Book » by Paul Matthew Gooding, Claire Warwick and Melissa Terras.

What is interesting is that distant reading is closer to data analysis than to reading. A human and a machine can both easily identify the genre of a book, but they do not rely on the same evidence to reach the same answer. The machine relies on word frequencies, while the human is sensitive to the interactions and feelings between the characters. This leads to a paradox: we need to agree on how to « read » books at the very moment when no one wants to read them anymore… And then we fall back into qualitative analysis, which is no longer a matter of quantification.
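To make the machine’s side of this concrete, here is a minimal sketch of a genre guess based only on word frequencies. The marker word lists, the function names and the sample sentence are all assumptions made up for this illustration; real systems use far larger vocabularies and statistical models.

```python
import re
from collections import Counter

# Hypothetical marker words per genre, chosen only for illustration.
GENRE_MARKERS = {
    "gothic": {"castle", "ghost", "terror", "night"},
    "romance": {"love", "heart", "marriage", "kiss"},
}


def word_frequencies(text):
    """Return the relative frequency of each lowercase word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    return {word: n / total for word, n in Counter(words).items()}


def guess_genre(text, markers=GENRE_MARKERS):
    """Pick the genre whose marker words take up the largest share of the text."""
    freqs = word_frequencies(text)
    scores = {
        genre: sum(freqs.get(word, 0.0) for word in words)
        for genre, words in markers.items()
    }
    return max(scores, key=scores.get), scores


# Sample sentence invented for this example.
sample = "The ghost walked the castle at night, and terror filled the night."
genre, scores = guess_genre(sample)
print(genre, scores)
```

Nothing in this sketch knows who the characters are or how they feel about each other; it only counts words, which is exactly the difference from a human reader described above.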

Many tools are being developed and many research projects around the world are under way in this field. One nice line of work builds other representations of a book by extracting a graph of the relations between all the characters of a novel. Such a graph helps us understand not only the story of the book but also the whole world and period described through it.

Illustration by Joon Mo Kang of the tragedy « Hamlet » (Source: Stanford Literary Lab)
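A character graph of this kind can be approximated very roughly by counting which characters are named in the same sentence. The sketch below assumes that simple co-occurrence rule; it is not the method behind the illustration above, and the function name and the toy text are invented for the example.

```python
import re
from collections import Counter
from itertools import combinations


def character_graph(text, characters):
    """Count how often each pair of characters is named in the same sentence."""
    edges = Counter()
    for sentence in re.split(r"[.!?]+", text):
        lowered = sentence.lower()
        # Sort the names so that each pair is always stored in the same order.
        present = sorted(name for name in characters if name.lower() in lowered)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


# Toy text written for this example, not a quotation from the play.
text = (
    "Hamlet speaks with Horatio. Hamlet confronts Gertrude. "
    "Claudius watches Hamlet and Gertrude. Horatio warns Hamlet."
)
graph = character_graph(text, ["Hamlet", "Horatio", "Gertrude", "Claudius"])
for (first, second), weight in graph.most_common():
    print(first, "-", second, weight)
```

Each pair is an edge of the graph and its count is the edge weight; characters who never share a sentence simply have no edge between them.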

Distant reading does not only apply to books: it can also be used on social networks and news sources such as Twitter and newspapers, which opens up a brand new field of research. Distant reading is still a largely unexplored field and will surely become more important in the future.