
These days, Machine Learning and Big Data techniques are widespread in DH: sentiment analysis of texts, topic modeling, and so on. In this post I would like to highlight the main direction of development of DH tools: the “replacement” of people by machines, in order to give people more time and resources to develop machines that will “replace” people…

From human to machine

Let’s start with a very basic example of a tool that liberates people from monotonous work: Topic Modeling [1]. It is a technique coming from Big Data that is “commonly used to classify documents or to cluster tagged artifacts”. It is very helpful when someone has a large collection of textual documents and needs to easily find the ones corresponding to a given topic or subject. It is an obvious example of the reassignment of work: something that used to be done by people is now automated and done by a machine in a more structured, formalized way.

Topic Modeling

But when we formalize processes such as topic modeling or information search, we lose the notion of human intuition, and the final result can be correct from a formal point of view but wrong for human perception. As discussed in Kim Martin and Anabel Quan-Haase’s paper “Designing the next big thing: Randomness versus serendipity in DH tools” [2], when we try to introduce the notion of serendipity into DH tools (information retrieval being one example), we face the problem that serendipity cannot be defined formally, and it is not obvious how to build a system that will give correct results with respect to human expectations. To solve this problem we again ask ourselves, “what does a human being do in order to achieve serendipity while using formalized DH tools?”. One of the most widespread answers is randomness – a human being behaves randomly to find something among the results given by a formal system in answer to a badly formed query (i.e. one formed by a human, a non-formal system). Another possible way to introduce serendipity into a formal system, discussed in [2], is the use of social connections to provide the most relevant results to a user based not only on the user’s search query, but also on their personality as extracted from their social network. So finally we ask ourselves what a human would do in that situation, and we try to emulate human behaviour in order to liberate users from this part of the work.
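One crude way to emulate that random browsing behaviour inside a retrieval system is to shuffle a few lower-ranked results into the top of a ranked list. The sketch below is a hypothetical heuristic of my own, not an algorithm from [2]: it keeps the most relevant head of the ranking intact and surfaces a couple of random “surprises” right after it.

```python
import random

def serendipitous_ranking(ranked_results, top_k=3, n_random=2, seed=None):
    """Mix a few randomly chosen lower-ranked results into the top of the
    list, emulating the random browsing a human might do to stumble on
    something unexpected. (Illustrative heuristic only.)"""
    rng = random.Random(seed)
    head = ranked_results[:top_k]          # keep the most relevant results
    tail = ranked_results[top_k:]
    surprises = rng.sample(tail, min(n_random, len(tail)))
    # Surface the random picks right after the head, then the rest in order.
    remainder = [r for r in tail if r not in surprises]
    return head + surprises + remainder

results = [f"doc{i}" for i in range(10)]   # a ranked result list
print(serendipitous_ranking(results, seed=42))
```

The design question such a heuristic immediately raises is the one the paper discusses: how much randomness feels serendipitous rather than simply broken, and that threshold is exactly the part that resists formal definition.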

Another example of liberating humans from something that was thought to be entirely human is “digital happiness initiatives” [3]. As Big Data and Machine Learning techniques become more and more important in behavioural studies and psychology, the idea of controlling people’s behaviour comes up. Digital happiness is one such attempt: the question is, “is it possible to understand what makes a human being happy, and can we make a human being happy artificially?”. Basically, we are trying to liberate people from looking for happiness and to provide it to them automatically. We are constructing machines that are supposed to understand the notion of happiness for everybody and then provide it to the user. And again, the source of this idea is the will to liberate people from parts of their everyday routine in order to give them more time and resources to work on other things.

Digital Happiness

The question that needs to be asked at this point has already been discussed in the paper “Unhappy? There’s an App for That: Digital Happiness, Data Mining, and Networks of Well-Being” [3]: what is the barrier that should not be crossed, and where does the human being end and the machine begin?


[1] Quamen, Harvey; Hjartarson, Paul. “Big Data and the Literary Archive: Topic Modeling the Watson-McLuhan Correspondence”.

[2] Martin, Kim; Quan-Haase, Anabel. “Designing the next big thing: Randomness versus serendipity in DH tools”.

[3] Belli, Jill. “Unhappy? There’s an App for That: Digital Happiness, Data Mining, and Networks of Well-Being”.