Robustness of emotion extraction from 20th century English books

Oct 9, 2013

Alberto Acerbi

Vasileios Lampos

R. Alexander Bentley

Abstract

In this paper, we test the robustness of emotion extraction from English language books published in the 20th century. Our analysis is performed on a sample of the 8 million digitized books available in the Google Books Ngram corpus by applying three independent emotion detection tools: WordNet Affect, Linguistic Inquiry and Word Count, and a recently proposed ‘Hedonometer’ method. We also assess the statistical robustness of the extracted patterns as well as their outputs on specific parts of speech. The analysis confirms three main results: the existence of recognizable periods of positive and negative ‘literary affect’ from 1900 to 2000, a general decrease in the usage of emotion-related words in printed books that lasts at least until the 1980s, and, finally, a divergence between American and British books, with the former using more emotion-related words from the 1960s.
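The core quantity behind a "decrease in the usage of emotion-related words" can be illustrated with a minimal sketch: for each publication year, the share of all word occurrences that belong to an emotion lexicon. This is not the paper's pipeline, nor the interface of WordNet Affect, Linguistic Inquiry and Word Count, or the Hedonometer. The function name `emotion_share_by_year`, the `(word, year, count)` input layout, and the toy lexicon and counts are all assumptions made for illustration.

```python
from collections import defaultdict


def emotion_share_by_year(counts, lexicon):
    """Return {year: share of word occurrences that fall in the lexicon}.

    counts  -- iterable of (word, year, n) tuples (hypothetical layout)
    lexicon -- set of lowercase emotion-related words (hypothetical list)
    """
    total = defaultdict(int)
    emotional = defaultdict(int)
    for word, year, n in counts:
        total[year] += n
        if word.lower() in lexicon:
            emotional[year] += n
    # Skip years with no occurrences to avoid dividing by zero.
    return {
        year: emotional[year] / total[year]
        for year in sorted(total)
        if total[year] > 0
    }


# Toy data, invented for illustration only (not taken from the corpus).
toy_counts = [
    ("happy", 1900, 30), ("table", 1900, 70),
    ("happy", 1980, 10), ("sad", 1980, 10), ("table", 1980, 80),
]
toy_lexicon = {"happy", "sad"}

shares = emotion_share_by_year(toy_counts, toy_lexicon)
for year, share in shares.items():
    print(year, round(share, 3))
```

Running the same function with word lists drawn from different tools, and comparing the resulting yearly series, is one simple way to think about the robustness check the abstract describes.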

Type

Conference paper

Publication

Acerbi A., Lampos V., Bentley R. A. (2013), Robustness of emotion extraction from 20th century English books, in IEEE BigData 2013 Proceedings, pp. 1–8

Last updated on Oct 9, 2013

Quantitative Analysis of Large Scale Cultural Data

Authors

Alberto Acerbi

Associate Professor
