text mining

Why people die in novels: Testing the ordeal simulation hypothesis

What is fiction about, and what is it good for? An influential family of theories sees fiction as rooted in adaptive simulation mechanisms. In this view, our propensity to create and enjoy narrative fictions was selected and maintained due to the …

The impact of the “World’s 25 Most Endangered Primates” list on scientific publications and media

Assessing the impact of conservation campaigns is of high importance for optimizing the use of limited resources. Lists of threatened species are often used as media outreach tools, but their usefulness is rarely tested. We investigated whether the …

Cultural evolution of emotional expression in 50 years of song lyrics

The cultural dynamics of music has recently become a popular avenue of research in the field of cultural evolution, reflecting a growing interest in art and popular culture more generally. Just as biologists seek to explain population-level trends in …

Role of Neutral evolution in word turnover during centuries of English word popularity

Here, we test Neutral models against the evolution of English word frequency and vocabulary at the corpus scale, as recorded in annual word frequencies from three centuries of English language books. Against these data, we test both static and …

Birth of the cool: a two-centuries decline in emotional expression in Anglophone fiction

The presence of emotional words and content in stories has been shown to enhance a story’s memorability, and its cultural success. Yet, recent cultural trends run in the opposite direction. Using the Google Books corpus, coupled with two …

Books average previous decade of economic misery

For the 20th century since the Depression, we find a strong correlation between a ‘literary misery index’ derived from English language books and a moving average of the previous decade of the annual U.S. economic misery index, which is the sum of …

Robustness of emotion extraction from 20th century English books

In this paper, we test the robustness of emotion extraction from English language books published in the 20th century. Our analysis is performed on a sample of the 8 million digitized books available in the Google Books Ngram corpus by applying three …

The expression of emotions in 20th century books

We report here trends in the usage of “mood” words, that is, words carrying emotional content, in 20th century English language books, using the data set provided by Google that includes word frequencies in roughly 4% of all books published up to the …