Nicolas Pröllochs

Tenure-Track Professor of Data Science at JLU Giessen


About

I am a Tenure-Track Professor of Data Science at the Faculty of Economics and Business Studies of the University of Giessen. Before joining the University of Giessen, I worked as a postdoctoral researcher in machine learning at the University of Oxford. Prior to that, I headed my own research group at the University of Freiburg, where I also obtained my Ph.D. in Information Systems. My research focuses on data science methods and computational techniques for understanding and predicting human decision-making in the digital age. Current research projects apply machine learning and natural language processing to a broad selection of topics, including (1) social networks, (2) recommender systems, and (3) financial markets. Apart from academic research, I am a passionate programmer and have developed multiple widely used R packages (> 150,000 downloads via CRAN) for text mining and machine learning.

 

Featured Research

Negativity Drives Online News Consumption

Online media is important for society in informing and shaping opinions, hence raising the question of what drives online news consumption. Here, we analyze the causal effect of negative and emotional words on news consumption using a large online dataset of viral news stories. Specifically, we conducted our analyses using a series of randomized controlled trials (N = 22,743). Our dataset comprises ∼105,000 different variations of news stories from Upworthy.com that generated ∼5.7 million clicks across more than 370 million overall impressions. Although positive words were slightly more prevalent than negative words, we found that negative words in news headlines increased consumption rates (and positive words decreased consumption rates). For a headline of average length, each additional negative word increased the click-through rate by 2.3%. Our results contribute to a better understanding of why users engage with online media.

Co-authored with Claire E. Robertson (NYU), Kaoru Schwarzenegger (ETH Zurich), Phillip Parnamets (Karolinska Institutet), Jay J. Van Bavel (NYU), Stefan Feuerriegel (LMU Munich)

Accepted at Nature Human Behaviour (preprint available here)


Moralized Language Predicts Hate Speech on Social Media

Hate speech on social media threatens the mental health of its victims and poses severe safety risks to modern societies. Yet, the mechanisms underlying its proliferation, though critical, have remained largely unresolved. In this work, we hypothesize that moralized language predicts the proliferation of hate speech on social media. To test this hypothesis, we collected three datasets consisting of N = 691,234 social media posts and 35.5 million corresponding replies from Twitter that have been authored by societal leaders across three domains (politics, news media, and activism). Subsequently, we used textual analysis and machine learning to analyze whether moralized language carried in source tweets is linked to differences in the prevalence of hate speech in the corresponding replies. Across all three datasets, we consistently observed that higher frequencies of moral and moral-emotional words predict a higher likelihood of receiving hate speech. These results shed new light on the antecedents of hate speech and may help to inform measures to curb its spread on social media.

Accepted at PNAS Nexus (available here)


Community-Based Fact-Checking on Twitter’s Birdwatch Platform

Twitter has recently introduced “Birdwatch,” a community-driven approach to address misinformation on Twitter. In this work, we empirically analyze how users interact with this new feature. Our empirical analysis yields the following main findings: (i) Users more frequently file Birdwatch notes for misleading than for non-misleading tweets. These misleading tweets are primarily reported because of factual errors, lack of important context, or because they contain unverified claims. (ii) Birdwatch notes are more helpful to other users if they link to trustworthy sources and if they embed a more positive sentiment. (iii) The helpfulness of Birdwatch notes depends on the social influence of the author of the fact-checked tweet. For influential users with many followers, Birdwatch notes yield a lower level of consensus among users and community-created fact checks are more likely to be seen as being incorrect. Altogether, our findings can help social media platforms to formulate guidelines for users on how to write more helpful fact checks. At the same time, our analysis suggests that community-based fact-checking faces challenges regarding biased views and polarization among the user base.

Accepted at ICWSM (available here)

 

New Teaching Materials

Slides: Exploratory Text Analysis in R

This slide deck presents an introduction to exploratory text analysis in R. The main learning goals are:

  • Exploratory text analysis: Learn how to gain an initial understanding of text data
  • Tidy text analysis: Learn how to perform text analysis in a “tidy” way using tidytext
  • Corpus analysis: Understand how to explore text corpora and perform tf-idf document weighting in R

The slides can be downloaded here.


Slides: Tidy Data Manipulation in R

This slide deck presents an introduction to tidy data manipulation in R. The main learning goals are:

  • Tidy data manipulation: Learn how to manipulate data using the “dplyr” R-package
  • Pipe operator: Learn how to increase code readability using pipes
  • Joins: Learn how to efficiently join separate datasets in R

The slides can be downloaded here.

Publications

A list of publications can be found here

R-Packages

R-packages (> 150,000 downloads) can be found here

Datasets

Datasets and further resources can be found here

Teaching Materials

Teaching materials can be found here
©2023 Nicolas Pröllochs