An iterative topic model filtering framework for short and noisy user-generated data: analyzing conspiracy theories on twitter

2022 | journal article; research paper. A publication with affiliation to the University of Göttingen.


Cite this publication

An iterative topic model filtering framework for short and noisy user-generated data: analyzing conspiracy theories on twitter
Kant, G.; Wiebelt, L.; Weisser, C.; Kis-Katos, K.; Luber, M. & Säfken, B. (2022)
International Journal of Data Science and Analytics. DOI: https://doi.org/10.1007/s41060-022-00321-4

Documents & Media

s41060-022-00321-4.pdf (5.58 MB, Adobe PDF)

License

Published Version

Attribution 4.0 CC BY 4.0

Details

Authors
Kant, Gillian; Wiebelt, Levin; Weisser, Christoph; Kis-Katos, Krisztina; Luber, Mattias; Säfken, Benjamin
Abstract
Conspiracy theories have seen a rise in popularity in recent years. Spreading quickly through social media, their disruptive effect can lead to a biased public view on policy decisions and events. We present a novel approach for LDA pre-processing called Iterative Filtering to study such phenomena based on Twitter data. In combination with Hashtag Pooling as an additional pre-processing step, we are able to achieve a coherent framing of the discussion and topics of interest, despite the inherent noisiness and sparseness of Twitter data. Our novel approach enables researchers to gain detailed insights into discourses of interest on Twitter, allowing them to iteratively identify tweets that are related to an investigated topic of interest. As an application, we study the dynamics of conspiracy-related topics on US Twitter during the last four months of 2020, which were dominated by the US Presidential Elections and Covid-19. We monitor the public discourse in the USA with geo-spatial Twitter data to identify conspiracy-related content by estimating Latent Dirichlet Allocation (LDA) Topic Models. We find that in this period, the usual conspiracy-related topics played a marginal role in comparison with dominating topics, such as the US Presidential Elections or the general discussions about Covid-19. The main conspiracy theories in this period were the ones linked to “Election Fraud” and the “Covid-19 hoax.” Conspiracy-related keywords tended to appear together with Trump-related words and words related to his presidential campaign.
Issue Date
2022
Journal
International Journal of Data Science and Analytics 
Organization
Campus-Institut Data Science 
ISSN
2364-415X
eISSN
2364-4168
Language
English
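
Illustration (not part of the original record)
The abstract names Hashtag Pooling as a pre-processing step but does not spell out how it works. The sketch below shows the general idea as it is commonly described: tweets sharing a hashtag are merged into one longer pseudo-document before topic modelling. It is a minimal illustration and not the authors' implementation. The function name `pool_by_hashtag`, the lower-casing of tags, the handling of untagged tweets, and the sample tweets are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumption: a hashtag is "#" followed by word characters.
HASHTAG = re.compile(r"#(\w+)")


def pool_by_hashtag(tweets):
    """Merge tweets that share a hashtag into one pseudo-document per tag.

    Returns (documents, unpooled): a dict mapping each lower-cased hashtag
    to the concatenated text of its tweets, and a list of tweets that
    carry no hashtag at all.
    """
    pools = defaultdict(list)
    unpooled = []
    for text in tweets:
        # A set avoids adding the same tweet twice for a repeated tag.
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if not tags:
            unpooled.append(text)
        for tag in tags:
            pools[tag].append(text)
    documents = {tag: " ".join(texts) for tag, texts in pools.items()}
    return documents, unpooled


# Invented sample tweets, for illustration only.
tweets = [
    "Counting continues in several states #Election2020",
    "Mail ballots arrive late #election2020 #vote",
    "New mask guidance announced #covid19",
    "Nothing tagged here",
]
documents, unpooled = pool_by_hashtag(tweets)
```

Each value in `documents` could then be passed to an LDA implementation as a single document. How untagged tweets are treated (dropped, kept as single documents, or assigned to a pool) is a design choice this sketch leaves open.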
