Differential Analysis of Age, Gender, Race, Sentiment, and Emotion in Substance Use Discourse on Twitter during the COVID-19 Pandemic: An NLP Approach
Published in JMIR (Journal of Medical Internet Research), 2024
Objective:
Our study aims to analyze substance use trends at the user level across demographic dimensions such as age, gender, and race/ethnicity, with a focus on the COVID-19 pandemic. The study also establishes a baseline for substance use trends using social media data.
Methods:
The study was carried out on large-scale English-language Twitter data spanning three years (2019, 2020, and 2021) and comprising 1.05 billion posts. Following preprocessing, substance use posts were identified using our custom-trained deep learning model (RoBERTa), which yielded 9 million substance use posts. Demographic attributes (user type, age, gender, and race/ethnicity), along with the sentiment and emotions associated with each post, were then extracted via a collection of natural language processing modules. Finally, various qualitative analyses were performed to gain insight into user behavior across demographic groups.
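The classify-then-aggregate flow described above can be sketched as follows. This is a minimal illustration and not the study's implementation: the `Post` record, the `keyword_stub` classifier (a trivial stand-in for the trained RoBERTa model), the `users_per_year_and_group` helper, and the toy data are all hypothetical names and values introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class Post:
    """Hypothetical minimal post record (not the study's schema)."""
    user_id: str
    year: int
    text: str


def keyword_stub(text: str) -> bool:
    """Stand-in classifier. ASSUMPTION: the study uses a trained RoBERTa
    model here; this keyword match only keeps the sketch self-contained."""
    lowered = text.lower()
    return any(term in lowered for term in ("alcohol", "vodka", "opioid"))


def users_per_year_and_group(
    posts: Iterable[Post],
    is_substance_use: Callable[[str], bool],
    group_of: Callable[[str], str],
) -> Dict[Tuple[int, str], int]:
    """Count unique users with at least one flagged post, per (year, group)."""
    users = defaultdict(set)
    for post in posts:
        if is_substance_use(post.text):
            users[(post.year, group_of(post.user_id))].add(post.user_id)
    return {key: len(ids) for key, ids in users.items()}


# Toy data, invented purely for illustration.
toy_posts = [
    Post("u1", 2019, "Had some vodka tonight"),
    Post("u1", 2019, "More alcohol this weekend"),
    Post("u2", 2020, "Lovely weather today"),
    Post("u3", 2020, "Alcohol delivery is a thing now"),
    Post("u4", 2020, "opioid refill delayed again"),
]
toy_gender = {"u1": "male", "u2": "female", "u3": "female", "u4": "male"}

counts = users_per_year_and_group(
    toy_posts, keyword_stub, lambda uid: toy_gender.get(uid, "unknown")
)
print(counts)
```

Passing the classifier and the demographic lookup as callables keeps the aggregation independent of any particular model, so a real classifier or attribute extractor could be substituted without changing the counting logic.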
Results:
The highest level of usership in substance use (SU) discussions was observed in 2020, with increases of 22.18% compared to 2019 and 25.24% compared to 2021. Throughout the study period, male users and teenagers increasingly dominated the discussions across all substances. During the pandemic, female usership was observed to be high for prescription medication compared to other substances. Additionally, alcohol usership increased by 80% within two weeks after the global pandemic declaration in 2020.
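For readers who want to reproduce this kind of comparison, a relative increase can be computed as sketched below. The abstract does not state which year serves as the denominator, so this sketch assumes the comparison year is the reference. The `percent_increase` helper and the user counts are hypothetical and are not the study's data.

```python
def percent_increase(value: float, reference: float) -> float:
    """Relative increase of `value` over `reference`, in percent.

    ASSUMPTION: the comparison year is used as the reference (denominator).
    """
    if reference <= 0:
        raise ValueError("reference must be positive")
    return (value - reference) / reference * 100.0


# Hypothetical unique-user counts, invented for illustration only.
users_2019 = 1000
users_2020 = 1250

increase_vs_2019 = percent_increase(users_2020, users_2019)
print(f"{increase_vs_2019:.2f}%")
```

The same helper applies to shorter windows, such as comparing usership in the two weeks after an event with the two weeks before it.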
Conclusions:
Our study presents a large-scale, fine-grained analysis of substance use in social media data by age, gender, and race/ethnicity before, during, and after the COVID-19 pandemic. Overall, our analysis of social media data provides a new baseline for substance use that can help make substance use prevention more efficient.
Recommended citation: Maharjan J, Jin R, King J, Zhu J, Kenne D. Differential Analysis of Age, Gender, Race, Sentiment, and Emotion in Substance Use Discourse on Twitter during the COVID-19 Pandemic: An NLP Approach. JMIR Preprints. 08/10/2024:67333