AIHealth 2025, The Second International Conference on AI-Health
Authors:
Evan Dan
Jianfeng Zhu
Ruoming Jin
Keywords: Suicide; Llama 3-8b; Mistral-7b; GPT-4o; Reddit; BERTopic Modeling; contributing factors
Abstract:
Suicide remains a critical global health issue, with over 700,000 lives lost annually. Existing research has explored factors influencing suicidal thoughts, but traditional studies often rely on small-scale data sources that may overlook contextual influences. This study aims to address that gap by analyzing a large dataset of posts from the Reddit communities r/SuicideWatch and r/Teenagers to detect suicidal ideation and identify associated themes. Using Natural Language Processing and statistical methodologies, we fine-tuned large language models, including Llama 3-8b and Mistral-7b, on manually labeled data to improve the accuracy of classifying posts for suicidal ideation. Applied to data re-labeled by the large language models, BERTopic identified key themes linked to suicidal ideation: relationship struggles, academic stress, and family trauma. While non-suicidal posts also included social and academic concerns, their topics centered on more immediate stressors rather than the long-term emotional distress seen in the suicidal group. These findings highlight the potential of NLP methodologies for analyzing large-scale social media data, offering valuable insights for informing new prevention strategies. Additionally, social media, in combination with NLP, serves as a valuable outlet for capturing genuine emotional struggles, enabling more timely and personalized mental health support compared to traditional approaches like counseling.
Pages: 23 to 28
Copyright: Copyright (c) IARIA, 2025
Publication date: March 9, 2025
Published in: conference
ISBN: 978-1-68558-247-0
Location: Lisbon, Portugal
Dates: from March 9, 2025 to March 13, 2025