Aggregating Search Logs and Self-esteem for Identifying Individuals with History of Suicide Ideation
Machine Learning –> Bayesian Inference; Mental Health.
We propose a framework for identifying individuals with a history of suicide ideation from their daily online search logs. A total of $1,043,855$ searches were collected from $120$ college students, with an average of $10926$ searches per participant, spanning over $4.21$ years of search history. The participants also filled out a gold-standard mental health assessment survey.
To detect the history of suicide ideation, we first examined both semantic search content category features and temporal recurrence and frequency features. Using those features, we develop a novel encoding that projects these aspects of search behaviors into a low dimensional vector space. Our supervised model incorporates both in-the-moment online behaviors and the ground truth self-esteem for identifying individuals with a history of suicide ideation and verifies the hypothesis that including self-esteem during modeling improves the suicide ideation detection task.
Labels for the training and testing sets were drawn from professional mental health surveys. Our highest-performing graphical model achieved a classification F1 score of $0.83$ with an AUC of $0.81$ on a stratified test set population. Our framework of leveraging passively sensed search data could help caregivers and practitioners to identify individuals with a potential history of suicide ideation and extend help.
My Contribution: I was the third author of this paper. I first proposed a spatio-temporal representation for Google Search histories and self-esteem scores. I then designed a graphical model optimized by MCMC methods that incorporates the latent interdependence between self-esteem, cognitive states, and online engagements to identify past suicidal ideation.