Status updates may be more useful than demographic factors in predicting certain health conditions. That’s the finding from a new study published in PLOS One.
The authors linked words used in patients’ Facebook posts to their electronic medical records (EMR) to see whether the ways people express themselves on social media give clues about their health.
For 10 conditions, including several behavioral/mental health disorders but also conditions such as diabetes and lung disease (COPD), clusters of words used in status updates predicted a clinical diagnosis more accurately than did a person’s age, sex and race. For 18 conditions, the combination of demographics and Facebook posts gave better information overall than looking at either factor separately.
An interesting aspect of this study is that the most predictive words did not always describe illness symptoms. For example, we might expect that the combination of words “stomach” and “hurt” signals a higher likelihood that a person has digestive distress (it does).
But people whose posts carried seemingly unrelated social cues such as family, religiosity, joy and gratitude were more likely to have diseases such as diabetes, more so than if they used combinations of words such as “hospital” or “surgery”.
This demonstrates the power of using machine learning to identify meaningful patterns in large datasets. But sifting through social media posts may have more limited value for health insurance carriers or plan sponsors (i.e., employers). These organizations already have access to EMR or claims data, which can probably tell them more about future health care utilization than social media or demographics such as sex, race, and age can. In this case, machine learning from social media activity seems like a hammer in search of a nail.
But used with other demographic and geographic data (such as zip code), the approach could provide more context to social determinants of health. From a population standpoint—ignoring for now the specter of companies gathering private health information from individuals’ online presence—posts that signal high levels of social isolation, despair, financial worries or concern for loved ones could help inform the need for mental health and care management services. If nothing else, they could give clues for effective public health communication strategies that raise awareness about clinical, social, mental health and self-care resources.