Exploring Social Media for Healthcare Data

People enjoy sharing information through social media, including healthcare data. Yes, it is true! This is the starting point of the research work titled ‘Exploring Spanish health social media for detecting drug effects’, which follows social media conversations to identify how people talk about their experiences with the drugs they take. This makes it possible to identify previously unknown adverse effects of these drugs. Although there is a protocol for reporting adverse drug effects to the authorities, only 5–20% of them are reported. Besides, conversations about drugs, symptoms, conditions and diseases can be analyzed to learn more about them. For example, it is possible to see how some people use social media to look for specific drugs, while others sell them, perhaps illegally. Many others talk about mixing drugs with alcohol or illegal substances. Of course, one cannot believe everything that appears on the Internet (that is another issue), but these conversations can suggest hypotheses for further research.


Researchers from the Advanced Databases Group at Carlos III University of Madrid carried out this study, designing hybrid models to capture the knowledge needed to identify adverse effects. MeaningCloud is the Natural Language Processing platform that supports the analysis process based on these models. The platform’s customization capabilities have been decisive for including specific vocabulary and medical domain knowledge. As we know, the names of drugs and symptoms can be complex and, in some cases, difficult to spell correctly. The algorithm’s results are promising, with a 10% increase in recall compared to other known algorithms. You can find further details in the scientific paper published in the journal BMC Medical Informatics and Decision Making.

These developments have been part of the TrendMiner project and are now available in the prototype website TrendMiner Health Analytics Dashboard, which shows people’s comments about antidepressants gathered from social media. The console displays the mentions of antidepressants and related symptoms and, when you click on any of them, their evolution over time. Moreover, the source texts analyzed to compute those mentions are shown at the bottom, with labels highlighting the names of drugs, symptoms or diseases, and any relations among them. Such relations may indicate whether a drug is indicated for a symptom or whether a disease is an adverse effect of the mentioned drug. The prototype also allows searching by ATC code (Anatomical Therapeutic Chemical Classification System) and by the corresponding level in this classification scheme. So, if you mark the ‘By Active Substance’ selector, you are searching for any drug containing the active substance of the product you typed in the search box. Furthermore, the predictive search functionality makes it easier to find the right expression for a drug or disease. Please have a look at the prototype and tell us what you think about it. If you find a chart useful, you can even tweet it from there! Any comment is more than welcome.
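To make the idea of searching by ATC level more concrete, here is a minimal Python sketch. It relies only on the public structure of ATC codes, where each of the five levels is a prefix of the full seven-character code (for example, N06AB03 is fluoxetine, N06AB groups the selective serotonin reuptake inhibitors and N06A groups the antidepressants). The function names `atc_prefix` and `same_group` are hypothetical, chosen for illustration, and this is not the prototype’s actual implementation; it also assumes that a search by active substance corresponds to matching at level 5.

```python
import re

# Number of characters of the full code that identify each ATC level:
# 1 anatomical main group, 2 therapeutic subgroup, 3 pharmacological subgroup,
# 4 chemical subgroup, 5 chemical substance.
ATC_LEVEL_LENGTHS = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}

# Shape check only: letter, two digits, two letters, two digits.
# It does not verify that the code is actually assigned in the ATC index.
ATC_PATTERN = re.compile(r"^[A-Z]\d{2}[A-Z]{2}\d{2}$")


def atc_prefix(code: str, level: int) -> str:
    """Return the part of a full ATC code that identifies the given level."""
    normalized = code.strip().upper()
    if not ATC_PATTERN.match(normalized):
        raise ValueError(f"not a full 7-character ATC code: {code!r}")
    if level not in ATC_LEVEL_LENGTHS:
        raise ValueError(f"ATC level must be between 1 and 5, got {level}")
    return normalized[:ATC_LEVEL_LENGTHS[level]]


def same_group(code_a: str, code_b: str, level: int) -> bool:
    """True when both codes fall in the same ATC group at the given level."""
    return atc_prefix(code_a, level) == atc_prefix(code_b, level)


if __name__ == "__main__":
    fluoxetine, sertraline, amitriptyline = "N06AB03", "N06AB06", "N06AA09"
    print(atc_prefix(fluoxetine, 3))                    # level 3 group of fluoxetine
    print(same_group(fluoxetine, sertraline, 4))        # same chemical subgroup?
    print(same_group(fluoxetine, amitriptyline, 4))     # different chemical subgroup?
```

Under these assumptions, a search at a lower level (such as 3) gathers mentions of every drug in a broad group, while a search at level 5 narrows the results to a single active substance.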

About José Luis Martínez

Passionate about the business of applying Natural Language Processing to solve real problems. Structuring unstructured data, even in big data environments. Partner at MeaningCloud.
