Identifying Adverse Effects of HIV Drug Treatment and Associated Sentiments Using Twitter

Adrover, Cosme; Bodnar, Todd; Huang, Zhuojie; Telenti, Amalio; Salathé, Marcel

doi:10.2196/publichealth.4488

research article

2015

JMIR Public Health and Surveillance

Background: Social media platforms are increasingly seen as a source of data on a wide range of health issues. Twitter is of particular interest for public health surveillance because of its public nature. However, the very public nature of social media platforms such as Twitter may act as a barrier to public health surveillance, as people may be reluctant to publicly disclose information about their health. This is of particular concern in the context of diseases that are associated with a certain degree of stigma, such as HIV/AIDS.

Objective: The objective of the study is to assess whether adverse effects of HIV drug treatment and associated sentiments can be determined using publicly available data from social media.

Methods: We describe a combined approach of machine learning and crowdsourced human assessment to identify adverse effects of HIV drug treatment based solely on individual reports posted publicly on Twitter. Starting from a large dataset of 40 million tweets collected over three years, we identify a very small subset (1642; 0.004%) of individual reports describing personal experiences with HIV drug treatment.

Results: Despite the small size of the extracted final dataset, the summary representation of adverse effects attributed to specific drugs, or drug combinations, accurately captures well-recognized toxicities. In addition, the data allowed us to discriminate across specific drug compounds, to identify preferred drugs over time, and to capture novel events such as the availability of preexposure prophylaxis.

Conclusions: The effect of limited data sharing due to the public nature of the data can be partially offset by the large number of people sharing data in the first place, an observation that may play a key role in digital epidemiology in general.

Name: api-download?filename=78b0e3eb7a59e4b8a8f1f477f20c6fef.pdf&alt_name=4488-57884-2-SP.pdf

Type: Publisher's version

Access type: openaccess

License Condition: CC BY

Size: 176.78 KB

Format: Adobe PDF

Checksum (MD5): 113b8908c8c798fd03c4ea8b87f727ca