Validating models for disease detection using twitter

Bodnar, Todd; Salathé, Marcel

doi:10.1145/2487788.2488027

2013

Abstract

Data mining social media has become a valuable resource for infectious disease surveillance. However, there are considerable risks associated with incorrectly predicting an epidemic. The large amount of social media data combined with the small amount of ground truth data and the general dynamics of infectious diseases present unique challenges when evaluating model performance. In this paper, we look at several methods that have been used to assess influenza prevalence using Twitter. We then validate them with tests that are designed to avoid and illustrate issues with the standard k-fold cross validation method. We also find that small modifications to the way that data are partitioned can have major effects on a model's reported performance.
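
The partitioning effect described in the abstract can be illustrated with a small sketch. This is not the authors' code or data: the synthetic weekly series, the nearest-in-time predictor, and the helper names (make_series, shuffled_folds, blocked_folds, nearest_in_time_mse) are all assumptions made for illustration. It compares randomly shuffled k-fold partitions with contiguous blocks of weeks on a time-correlated signal.

```python
import math
import random


def make_series(n=120, seed=0):
    """Synthetic weekly 'prevalence' (assumed): a seasonal curve plus small noise."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * t / 52) + rng.gauss(0, 0.05) for t in range(n)]


def shuffled_folds(n, k, seed=0):
    """Standard k-fold: each week is assigned to a fold at random."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::k]) for i in range(k)]


def blocked_folds(n, k):
    """Alternative partition: each fold is one contiguous block of weeks."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def nearest_in_time_mse(y, folds):
    """Predict each held-out week from the closest training week; return the MSE."""
    errors = []
    for test in folds:
        held_out = set(test)
        train = [t for t in range(len(y)) if t not in held_out]
        for t in test:
            nearest = min(train, key=lambda s: abs(s - t))
            errors.append((y[t] - y[nearest]) ** 2)
    return sum(errors) / len(errors)


y = make_series()
mse_shuffled = nearest_in_time_mse(y, shuffled_folds(len(y), 5))
mse_blocked = nearest_in_time_mse(y, blocked_folds(len(y), 5))
print(f"shuffled k-fold MSE: {mse_shuffled:.4f}")
print(f"blocked k-fold MSE:  {mse_blocked:.4f}")
```

In this toy setting the shuffled partition reports a lower error because every held-out week has training weeks right next to it in time, while the blocked partition forces the predictor to reach across a gap of several weeks. The same model and data therefore yield different reported performance depending only on how the folds are drawn.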

Details

Title Validating models for disease detection using twitter

Author(s) Bodnar, Todd ; Salathé, Marcel

Published in Proceedings of the 22nd International World Wide Web Conference (WWW 2013)

Pages 699-702

Conference 22nd International World Wide Web Conference (WWW 2013), Rio de Janeiro, Brazil, May 13 - 17, 2013

Date 2013

Publisher Geneva, International World Wide Web Conferences Steering Committee

ISBN 978-1-4503-2038-2

DOI https://doi.org/10.1145/2487788.2488027

Laboratories UPSALATHE1

Record Appears in Scientific production and competences > SV - School of Life Sciences > GHI - Global Health Institute > UPSALATHE1 - Prof. Salathé Group (SV/IC)
Work outside EPFL
Conference Papers
Published

Record creation date 2015-12-10
