RT Journal Article
SR Electronic
T1 Unsupervised Extraction of Epidemic Syndromes from Participatory Influenza Surveillance Self-reported Symptoms
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 314591
DO 10.1101/314591
A1 Kyriaki Kalimeri
A1 Matteo Delfino
A1 Ciro Cattuto
A1 Daniela Perrotta
A1 Vittoria Colizza
A1 Caroline Guerrisi
A1 Clement Turbelin
A1 Jim Duggan
A1 John Edmunds
A1 Chinelo Obi
A1 Richard Pebody
A1 Ricardo Mexia
A1 Ana Franco
A1 Yamir Moreno
A1 Sandro Meloni
A1 Carl Koppeschaar
A1 Charlotte Kjelsø
A1 Daniela Paolotti
YR 2018
UL http://biorxiv.org/content/early/2018/05/04/314591.abstract
AB Seasonal influenza surveillance is usually carried out by sentinel general practitioners who compile weekly reports based on the number of influenza-like illness (ILI) clinical cases observed among visited patients. This surveillance practice is generally affected by two main issues: i) reports are usually released with a lag of about one week or more, ii) the definition of a case of influenza-like illness based on patients' symptoms varies from one surveillance system to another, i.e. from one country to another. The availability of novel data streams for disease surveillance can alleviate these issues; in this paper, we employed data from Influenzanet, a participatory web-based surveillance project which collects symptoms directly from the general population in real time. We developed an unsupervised probabilistic framework that combines time series analysis of symptom counts with algorithmic detection of groups of symptoms, hereafter called syndromes. Symptom counts were collected through the participatory web-based surveillance platforms of the Influenzanet consortium, whose data have been found to correlate with influenza-like illness incidence as detected by sentinel doctors.
Our aim is to suggest how web-based surveillance data can provide an epidemiological signal capable of detecting the temporal trends of influenza-like illness without relying on a specific case definition. We evaluated the performance of our framework by showing that the temporal trends of the detected syndromes closely follow the ILI incidence reported by traditional surveillance, and that the syndromes consist of combinations of symptoms compatible with the ILI definition. The proposed framework was able to predict the ILI trend of the forthcoming influenza season quite accurately, based only on the information available from previous years. Moreover, we assessed the generalisability of the approach by evaluating its potential for the detection of gastrointestinal syndromes. We evaluated the approach against traditional surveillance data and, despite the limited amount of data, the gastrointestinal trend was successfully detected. The result is a real-time, flexible surveillance and prediction tool that is not constrained by any disease case definition.
Author summary: This study suggests how web-based surveillance data can provide an epidemiological signal capable of detecting the temporal trends of influenza-like illness without relying on a specific case definition. The proposed framework was able to predict the ILI trend of the forthcoming influenza season quite accurately, based only on the information available from previous years. Moreover, we assessed the generalisability of the approach by evaluating its potential for the detection of gastrointestinal syndromes. We evaluated the approach against traditional surveillance data and, despite the limited amount of data, the gastrointestinal trend was successfully detected. The result is a real-time, flexible surveillance and prediction tool that is not constrained by any disease case definition.