Dengue fever: from extreme climates to outbreak prediction

Son T. Mai, Ha T. Phi, Abdullahi Abubakar, Peter Kilpatrick, Hung Q. V. Nguyen, Hans Vandierendonck

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution



Dengue Fever (DF) is an emerging mosquito-borne infectious disease that affects hundreds of millions of people each year, with considerable morbidity and mortality, especially among children. Together with global climate change, it continues to grow in both the number of cases and the number of affected locations. Effective early warning systems are therefore urgently needed to improve disease control and prevention. In this paper, we introduce a novel framework, called Proximity Time Ensemble (PT-Ensem), to predict DF outbreaks for multiple areas (provinces) and multiple time steps ahead, and to study the effects of climate data on DF outbreaks. PT-Ensem consists of six key components: (1) an event-to-event probabilistic framework to study links between extreme climate events and DF outbreaks; (2) a proximity graph that connects similar provinces; (3) an ensemble prediction technique that combines many different advanced machine learning (ML) methods to predict outbreaks within $t$ time steps in the future, using extreme climate events as model inputs; (4) a data aggregation scheme that enriches the training data of each province with data from its neighbors in the proximity graph; (5) a proximity propagation step that propagates predicted results among similar provinces via the proximity graph until maximal agreement is reached among provinces; and (6) a time propagation step that propagates results across the different predicted time steps in each province. We use PT-Ensem to predict DF outbreaks for all provinces in Vietnam using data collected from 1997 to 2016. Experiments show that PT-Ensem achieves a significant performance boost over many highly rated ML models such as XGBoost, LightGBM and CatBoost in the outbreak prediction task. Compared with recent deep learning approaches for predicting DF incidence, such as LSTM-ATT, LSTM, CNN and Transformer, PT-Ensem is also superior in both prediction accuracy and computation time.
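The abstract does not specify how the proximity propagation step (component 5) is computed. As a rough illustration of the general idea only, the sketch below repeatedly blends each province's predicted outbreak score with the mean score of its neighbors in a proximity graph until the scores stop changing. The function name `propagate`, the blending weight `alpha`, the tolerance `tol`, the iteration cap `max_iter` and the province labels are all hypothetical; this is a generic neighbor-averaging scheme, not the update rule used by PT-Ensem.

```python
from typing import Dict, List


def propagate(
    scores: Dict[str, float],
    neighbors: Dict[str, List[str]],
    alpha: float = 0.5,      # assumed blending weight, not from the paper
    tol: float = 1e-6,       # assumed convergence tolerance
    max_iter: int = 100,     # assumed iteration cap
) -> Dict[str, float]:
    """Blend each province's score with the mean score of its graph neighbors.

    Iterates until the largest per-province change falls below `tol`
    or `max_iter` rounds have run. Provinces without neighbors keep
    their original score.
    """
    current = dict(scores)
    for _ in range(max_iter):
        updated = {}
        for province, score in current.items():
            nbrs = neighbors.get(province, [])
            if nbrs:
                nbr_mean = sum(current[n] for n in nbrs) / len(nbrs)
                updated[province] = (1 - alpha) * score + alpha * nbr_mean
            else:
                updated[province] = score
        # Largest change in this round; 0.0 if there are no provinces at all.
        change = max(
            (abs(updated[p] - current[p]) for p in current), default=0.0
        )
        current = updated
        if change < tol:
            break
    return current


if __name__ == "__main__":
    # Hypothetical provinces: A and B are similar, C has no neighbors.
    initial = {"A": 1.0, "B": 0.0, "C": 0.3}
    graph = {"A": ["B"], "B": ["A"], "C": []}
    print(propagate(initial, graph))
```

In this toy setup the two connected provinces settle on a shared score while the isolated one is left unchanged, which is one simple reading of "maximal agreement" among similar provinces.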

Original language: English
Title of host publication: 2022 IEEE International Conference on Data Mining (ICDM): proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665450997
ISBN (Print): 9781665451000
Publication status: Published - 01 Feb 2023
Event: 22nd IEEE International Conference on Data Mining - Orlando, United States
Duration: 28 Nov 2022 to 01 Dec 2022
Conference number: 22

Publication series

Name: IEEE International Conference on Data Mining (ICDM)
ISSN (Print): 1550-4786
ISSN (Electronic): 2374-8486


Conference: 22nd IEEE International Conference on Data Mining
Abbreviated title: ICDM
Country/Territory: United States


