Research Article: Dengue Fever Surveillance in India Using Text Mining in Public Media

Date Published: January 23, 2018

Publisher: The American Society of Tropical Medicine and Hygiene

Author(s): Andrea Villanes, Emily Griffiths, Michael Rappa, Christopher G. Healey.


Despite the improvement in health conditions across the world, communicable diseases remain among the leading causes of mortality in many countries. Combating communicable diseases depends on surveillance, preventive measures, outbreak investigation, and the establishment of control mechanisms. Delays in obtaining country-level data on confirmed cases of communicable diseases, such as dengue fever, are prompting new efforts to obtain short- to medium-term data. News articles highlight dengue infections, and they can reveal how public health messages, expert findings, and uncertainties are communicated to the public. In this article, we analyze dengue news articles in Asian countries, with a focus on India, for each month in 2014. We investigate how the reports cluster together, and uncover how dengue cases, public health messages, and research findings are communicated in the press. Our main contributions are to 1) uncover underlying topics from news articles that discuss dengue in Asian countries in 2014; 2) construct topic evolution graphs through the year; and 3) analyze the life cycle of dengue news articles in India, then relate them to rainfall, monthly reported dengue cases, and the Breteau Index. We show that the five main topics discussed in the newspapers in Asia in 2014 correspond to 1) prevention; 2) reported dengue cases; 3) politics; 4) prevention relative to other diseases; and 5) emergency plans. We find that rainfall has a correlation of 0.92 with the reported dengue cases extracted from news articles. Based on our findings, we conclude that the proposed method facilitates the effective discovery of evolutionary dengue themes and patterns.
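As an illustration of the kind of relationship summarized above, the following minimal Python sketch computes a Pearson correlation coefficient between two monthly series. It assumes the reported figure is a Pearson coefficient, which this excerpt does not state. The function name `pearson` and the two short series are hypothetical placeholders chosen for illustration; they are not data from the article.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT data from the article.
rainfall_mm = [12.0, 30.0, 85.0, 140.0, 95.0, 40.0]
case_mentions = [3, 5, 20, 41, 33, 9]

r = pearson(rainfall_mm, case_mentions)
print(f"Pearson r = {r:.2f}")
```

A coefficient close to 1 means the two series rise and fall together; it does not by itself show that one causes the other.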

Partial Text

Communicable diseases remain among the leading causes of mortality in many countries, particularly in Asia and Africa.1 In 2010, of the 52.8 million deaths in the world, 24.9% were due to communicable, maternal, neonatal, and nutritional causes. Moreover, 76% of premature mortality in sub-Saharan Africa in 2010 was due to the same causes.1

In this section, we present information about dengue fever and relevant public health surveillance systems.

In this article, we introduce the use of text mining topic clustering to infer topics from news articles discussing dengue; construct topic evolution graphs; analyze the life cycle of dengue news articles in India; and relate them to rainfall, monthly reported dengue cases, and the Breteau Index. Our work provides the following novel contributions versus existing approaches: 1) topics extracted from news articles offer information not only on dengue trends in a specific geographic area but also on other themes, such as prevention, politics, prevention relative to other diseases, and emergency plans; 2) the evolution of topics throughout the year can be used by dengue experts, health care officials, public health policy makers, communicators, and journalists to obtain insight on relationships to a specific communicable disease; 3) although rainfall and the Breteau Index can be used to detect patterns for dengue, this information may not be promptly available, or may not be collected in a specific region, and our proposed methodology can help close this gap and provide reliable information for that region; and 4) although interpretation of the clusters may require human input, our analysis can be automated to reduce the delay in receiving official data, and improve the availability of data needed to decrease morbidity and mortality of communicable diseases.
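To make the idea of topic clustering concrete, here is a minimal Python sketch that groups short texts by TF-IDF weighting and a cosine-similarity k-means. This excerpt does not specify the authors' clustering algorithm or tooling, so the technique shown is a generic stand-in, not their method. All function names, the stopword list, the seed choice, and the four example headlines are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"the", "a", "of", "in", "and", "to", "for", "is", "on", "after"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one L2-normalized sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed inverse document frequency, so no weight is ever zero.
        vec = {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0)
               for t, c in counts.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({t: v / norm for t, v in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())


def centroid(vectors):
    """Return the unit-length sum of a group of sparse vectors."""
    total = Counter()
    for vec in vectors:
        for t, v in vec.items():
            total[t] += v
    norm = math.sqrt(sum(v * v for v in total.values())) or 1.0
    return {t: v / norm for t, v in total.items()}


def kmeans(vectors, seed_indices, iterations=10):
    """Cluster vectors by cosine similarity, starting from the seed documents."""
    centroids = [dict(vectors[i]) for i in seed_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [max(range(len(centroids)), key=lambda k: cosine(vec, centroids[k]))
                  for vec in vectors]
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = centroid(members)
    return labels, centroids


def top_terms(centroid_vec, n=3):
    """Return the n highest-weighted terms of a centroid as a rough topic label."""
    ranked = sorted(centroid_vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:n]]


# Invented example headlines, NOT taken from the article's corpus.
docs = [
    "Residents urged to remove stagnant water to prevent mosquito breeding",
    "Health department urges residents to prevent mosquito breeding at home",
    "City hospital reports new dengue cases this week",
    "Hospital reports rise in dengue cases after monsoon",
]

labels, centroids = kmeans(tfidf_vectors(docs), seed_indices=[0, 2])
for k, c in enumerate(centroids):
    print(k, top_terms(c, n=4), [i for i, lab in enumerate(labels) if lab == k])
```

Counting how many articles fall into each cluster per month would give a rough picture of how topics evolve through the year. As the text notes, assigning a meaningful label to each cluster may still require human input.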



