Research Article: TB DEPOT (Data Exploration Portal): A multi-domain tuberculosis data analysis resource

Date Published: May 23, 2019

Publisher: Public Library of Science

Author(s): Andrei Gabrielian, Eric Engle, Michael Harris, Kurt Wollenberg, Octavio Juarez-Espinosa, Alexander Glogowski, Alyssa Long, Lisa Patti, Darrell E. Hurt, Alex Rosenthal, Mike Tartakovsky, Feng Gao.


The NIAID TB Portals Program (TBPP) established a unique and growing database repository of socioeconomic, geographic, clinical, laboratory, radiological, and genomic data from patient cases of drug-resistant tuberculosis (DR-TB). Currently, there are 2,428 total cases from nine country sites (Azerbaijan, Belarus, Moldova, Georgia, Romania, China, India, Kazakhstan, and South Africa). Of these, 1,611 (66%) are multidrug- or extensively drug-resistant, and 1,185 (49%), 863 (36%), and 952 (39%) contain X-ray, computed tomography (CT) scan, and genomic data, respectively. We introduce the Data Exploration Portal (TB DEPOT) to visualize and analyze these multi-domain data. TB DEPOT leverages the TBPP integration of clinical, socioeconomic, genomic, and imaging data into standardized formats and enables user-driven, repeatable, and reproducible analyses. It furthers the TBPP goal of providing a web-enabled analytics platform to countries with a high burden of multidrug-resistant TB (MDR-TB) but limited IT resources and inaccessible data, and it enables the reusability of data, in conformity with the NIH’s Findable, Accessible, Interoperable, and Reusable (FAIR) principles. TB DEPOT provides access to “analysis-ready” data and the ability to generate and test complex, clinically oriented hypotheses instantaneously with minimal statistical background and data processing skills. TB DEPOT is also promising for enhancing medical training and for furnishing well-annotated, hard-to-find MDR-TB patient cases. As part of TBPP, TB DEPOT further fosters collaborative research efforts to better understand drug-resistant tuberculosis and aid in the development of novel diagnostics and personalized treatment regimens.

Partial Text

Tuberculosis (TB) is a major challenge for scientists, clinicians, and public health professionals alike. An estimated one-third of the world’s population carries latent TB [1], and as one of the top ten causes of death worldwide, TB was responsible for 1.6 million deaths in 2017 [2]. Although the disease is curable, treatment requires adherence to a six-month multi-drug regimen.

The TB Portals Program’s introduction of the Data Exploration Portal successfully fulfills the need for a multi-domain hypothesis testing and analysis engine. It furthers the TBPP goal to provide a complete web-enabled ecosystem of resources, critical for countries with a high burden of TB but with limited IT resources. TB DEPOT effectively leverages the TBPP integration of clinical, socioeconomic, imaging, and genomic data. It expands the ability to search and visualize aggregate user-selected groups of patient cases. Experts from different areas can explore, create, search for, and share defined cohorts from a large and growing body of data collected by TBPP.

As demonstrated, TB DEPOT, as part of the TB Portals Program, further fosters collaborative research and clinical efforts to better understand DR-TB and aids in the development of novel diagnostics and personalized treatment regimens. The utility and practical applications of TB DEPOT’s querying and analysis functions are diverse, supporting a range of clinical and research applications for a variety of end users. Although developed specifically for the TB Portals’ growing multi-domain TB dataset, it could also be applied in other medical research fields. In turn, TB DEPOT can continue to expand through exchange with other bio- and clinical informatics disciplines, incorporating additional functionality and ultimately resulting in a more innovative, interconnected ecosystem with novel applications for research and precision medicine.



