Research Article: Overcoming the pitfalls of automatic interpretation of whole genome sequencing data by online tools for the prediction of pyrazinamide resistance in Mycobacterium tuberculosis

Date Published: February 28, 2019

Publisher: Public Library of Science

Author(s): Tomotada Iwamoto, Yoshiro Murase, Shiomi Yoshida, Akio Aono, Makoto Kuroda, Tsuyoshi Sekizuka, Akifumi Yamashita, Kengo Kato, Takemasa Takii, Kentaro Arikawa, Seiya Kato, Satoshi Mitarai, W.C. Yam.


Automated online software tools that analyse whole genome sequencing (WGS) data without the need for bioinformatics expertise can motivate the implementation of WGS-based molecular drug susceptibility testing (DST) in routine diagnostic settings for tuberculosis (TB). Pyrazinamide (PZA) is a key drug for current and future TB treatment regimens; however, the available tools have been reported to predict PZA resistance poorly, which may make users hesitant to adopt them. This study aimed to elucidate the reason for the low predictive power and to uncover the real performance of the tools when their variation calling lists are taken into account (manual inspection), rather than only the automated reporting system (default setting) that previous studies evaluated.

WGS data from 191 datasets comprising 108 PZA-resistant and 83 susceptible strains were used to evaluate the potential performance of the available online tools (TB Profiler, TGS-TB, PhyResSE, and CASTB) for predicting phenotypic PZA resistance.

When the variation calling lists were taken into consideration, TGS-TB and PhyResSE detected 73 variants in total in pncA (47 non-synonymous mutations and 26 indels), covering all mutations in the 108 PZA-resistant strains. The 73 variants were confirmed by Sanger sequencing. TB Profiler also detected all of them except three complete losses of pncA, two large deletions at the 3’-end, and one relatively large insertion. On the other hand, many of the 73 variants were missing from the automated reports of every tool except TGS-TB; CASTB detected only 20 of these variants. Applying the ‘non-wild type sequence’ approach to predict PZA resistance significantly improved accuracy compared with the automated results obtained from each tool.
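The ‘non-wild type sequence’ approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of any of the four tools: the `Variant` record, its field names, and the `kind` labels are hypothetical stand-ins for whatever a tool's variation calling list actually exports. The rule itself follows the text: any non-synonymous mutation or indel in pncA is taken as a prediction of PZA resistance.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Variant:
    """One row of a variation calling list (hypothetical layout)."""
    gene: str   # e.g. "pncA"
    kind: str   # assumed labels: "nonsynonymous", "synonymous", "indel"
    label: str  # free-text description as reported by the tool


# Variant classes counted as a departure from the wild-type sequence.
NON_WILD_TYPE_KINDS = {"nonsynonymous", "indel"}


def predict_pza_resistance(variants: Iterable[Variant]) -> bool:
    """Return True if any non-synonymous mutation or indel is found in pncA."""
    return any(
        v.gene == "pncA" and v.kind in NON_WILD_TYPE_KINDS
        for v in variants
    )


# Placeholder calls for one strain; the labels are invented for illustration.
example_calls = [
    Variant("pncA", "synonymous", "placeholder silent change"),
    Variant("pncA", "indel", "placeholder frameshift"),
]
example_prediction = predict_pza_resistance(example_calls)
```

In this sketch, large deletions and complete loss of the gene would have to be recorded as `indel` entries to count. Whether a given tool lists them at all is exactly the point that manual inspection has to check.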

Users can obtain more accurate predictions for PZA resistance than previously reported by manually checking the results and applying the ‘non-wild type sequence’ approach.

Partial Text

Mycobacterium tuberculosis (MTB) is a slow-growing bacillus, so phenotypic drug-susceptibility testing (DST) takes several weeks to months. The essential role and ubiquitous use of pyrazinamide (PZA) in tuberculosis (TB) therapy underline the importance of DST against PZA [1,2]. However, the test for PZA is more complicated and less reliable than those for other anti-TB drugs because MTB is difficult to grow under the acidic (pH < 6.0) conditions required for the assay [3–5]. As the current phenotypic test is difficult and time consuming, clinical laboratories are eager for a more accurate and rapid assay to detect PZA resistance. Genotypic DST is an ideal approach for bypassing the difficulty, inaccuracy, and lengthy process of phenotypic DST, providing rapid results [6,7]. However, genotypic DST for PZA is extremely challenging because the pncA mutations that cause loss of pyrazinamidase (PZase) activity (the major mechanism leading to PZA resistance) are highly diverse and scattered throughout the pncA gene [8,9].

Our analysis of online tool performance in detecting PZA resistance demonstrated that the tools can predict resistance with higher sensitivity than previously reported when all detected variants (non-synonymous mutations and indels) are taken into account (Table 5). In fact, all or most mutations in the pncA gene of the 108 PZA-resistant strains were detected by TB Profiler, TGS-TB, and PhyResSE, but many variants were missing from the automated interpretation in the plain-text report. As a result, the accuracy of PZA resistance prediction by online tools differed substantially between the default report and manual inspection (Table 5). Thus, the fundamental cause of the reported low sensitivity of PZA resistance prediction is the tools' interpretation pipeline, which is built upon pre-defined mutation catalogues, and not an inability of the software algorithms to detect genetic variants.
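The contrast between a catalogue-driven default report and manual inspection of all detected variants can be made concrete with a small sketch. Everything here is assumed for illustration: the function names, the variant identifiers, the catalogue, and the five toy strains are invented and do not reproduce the study's data or Table 5. The sketch only shows how a variant that is detected but absent from a pre-defined catalogue lowers sensitivity in the default report.

```python
def catalogue_call(variants, catalogue):
    """Default-report style: resistant only if a detected variant is catalogued."""
    return any(v in catalogue for v in variants)


def non_wild_type_call(variants):
    """Manual-inspection style: resistant if any variant was detected at all.

    `variants` is assumed to hold only non-synonymous mutations and indels.
    """
    return len(variants) > 0


def evaluate(predicted, phenotype):
    """Sensitivity, specificity and accuracy against phenotypic DST results."""
    pairs = list(zip(predicted, phenotype))
    tp = sum(1 for p, y in pairs if p and y)
    tn = sum(1 for p, y in pairs if not p and not y)
    fp = sum(1 for p, y in pairs if p and not y)
    fn = sum(1 for p, y in pairs if not p and y)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


# Invented toy data: (detected pncA variants, phenotypically PZA-resistant?)
TOY_STRAINS = [
    (["var_A"], True),
    (["var_B"], True),
    (["var_C"], True),
    ([], False),
    ([], False),
]
TOY_CATALOGUE = {"var_A"}  # hypothetical catalogue knowing one variant only

phenotypes = [resistant for _, resistant in TOY_STRAINS]
default_report = evaluate(
    [catalogue_call(v, TOY_CATALOGUE) for v, _ in TOY_STRAINS], phenotypes
)
manual_inspection = evaluate(
    [non_wild_type_call(v) for v, _ in TOY_STRAINS], phenotypes
)
```

With these toy strains the catalogue-based call misses two of the three resistant strains even though their variants were detected, while the non-wild-type call flags all three. The perfect scores for the latter are an artefact of the toy data and say nothing about real-world performance.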

