Research Article: PhyloPi: An affordable, purpose built phylogenetic pipeline for the HIV drug resistance testing facility

Date Published: March 5, 2019

Publisher: Public Library of Science

Author(s): Phillip Armand Bester, Andrie De Vries, Stephanus Riekert, Kim Steegen, Gert van Zyl, Dominique Goedhals, Chiyu Zhang.


Phylogenetic analysis plays a crucial role in quality control in the HIV drug resistance testing laboratory. If previous patient sequence data is available, sample swaps can be detected and investigated. As antiretroviral treatment coverage increases in many developing countries, so does the need for HIV drug resistance testing. In countries with multiple languages, transcription errors are easily made with patient identifiers. Here a self-contained, blastn-integrated phylogenetic pipeline can be especially useful. Even though our pipeline can run on any Unix-based system, a Raspberry Pi 3 is used here as a very affordable and integrated solution.

The computational capability of this single-board computer is demonstrated, as well as its utility in the HIV drug resistance laboratory. Benchmarking analysis against a large public database shows excellent time performance with minimal user intervention. The pipeline also contains utilities to find previous sequences and phylogenetic analyses, and a graphical utility for mapping sequences against the pol region of the HIV HXB2 reference genome. Sequence data from the Los Alamos HIV database was analyzed for inter- and intra-patient diversity, and logistic regression was conducted on the calculated genetic distances. These findings show that the allowable clustering and genetic distance between viral sequences from different patients depend strongly on subtype as well as on the region of the viral genome being analyzed.
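To make the distance-based reasoning concrete, the following is a minimal sketch of how a pairwise genetic distance could be computed and passed through a logistic model. It is not the authors' code: the use of the uncorrected p-distance, the function names `p_distance` and `prob_same_patient`, and the `intercept` and `slope` parameters are all assumptions made for illustration. Real coefficients would have to be fitted per subtype and genome region, as the findings above suggest.

```python
import math


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Assumption: positions where either sequence has a gap ('-') or an
    ambiguous 'N' are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def prob_same_patient(distance, intercept, slope):
    """Logistic model for P(same patient | genetic distance).

    The intercept and slope are hypothetical placeholders; they must be
    estimated from labelled intra- and inter-patient sequence pairs.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * distance)))
```

With a negative slope, the modelled probability that two sequences come from the same patient falls as their distance grows, which is the behaviour one would expect when screening for sample swaps.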

The Raspberry Pi image for PhyloPi, source code of the pipeline, sequence data, bash-, python- and R-scripts for the logistic regression, benchmarking as well as helper scripts are available at and The PhyloPi image and the source code are published under the GPLv3 license. A demo version of the PhyloPi pipeline is available at

Partial Text

The use of combined Antiretroviral Therapy (cART) has dramatically decreased the mortality and morbidity of HIV-infected people. It was estimated that 12.9 million people worldwide were receiving antiretroviral treatment (ART) by the end of 2013. The proportion of HIV-infected individuals not receiving ART dropped from 90% in 2006 to 63% in 2013 globally [1]. Between 2010 and 2015 the number of people living with HIV on antiretroviral therapy increased from 7.5 million to 17 million, whereas the number of people infected with HIV increased only from 33.3 million to 36.7 million over the same period [2]. However, inadequate adherence to ART, especially when low genetic barrier regimens such as NNRTI-based combination therapies are prescribed, often results in virologic failure with the emergence of drug resistance. In contrast, in cases of virologic failure on high genetic barrier regimens, such as boosted protease inhibitor- or dolutegravir-containing regimens, drug resistance is often absent [2,3]. Drug resistance testing can differentiate patients with persisting failure who require adherence support from those who require a regimen switch, and can guide the selection of appropriate third-line regimens [4,5]. Numerous investigators have demonstrated not only that drug resistance is on the rise in South Africa and other countries, but also that transmitted resistance is increasing [2,6–10]. As a result, the amount of sequence data becoming available will increase. We believe the work done here can help individual drug resistance facilities to cope with the quality assurance requirements this increase will entail.

We have demonstrated that this affordable single-board computer can perform phylogenetic inference in an impressively short time and can retrieve the most significant sequences from its automatically maintained blastn database without any additional user intervention. The utility of this self-contained pipeline was also shown in our case studies. Additionally, PhyloPi provides a search interface for retrieving sequences by name, as well as past analyses stored in a SQL database. PhyloPi also provides a tool with which the user can easily map sequences in a fasta file against HIV HXB2 using blastn. This quickly shows which areas of the pol region the sequences cover, which can be useful for determining whether input sequences are suitable for alignment.
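The mapping step can be illustrated with a short sketch that reads blastn tabular output (the standard 12-column `-outfmt 6` layout) and reports which part of pol each query covers. This is an illustration under stated assumptions, not PhyloPi's implementation: the function name `pol_coverage` is hypothetical, only the highest-scoring hit per query is kept, and the pol coordinates 2085 to 5096 are the commonly cited HXB2 numbering and should be checked against the reference annotation in use.

```python
# Assumed HXB2 coordinates of the pol gene; verify against your reference.
POL_START, POL_END = 2085, 5096


def pol_coverage(blast_tabular, pol_start=POL_START, pol_end=POL_END):
    """Map each query to the part of pol covered by its best blastn hit.

    Expects blastn tabular text (-outfmt 6) with the default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore.
    Returns {qseqid: (start, end)} in subject (HXB2) coordinates, or
    None for a query whose best hit lies entirely outside pol.
    """
    best = {}
    for line in blast_tabular.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid = fields[0]
        sstart, send = int(fields[8]), int(fields[9])
        bitscore = float(fields[11])
        # Reverse-strand hits report sstart > send, so normalise the order.
        lo, hi = min(sstart, send), max(sstart, send)
        if qseqid not in best or bitscore > best[qseqid][0]:
            best[qseqid] = (bitscore, lo, hi)

    coverage = {}
    for qseqid, (_, lo, hi) in best.items():
        start, end = max(lo, pol_start), min(hi, pol_end)
        coverage[qseqid] = (start, end) if start <= end else None
    return coverage
```

A query reported as `None`, or one covering only a short stretch of pol, would be a candidate for exclusion before alignment.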



