Date Published: August 13, 2008
Publisher: Public Library of Science
Author(s): Asif M. Khan, Olivo Miotto, Eduardo J. M. Nascimento, K. N. Srinivasan, A. T. Heiny, Guang Lan Zhang, E. T. Marques, Tin Wee Tan, Vladimir Brusic, Jerome Salmon, J. Thomas August, Eva Harris
Abstract: Background: Genetic variation and rapid evolution are hallmarks of RNA viruses, the result of high mutation rates in RNA replication and selection of mutants that enhance viral adaptation, including the escape from host immune responses. Variability is uneven across the genome because mutations resulting in a deleterious effect on viral fitness are restricted. RNA viruses are thus marked by protein sites permissive to multiple mutations and sites critical to viral structure-function that are evolutionarily robust and highly conserved. Identification and characterization of the historical dynamics of the conserved sites have relevance to multiple applications, including potential targets for diagnosis, and prophylactic and therapeutic purposes. Methodology/Principal Findings: We describe a large-scale identification and analysis of evolutionarily highly conserved amino acid sequences of the entire dengue virus (DENV) proteome, with a focus on sequences of 9 amino acids or more, and thus immune-relevant as potential T-cell determinants. DENV protein sequence data were collected from the NCBI Entrez protein database in 2005 (9,512 sequences) and again in 2007 (12,404 sequences). Forty-four (44) sequences (pan-DENV sequences), mainly those of nonstructural proteins and representing ∼15% of the DENV polyprotein length, were identical in 80% or more of all recorded DENV sequences. Of these 44 sequences, 34 (∼77%) were present in ≥95% of sequences of each DENV type, and 27 (∼61%) were conserved in other Flaviviruses. The frequencies of variants of the pan-DENV sequences were low (0 to ∼5%), as compared to variant frequencies of ∼60 to ∼85% in the non-pan-DENV sequence regions.
We further showed that the majority of the conserved sequences were immunologically relevant: 34 contained numerous predicted human leukocyte antigen (HLA) supertype-restricted peptide sequences, and 26 contained T-cell determinants identified by studies with HLA-transgenic mice and/or reported to be immunogenic in humans. Conclusions/Significance: Forty-four (44) pan-DENV sequences of at least 9 amino acids were highly conserved and identical in 80% or more of all recorded DENV sequences, and the majority were found to be immune-relevant by their correspondence to known or putative HLA-restricted T-cell determinants. The conservation of these sequences through the entire recorded DENV genetic history supports their possible value for diagnosis, prophylactic and/or therapeutic applications. The combination of bioinformatics and experimental approaches applied herein provides a framework for large-scale and systematic analysis of conserved and variable sequences of other pathogens, in particular, for rapidly mutating viruses, such as influenza A virus and HIV.
Partial Text: Dengue viruses (DENVs) are mosquito-borne pathogens of the family Flaviviridae, genus Flavivirus, which are phylogenetically related to other important human pathogens, such as Yellow fever (YFV), Japanese encephalitis (JEV), and West Nile (WNV) viruses, among others. DENVs are enveloped, single-stranded RNA (+) viruses coding for a polyprotein precursor of approximately 3,400 amino acids, which is cleaved into three structural (capsid, C; precursor membrane and membrane, prM/M; envelope, E) and seven nonstructural proteins (NS1, 2a, 2b, 3, 4a, 4b and 5). Viral replication occurs in the cytoplasm in association with virus-induced membrane structures and involves the NS proteins. There are four genetically distinct DENV types, referred to as DENV-1 to -4, with multiple genotypic variants. DENVs are transmitted to humans primarily by Aedes aegypti mosquitoes and cause a wide range of symptoms, from an inapparent or mild dengue fever (DF) to severe dengue hemorrhagic fever (DHF)/dengue shock syndrome (DSS) that may be fatal. It is estimated that more than 100 million people are infected each year, with up to several hundred thousand DHF/DSS cases. To date, there is no licensed prophylactic vaccine and no specific therapeutic formulation available.
In this study, we identified and characterized pan-DENV sequences that were highly conserved in all recorded DENV isolates. The large number of sequences analyzed (12,404 as of December 2007), and their wide distribution in terms of geography and time (1945–2007) (data not shown), offered information for a broad survey of DENV protein diversity in nature. The 44 pan-DENV protein sequences of at least 9 aa, covering 514 aa or about 15% of the complete DENV polyprotein of ∼3,390 aa, were conserved in at least 80% of all recorded DENV sequences, and 34 of the 44 (∼77%) were conserved in ≥95% of DENV sequences. All 44 were in the nonstructural proteins except for two in the E protein. These conserved sequences have shown remarkable stability over the entire history of DENV sequences deposited in the NCBI Entrez protein database, as illustrated by their low peptide entropy values and variant frequencies. In addition, 27 of the pan-DENV sequences were conserved in 64 other Flaviviruses, as further evidence of prolonged evolutionary stability within this genus, as previously discussed. Two are also present in the proteomes of the Aedes albopictus mosquito and the bacterium Chromohalobacter salexigens, possibly in keeping with recent reports of genetic recombination between phyla. It is likely that these pan-DENV sequences have been under selection pressure to fulfill critical biological and/or structural properties, some of which have been identified for the E (fusion peptide, dimerization domain), NS3 (peptidase S7, DEAD/H domains) and NS5 proteins (MTPase, RdRp domains). Hence, these conserved sequences are unlikely to significantly diverge in newly emerging DENV isolates in the future, and represent attractive targets for the development of specific anti-viral compounds and vaccine candidates.
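To make the notions of an 80% conservation threshold, peptide entropy, and variant frequency concrete, the following is a minimal Python sketch and not the authors' pipeline. It assumes the sequences are already aligned to equal length, slides a 9-aa window along the alignment, and reports the dominant peptide's frequency and the Shannon entropy of the peptides in each window. The function names `nonamer_stats` and `conserved_regions`, the toy alignment, and the window-merging rule are all illustrative assumptions.

```python
from collections import Counter
from math import log2


def nonamer_stats(aligned, k=9):
    """For each window start, return (dominant peptide, its fraction, entropy in bits).

    `aligned` is a list of equal-length, pre-aligned protein sequences
    (an assumption of this sketch; gaps are not treated specially).
    """
    length = len(aligned[0])
    stats = []
    for start in range(length - k + 1):
        counts = Counter(seq[start:start + k] for seq in aligned)
        total = sum(counts.values())
        peptide, top = counts.most_common(1)[0]
        # Shannon entropy of the peptide distribution in this window.
        entropy = -sum((n / total) * log2(n / total) for n in counts.values())
        stats.append((peptide, top / total, entropy))
    return stats


def conserved_regions(aligned, k=9, threshold=0.80):
    """Merge overlapping windows whose dominant peptide reaches `threshold`.

    Returns half-open (start, end) column ranges. A merged range is a union
    of qualifying windows; it is not itself guaranteed to be identical in
    `threshold` of the sequences over its full length.
    """
    regions = []
    for start, (_, fraction, _) in enumerate(nonamer_stats(aligned, k)):
        if fraction < threshold:
            continue
        end = start + k
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# Toy alignment (hypothetical data, not DENV sequences).
aligned = [
    "ACDEFGHIKLMN",
    "ACDEFGHIKLMN",
    "ACDEFGHIKLMN",
    "ACDEFGHIKLMY",
    "ACDEFGHIKLQW",
]

for start, (peptide, fraction, entropy) in enumerate(nonamer_stats(aligned)):
    variant_frequency = 1.0 - fraction  # share of sequences with any other peptide
    print(start, peptide, round(fraction, 2), round(variant_frequency, 2), round(entropy, 3))

print(conserved_regions(aligned))  # [(0, 11)]
```

In the toy data, the first three windows have a dominant peptide in at least 80% of sequences and merge into columns 0 to 10, while the last window drops to 60% and is excluded. A low entropy and a low variant frequency in a window correspond to the kind of stability the text describes for the pan-DENV sequences.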