Date Published: July 2, 2008
Publisher: Public Library of Science
Author(s): James Wasmuth, Ralf Schmid, Ann Hedley, Mark Blaxter, Elodie Ghedin
Abstract: Background: The phylum Nematoda is biologically diverse, including parasites of plants and animals as well as free-living taxa. Underpinning this diversity will be commensurate diversity in expressed genes, including gene sets associated specifically with evolution of parasitism. Methods and Findings: Here we have analyzed the extensive expressed sequence tag data (available for 37 nematode species, most of which are parasites) and define over 120,000 distinct putative genes from which we have derived robust protein translations. Combined with the complete proteomes of Caenorhabditis elegans and Caenorhabditis briggsae, these proteins have been grouped into 65,000 protein families that in turn contain 40,000 distinct protein domains. We have mapped the occurrence of domains and families across the Nematoda and compared the nematode data to that available for other phyla. Gene loss is common, and in particular we identify nearly 5,000 genes that may have been lost from the lineage leading to the model nematode C. elegans. We find a preponderance of novelty, including 56,000 nematode-restricted protein families and 26,000 nematode-restricted domains. Mapping of the latest time-of-origin of these new families and domains across the nematode phylogeny revealed ongoing evolution of novelty. A number of genes from parasitic species had signatures of horizontal transfer from their host organisms, and parasitic species had a greater proportion of novel, secreted proteins than did free-living ones. Conclusions: These classes of genes may underpin parasitic phenotypes, and thus may be targets for development of effective control measures.
Partial Text: The vast majority of species are unlikely to be selected for whole genome sequencing, whatever their importance in terms of evolution, health and ecology. The few eukaryote species selected for such projects, despite their utility in laboratory investigation, are unlikely to be representative of the genomic diversity of speciose phyla. For example, Arthropoda and Nematoda have over one million species each, and the ∼20 genomes completed or in sequencing will illuminate only small parts of their diversity. Expressed sequence tags (ESTs) have proved to be a cost-effective and rapid method for identification of the genes from a target species. Although the largest EST collections have been generated primarily for the annotation of complete genome sequences (e.g. human and mouse), more than half the sequences in GenBank's EST depository (dbEST) are from otherwise neglected genomes. One phylum that has benefited from an EST sequencing approach is the Nematoda.