Research Article: Locus-aware decomposition of gene trees with respect to polytomous species trees

Date Published: June 4, 2018

Publisher: BioMed Central

Author(s): Michał Aleksander Ciach, Anna Muszewska, Paweł Górecki.

http://doi.org/10.1186/s13015-018-0128-1

Abstract

Horizontal gene transfer (HGT), a process of acquisition and fixation of foreign genetic material, is an important biological phenomenon. Several approaches to HGT inference have been proposed. However, most of them either rely on approximate, non-phylogenetic methods or on the tree reconciliation, which is computationally intensive and sensitive to parameter values.

We investigate the locus tree inference problem as a possible alternative that combines the advantages of both approaches. We present several algorithms to solve the problem in the parsimony framework. We introduce a novel tree mapping, which allows us to obtain a heuristic solution to the problems of locus tree inference and duplication classification.

Our approach allows for faster comparisons of gene and species trees and improves known algorithms for duplication inference in the presence of polytomies in the species trees. We have implemented our algorithms in a software tool available at https://github.com/mciach/LocusTreeInference.

Partial Text

Horizontal gene transfer (HGT) is the process of acquisition and fixation of foreign genetic material. It can lead to substantial changes in the ecology and evolution of recipient organism, sometimes leading to the emergence of new pathogens [1]. HGT is interesting both from biological and computational perspective. Several methods of detecting horizontally transferred genes have been proposed, which can be roughly divided into two categories [2]. So-called surrogate methods are computationally efficient, yet often imprecise. The other group consists of the phylogenetic methods, most notably the tree reconciliation [3].

We have run numerical simulations to assess the performance of the proposed algorithms. First, we have run the heuristic and dynamic programming algorithms on pairs of random trees to compare the inferred forest sizes and loss costs. Next, we have performed simulations of realistic evolutionary scenarios to check the correctness of the algorithms’ results.

In this work, we have investigated two new problems, LTI and CLTI, for locus tree inference in a parsimony framework, defined as problems of decomposing a gene tree into trees consistent with a species tree. We have proposed a new mapping, called the highest separating rank, which has been applied to improve the classification of duplications and to solve CLTI. We have proposed two memory and time efficient solutions to the proposed problems: an documentclass[12pt]{minimal}
usepackage{amsmath}
usepackage{wasysym}
usepackage{amsfonts}
usepackage{amssymb}
usepackage{amsbsy}
usepackage{mathrsfs}
usepackage{upgreek}
setlength{oddsidemargin}{-69pt}
begin{document}$$O(n^2)$$end{document}O(n2) dynamic programming algorithm for LTI and a near linear time heuristic for CLTI designed to solve large instances. Next, to verify the evolutionary consistency of the output our algorithms, we have proposed a validation method based on the model of evolutionary scenarios with HGTs. Finally, we have performed a number of tests on simulated data showing that these algorithms detect evolutionary events with high accuracy, and performed a proof-of-concept analysis of an empirical gene tree. Our results suggest that the new mapping, combined with the lca-mapping, can be used to locate cases of gene duplications and horizontal gene transfers.

 

Source:

http://doi.org/10.1186/s13015-018-0128-1

 

Leave a Reply

Your email address will not be published.