Do triplets have enough information to construct the multi-labeled phylogenetic tree?
Bottom Line:
In this paper, we show that the SMRT does not seem to be an appropriate solution from the biological point of view. The results of MTRT show that triplets alone cannot provide enough information to infer the true MUL tree. Finally, we introduce some new problems which are more suitable from the biological point of view.
View Article:
PubMed Central - PubMed
Affiliation: Department of Mathematics, Shahid Beheshti University, G.C., Tehran, Iran.
ABSTRACT
The evolutionary history of certain species, such as polyploids, is modeled by a generalization of phylogenetic trees called multi-labeled phylogenetic trees, or MUL trees for short. One problem that relates to inferring a MUL tree is how to construct the smallest possible MUL tree that is consistent with a given set of rooted triplets, or the SMRT problem for short. This problem is NP-hard. There is one exact algorithm for the SMRT problem, which runs in O(7^n) time, where n is the number of taxa. In this paper, we show that the SMRT does not seem to be an appropriate solution from the biological point of view. Indeed, we present a heuristic algorithm named MTRT for this problem and execute it on some real and simulated datasets. The results of MTRT show that triplets alone cannot provide enough information to infer the true MUL tree. So, it is inappropriate to infer a MUL tree using triplet information alone and considering the minimum number of duplications. Finally, we introduce some new problems which are more suitable from the biological point of view.
To test the performance of MTRT on real biological data, we applied it to three datasets. The first and second datasets contain high-polyploid North American and Hawaiian violets [17]. All major morphological groups occurring in North America were sampled. All sequences were aligned with MUSCLE [7], and phylogenies were constructed using maximum likelihood. The third dataset, containing the flowering plant genus Silene (Caryophyllaceae), was published in [21]. The gene trees in [21] were reconstructed using standard techniques in phylogenetic analysis from regions of the nuclear RNA polymerase gene family, two concatenated chloroplast regions and one nuclear ribosomal region; see [10] for more details. For each original MUL tree, we extracted all triplets and then applied MTRT to these triplets. In all cases, MTRT constructs a MUL tree that has fewer duplications than the original MUL tree. The original MUL trees for the first and second datasets have 13 and 20 duplications, whereas the MUL trees produced by MTRT have 11 and 18 duplications, respectively. Due to space limitations, the MUL trees for only one of these datasets are shown. Figure 3 and Figure 4 show the original MUL tree and the MUL tree constructed by MTRT for the triplet set extracted from the original MUL tree, respectively. The original MUL tree for the third dataset has 7 duplications, whereas the MUL tree produced by MTRT has 5 duplications. Figure 5 and Figure 6 show the original MUL tree and the MUL tree constructed by MTRT, respectively. The labels represent Silene species, namely, S. ajanensis (A), S. uralensis (U), S. involucrata (I), S. sorensenis (S), S. ostenfeldii (O), S. zawadskii (Z), S. linnaeana (L), S. uralensis (Mongolia) (UM), S. samojedora (SAM), S. villosula (V), S. sachalinensis (SAC) and S. tolmatchevii (T).
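The triplet-extraction step described above can be illustrated with a minimal Python sketch. This is not the authors' MTRT code. It assumes a MUL tree is given as nested tuples with string leaf labels, that a rooted triplet ab|c is displayed when some leaves labeled a and b have a lowest common ancestor strictly below their common ancestor with some leaf labeled c, and that the number of duplications is the number of leaves minus the number of distinct labels. The function names and the toy tree are hypothetical.

```python
from itertools import combinations


def _walk(tree, out):
    """Return (labels, pairs) for a subtree and add its triplets to `out`.

    labels: leaf labels under the subtree (repeats kept).
    pairs:  frozensets {a, b}, a != b, joined somewhere inside the subtree.
    """
    if isinstance(tree, str):  # a leaf
        return [tree], set()
    results = [_walk(child, out) for child in tree]
    # ab|c: the pair {a, b} is joined inside one child, c sits in another child.
    for i, (_, pairs_i) in enumerate(results):
        outside = {lab for j, (labs, _) in enumerate(results) if j != i for lab in labs}
        for pair in pairs_i:
            for c in outside - pair:
                out.add((pair, c))
    labels = [lab for labs, _ in results for lab in labs]
    pairs = set().union(*(p for _, p in results))
    # Pairs whose lowest common ancestor is this node.
    for (labs_a, _), (labs_b, _) in combinations(results, 2):
        pairs |= {frozenset((a, b)) for a in labs_a for b in labs_b if a != b}
    return labels, pairs


def extract_triplets(tree):
    """All rooted triplets (frozenset({a, b}), c), read as ab|c, displayed by the tree."""
    out = set()
    _walk(tree, out)
    return out


def duplications(tree):
    """Assumed measure: number of leaves minus number of distinct labels."""
    labels, _ = _walk(tree, set())
    return len(labels) - len(set(labels))


# Hypothetical toy MUL tree in which the label "a" occurs twice.
toy = ((("a", "b"), "c"), ("a", "d"))
toy_triplets = extract_triplets(toy)
```

On this toy tree, both ab|c and bc|a are displayed, because the second copy of "a" sits outside the clade containing "b" and "c". A single-labeled tree could not display both, which hints at why a triplet set extracted from a MUL tree need not determine that tree.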