Statistical properties of pairwise distances between leaves on a random Yule tree.
Affiliation: Max Planck Institute for Molecular Genetics, Berlin, Germany.
ABSTRACT
A Yule tree is the result of a branching process with constant birth and death rates. Such a process serves as an instructive model of many empirical systems, for instance, the evolution of species leading to a phylogenetic tree. However, often in phylogeny the only available information is the pairwise distances between a small fraction of extant species representing the leaves of the tree. In this article we study statistical properties of the pairwise distances in a Yule tree. Using a method based on a recursion, we derive an exact, analytic and compact formula for the expected number of pairs separated by a certain time distance. This number turns out to follow an increasing exponential function. This property of a Yule tree can serve as a simple test of whether empirical data are well described by a Yule process. We further use this recursive method to calculate the expected number of the n-most closely related pairs of leaves and the number of cherries separated by a certain time distance. To make our results more useful for realistic scenarios, we explicitly take into account that the leaves of a tree may be incompletely sampled and derive a criterion for poorly sampled phylogenies. We show that our result can account for empirical data, using two families of bird species.
In this Section we demonstrate the relevance of the obtained analytic formulas to empirical data by studying the pairwise distances between species in families of the evolutionary tree. For comparison with the derived results we choose N(t∣T), N_n(t∣T) with n = 1, 2, 3, 4, and N_Λ(t∣T). The results are presented in Fig. 5 for the Sylviidae family of birds (see one of the reconstructed trees for this family and its distance matrix in Fig. 1) and in Fig. 6 for the Tyrannidae family of birds. For every family we analyze a Bayesian sample of 1000 trees downloaded from the database [28]. Namely, we collect the pairwise distances, the n-minimal distances and the distances between cherries from all 1000 trees and plot the histograms of these distances (with the y-axis divided by 1000) in Figs. 5 and 6. We fit all the points in a figure using the iteratively reweighted least squares algorithm [29] in Matlab. Unfortunately, the explicit dependencies on λ and μ in Equations (14, 20, 21) are insufficient to estimate all parameters. Instead, one can estimate from the fit only the effective growth rate, λ−μ, and λσ. The value of σ can then be obtained by assuming a certain ratio μ/λ. In the captions of Figs. 5 and 6 we present the obtained estimates of σ for different assumptions about the ratio μ/λ.
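The pooling and fitting procedure described above can be illustrated with a minimal Python sketch. It is not the authors' Matlab code. The function names, the Huber weighting, and the choice to fit a straight line to the logarithm of the averaged counts are all assumptions made for illustration. The sketch fits a generic exponential A·exp(r·t); relating A and r to λ−μ and λσ requires Equations (14, 20, 21), which are not reproduced here.

```python
import numpy as np


def averaged_histogram(distances_per_tree, bin_edges):
    """Pool distances from all sampled trees; return bin centers and
    counts per bin divided by the number of trees."""
    bin_edges = np.asarray(bin_edges, dtype=float)
    pooled = np.concatenate(
        [np.asarray(d, dtype=float) for d in distances_per_tree]
    )
    counts, _ = np.histogram(pooled, bins=bin_edges)
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return centers, counts / len(distances_per_tree)


def fit_exponential_irls(t, y, n_iter=20, huber_k=1.345):
    """Fit log(y) = log(A) + r * t by iteratively reweighted least squares.

    Huber weights are an assumption of this sketch; the source only says
    that an IRLS algorithm in Matlab was used. Returns (A, r).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = y > 0                      # empty bins have no logarithm
    t, log_y = t[keep], np.log(y[keep])
    X = np.column_stack([np.ones_like(t), t])
    w = np.ones_like(t)
    for _ in range(n_iter):
        sw = np.sqrt(w)
        # Weighted least squares step.
        beta, *_ = np.linalg.lstsq(X * sw[:, None], log_y * sw, rcond=None)
        resid = log_y - X @ beta
        # Robust scale estimate from the median absolute residual.
        scale = np.median(np.abs(resid)) / 0.6745
        if scale < 1e-12:             # fit is already (numerically) exact
            break
        u = np.abs(resid) / (huber_k * scale)
        w = 1.0 / np.maximum(u, 1.0)  # Huber weights: 1 inside, 1/u outside
    return float(np.exp(beta[0])), float(beta[1])


if __name__ == "__main__":
    # Synthetic stand-in for the per-tree distance lists (hypothetical data).
    rng = np.random.default_rng(0)
    fake_trees = [rng.uniform(0.0, 10.0, size=200) for _ in range(1000)]
    centers, avg_counts = averaged_histogram(fake_trees, np.linspace(0, 10, 21))
    print(fit_exponential_irls(centers, avg_counts))
```

Fitting in log space keeps each IRLS step a linear solve. A nonlinear robust fit on the raw counts would weight the bins differently and could give somewhat different estimates.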