Statistical properties of pairwise distances between leaves on a random Yule tree.

Sheinman M, Massip F, Arndt PF - PLoS ONE (2015)

Bottom Line: A Yule tree is the result of a branching process with constant birth and death rates. To make our results more useful for realistic scenarios, we explicitly take into account that the leaves of a tree may be incompletely sampled and derive a criterion for poorly sampled phylogenies. We show that our result can account for empirical data, using two families of bird species.


Affiliation: Max Planck Institute for Molecular Genetics, Berlin, Germany.

ABSTRACT
A Yule tree is the result of a branching process with constant birth and death rates. Such a process serves as an instructive model of many empirical systems, for instance, the evolution of species leading to a phylogenetic tree. However, in phylogeny the only available information is often the pairwise distances between a small fraction of extant species representing the leaves of the tree. In this article we study statistical properties of the pairwise distances in a Yule tree. Using a method based on a recursion, we derive an exact, analytic and compact formula for the expected number of pairs separated by a certain time distance. This number turns out to follow an increasing exponential function. This property of a Yule tree can serve as a simple test of whether empirical data are well described by a Yule process. We further use this recursive method to calculate the expected number of the n-most closely related pairs of leaves and the number of cherries separated by a certain time distance. To make our results more useful for realistic scenarios, we explicitly take into account that the leaves of a tree may be incompletely sampled and derive a criterion for poorly sampled phylogenies. We show that our result can account for empirical data, using two families of bird species.
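The recursive view of pairwise distances described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' derivation: it assumes a pure-birth tree (death rate μ = 0) with complete sampling, and the function name `simulate_pairs` and the parameter values are hypothetical. At every split, each leaf of the left subtree paired with each leaf of the right subtree is separated by twice the time elapsed since that split.

```python
import random


def simulate_pairs(birth_rate, t_start, t_end, rng):
    """Simulate a pure-birth subtree from t_start to t_end.

    Returns (n_leaves, pairs), where pairs is a list of
    (pairwise_distance, number_of_leaf_pairs) tuples.
    Assumption: no extinction and all leaves are sampled.
    """
    # Waiting time until this lineage splits is exponential.
    split_time = t_start + rng.expovariate(birth_rate)
    if split_time >= t_end:
        # The lineage reaches the present unsplit: one leaf, no pairs.
        return 1, []
    n_left, pairs_left = simulate_pairs(birth_rate, split_time, t_end, rng)
    n_right, pairs_right = simulate_pairs(birth_rate, split_time, t_end, rng)
    # Every left-right pair of leaves has this split as its common ancestor.
    cross = (2.0 * (t_end - split_time), n_left * n_right)
    return n_left + n_right, pairs_left + pairs_right + [cross]


rng = random.Random(0)
tree_age = 4.0  # hypothetical value of T, in units of 1 / birth_rate
n_leaves, pairs = simulate_pairs(1.0, 0.0, tree_age, rng)
total_pairs = sum(count for _, count in pairs)
print(n_leaves, total_pairs)
```

Averaging a histogram of the collected distances over many simulated trees gives an empirical curve that can be compared with the exponential growth stated above.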



Fig. 6 (pone.0120206.g006): Comparison of analytic predictions to the pairwise distances data of Tyrannidae family with M = 460 species taken from the database [28] with t ≤ 0.8 × 10⁸ Myr. The markers represent the empirical data, while the lines represent the analytic formulas with fitted parameters. (a) Pairwise distance distribution. (b) Minimal distance distribution. (c-e) n-minimal distance distribution. (d) Cherries distance distribution. The fit is performed for all points in the figure with t ≤ 0.5 to avoid a clear breakdown of the Yule tree assumptions for larger distances (see text). The lines are based on the following set of parameters: λ−μ = 8 × 10⁻⁸ yr⁻¹ and λσ = 6.4 × 10⁻⁸ yr⁻¹. For μ = 0, 0.2, 0.4, 0.6, 0.8 × λ this corresponds respectively to σ = 0.8, 0.64, 0.48, 0.32, 0.16.

Mentions: In this Section we demonstrate the relevance of the obtained analytic formulas to empirical data, studying the pairwise distances between species in families of the evolutionary tree. For comparison with the derived results we choose N(t∣T), N_n(t∣T) with n = 1, 2, 3, 4 and N_Λ(t∣T). The results are presented in Fig. 5 for the Sylviidae family of birds (see one of the reconstructed trees for this family and its distance matrix in Fig. 1) and in Fig. 6 for the Tyrannidae family of birds. For every family we analyze a Bayesian sample of 1000 trees downloaded from the database [28]. Namely, we collect pairwise distances, n-minimal distances and distances between cherries of all 1000 trees and plot the histograms of these distances (with the y-axis divided by 1000) in Figs. 5 and 6. We fit all the points in a figure using the iteratively reweighted least squares algorithm [29] in MATLAB. Unfortunately, the explicit dependencies on λ and μ in Equations (14, 20, 21) are insufficient to estimate all parameters. Instead, one can estimate from the fit only two quantities: the effective growth rate λ−μ and the product λσ. The value of σ can then be obtained by assuming a certain ratio μ/λ. In the captions of Figs. 5 and 6 we present the obtained estimates for σ for different assumptions about the ratio μ/λ.
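The last step, recovering σ from the two fitted quantities under an assumed ratio μ/λ, is simple arithmetic: λ = (λ−μ)/(1 − μ/λ) and σ = (λσ)/λ. The sketch below applies it to the fitted values quoted in the Fig. 6 caption; the function name `sigma_from_fit` is hypothetical and is not taken from the article.

```python
def sigma_from_fit(net_rate, birth_times_sigma, death_to_birth_ratio):
    """Recover sigma from the fitted lambda - mu and lambda * sigma.

    net_rate: fitted effective growth rate, lambda - mu
    birth_times_sigma: fitted product lambda * sigma
    death_to_birth_ratio: assumed ratio mu / lambda, in [0, 1)
    """
    if not 0.0 <= death_to_birth_ratio < 1.0:
        raise ValueError("mu / lambda must lie in [0, 1)")
    # lambda - mu = lambda * (1 - mu / lambda), so solve for lambda.
    birth_rate = net_rate / (1.0 - death_to_birth_ratio)
    return birth_times_sigma / birth_rate


# Fitted values quoted in the Fig. 6 caption (Tyrannidae), in 1/yr.
net_rate = 8e-8
birth_times_sigma = 6.4e-8
ratios = [0.0, 0.2, 0.4, 0.6, 0.8]
sigmas = [sigma_from_fit(net_rate, birth_times_sigma, r) for r in ratios]
for ratio, sigma in zip(ratios, sigmas):
    print(f"mu/lambda = {ratio:.1f}  ->  sigma = {sigma:.2f}")
```

These inputs reproduce the σ values listed in the Fig. 6 caption.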

