General functions to transform associate data to host data, and their use in phylogenetic inference from sequences with intra-individual variability.

Göker M, Grimm GW - BMC Evol. Biol. (2008)

Bottom Line: The results agree well with these three measures and the datasets examined as well as with the theoretical predictions and previous results in the literature. Regarding cloned sequences, the formulae have a high potential to accurately reflect evolutionary relationships within angiosperm genera, and to identify hybrids and ancestral taxa. These results corroborate earlier ones which showed that treelikeness measures are a valuable tool in comparative studies of biological distance functions.


Affiliation: Organismic Botany, Eberhard-Karls-University, Auf der Morgenstelle 1, Tübingen, Germany. markus.goeker@uni-tuebingen.de

ABSTRACT

Background: Amongst the most commonly used molecular markers for plant phylogenetic studies are the nuclear ribosomal internal transcribed spacers (ITS). Intra-individual variability of these multicopy regions is a very common phenomenon in plants, the causes of which are debated in the literature. Phylogenetic reconstruction under these conditions is inherently difficult. Our approach is to consider this problem as a special case of the general biological question of how to infer the characteristics of hosts (represented here by plant individuals) from features of their associates (represented here by cloned sequences).

Results: Six general transformation functions are introduced, covering the transformation of associate characters to discrete and continuous host characters, and the transformation of associate distances to host distances. A pure distance-based framework is established in which these transformation functions are applied to ITS sequences collected from the angiosperm genera Acer, Fagus and Zelkova. The formulae are also applied to allelic data of three different loci obtained from Rosa spp. The functions are validated by (1) phylogeny-independent measures of treelikeness; (2) correlation with independent host characters; (3) visualization using splits graphs and comparison with published data on the test organisms. The results agree well with these three measures and the datasets examined as well as with the theoretical predictions and previous results in the literature. High-quality distance matrices are obtained with four of the six transformation formulae. We demonstrate that one of them represents a generalization of the Sørensen coefficient, which is widely applied in ecology.

Conclusion: Because of their generality, the transformation functions may be applied to a wide range of biological problems that are interpretable in terms of hosts and associates. Regarding cloned sequences, the formulae have a high potential to accurately reflect evolutionary relationships within angiosperm genera, and to identify hybrids and ancestral taxa. These results corroborate earlier ones which showed that treelikeness measures are a valuable tool in comparative studies of biological distance functions.
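For orientation, the classical Sørensen coefficient mentioned in the Results can be sketched as a distance between two hosts, each described by the set of associates found on it. This is only the standard ecological coefficient, not the generalized transformation formula of the paper, which is not reproduced here; the function name and the example labels are hypothetical.

```python
def sorensen_distance(associates_a, associates_b):
    """Classical Sørensen dissimilarity between two hosts.

    Each host is represented by the collection of associates found on it
    (for example, the distinct sequence variants among its clones).
    Returns 0.0 for identical sets and 1.0 for disjoint sets.
    """
    a, b = set(associates_a), set(associates_b)
    if not a and not b:
        # The coefficient is undefined when both hosts have no associates.
        raise ValueError("at least one host must have an associate")
    shared = len(a & b)
    return 1.0 - 2.0 * shared / (len(a) + len(b))


# Hypothetical example: two plant individuals sharing two of their variants.
host_1 = {"variant_1", "variant_2", "variant_3"}
host_2 = {"variant_2", "variant_3", "variant_4"}
distance = sorensen_distance(host_1, host_2)
```

The classical coefficient only records whether an associate is shared or not; it takes no account of how different two non-identical associates are.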



Figure 2: Data source vs. Delta values. Delta values (computed with DIST_STATS) of distance matrices obtained with a minimum of three associates plotted against data sources, i.e. the plant genera from which the cloned ITS sequences were obtained. Lower Delta values indicate higher treelikeness of the distance matrices. The boxplots indicate the positions of medians (thick horizontal lines), quartiles (boxes), outliers (short horizontal lines connected to the box with dashed lines), and extreme values (open circles).

Mentions: Regarding Delta values (DV) for entire distance matrices, results differed between data sources (Fig. 2; all DV are provided in Additional file 1, along with all the distance matrices obtained). Acer distances showed relatively low values, indicating high treelikeness; variability between the different methods applied was low. The DV for Acer distances were mostly located between 0.189 and 0.245, and received higher values with ENT distances only. In Fagus, the DV were much higher in general (0.214–0.346) and displayed more differences between the distance methods applied. In Zelkova, the overall highest treelikeness was observed (a DV of 0.132 was obtained with MOD distances and gaps treated as a 5th character state), but variability between the distance functions was also highest (up to a DV of 0.314 for ENT distances combined, with gaps treated as missing data).
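As an illustration of how a Delta value for an entire distance matrix might be obtained, the sketch below assumes the quartet-based definition commonly used for delta plots: for every four taxa, the three pairwise sums of distances are sorted, and the quartet value is the gap between the two largest sums divided by the gap between the largest and the smallest. Whether DIST_STATS computes exactly this quantity is an assumption; the function names are hypothetical.

```python
from itertools import combinations


def quartet_delta(d, x, y, u, v):
    """Delta value of one quartet; 0 means the four distances fit a tree."""
    m1, m2, m3 = sorted([
        d[x][y] + d[u][v],
        d[x][u] + d[y][v],
        d[x][v] + d[y][u],
    ])
    if m3 == m1:
        # All three sums are equal (star-like quartet): treated as treelike.
        return 0.0
    return (m3 - m2) / (m3 - m1)


def mean_delta(d):
    """Average quartet delta over all quartets of a symmetric matrix d."""
    n = len(d)
    if n < 4:
        raise ValueError("at least four taxa are required")
    values = [quartet_delta(d, *q) for q in combinations(range(n), 4)]
    return sum(values) / len(values)


# Distances that fit the tree ((A,B),(C,D)) exactly give a Delta value of 0.
tree_like = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
overall = mean_delta(tree_like)
```

Under this definition the values lie between 0 and 1, and lower values mean higher treelikeness, which matches the reading of Fig. 2 given above.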

