Impact of the partitioning scheme on divergence times inferred from Mammalian genomic data sets.

Voloch CM, Schrago CG - Evol. Bioinform. Online (2012)

Bottom Line: The effect of the partitioning scheme on divergence time estimates has generally been ignored. After inferring divergence times using the uncorrelated relaxed clock in BEAST, we compared the age estimates between the partitioning schemes. Our results show that, in general, both schemes yielded similar chronological estimates; however, the concatenated data sets were more efficient than the partitioned ones in attaining suitable effective sample sizes.


Affiliation: Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil.

ABSTRACT
Data partitioning has long been regarded as an important parameter for phylogenetic inference. The division of heterogeneous multigene data sets into partitions with similar substitution patterns is known to increase the performance of probabilistic phylogenetic methods. However, the effect of the partitioning scheme on divergence time estimates has generally been ignored. To investigate the impact of data partitioning on the estimation of divergence times, we constructed two genomic data sets. The first, comprising 15 nuclear genes (50,928 bp), was selected from the OrthoMam database; the second was composed of complete mitochondrial genomes. We studied two partitioning schemes: concatenated supermatrices and partitioned gene analysis. We also measured the impact of taxonomic sampling on the estimates. After inferring divergence times using the uncorrelated relaxed clock in BEAST, we compared the age estimates between the partitioning schemes. Our results show that, in general, both schemes yielded similar chronological estimates; however, the concatenated data sets were more efficient than the partitioned ones in attaining suitable effective sample sizes.
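The abstract's efficiency comparison rests on effective sample size (ESS), which discounts the length of an MCMC trace by its autocorrelation. The following is a minimal sketch of one common textbook estimator, assuming a trace of sampled values for a single parameter (for example, a node age). It is not necessarily the estimator used by BEAST or its companion tools, and the function names are hypothetical.

```python
def autocorrelation(trace, lag):
    """Sample autocorrelation of `trace` at the given positive lag."""
    n = len(trace)
    mean = sum(trace) / n
    var = sum((v - mean) ** 2 for v in trace) / n
    if var == 0:
        return 0.0
    cov = sum(
        (trace[i] - mean) * (trace[i + lag] - mean) for i in range(n - lag)
    ) / n
    return cov / var


def effective_sample_size(trace, max_lag=None):
    """ESS = n / (1 + 2 * sum of autocorrelations).

    Assumption: the sum is truncated at the first non-positive
    autocorrelation, a simple and widely used heuristic.
    """
    n = len(trace)
    if max_lag is None:
        max_lag = n // 2
    rho_sum = 0.0
    for lag in range(1, max_lag):
        rho = autocorrelation(trace, lag)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)
```

A strongly autocorrelated chain returns an ESS far below its raw length, so under this estimator a scheme that mixes poorly needs a longer run to reach the same ESS.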



f7-ebo-8-2012-207: Phylogenies used in this study. Notes: The magnitude of the difference between the node ages from the concatenated and partitioned schemes is represented in each node using the color scale shown at the bottom of the figure.

Mentions: One of our findings was that the greatest difference between the partitioning schemes occurred for the nodes close to the basal Laurasiatheria split. In the nuclear data set, these nodes were the Atlantogenata/Boreoeutheria separation, the basal Boreoeutheria divergence, the split between insectivores and Ferungulata (basal Laurasiatheria), and the basal Ferungulata divergence (Fig. 7). The two schemes also assigned discrepant ages to the basal Ferungulata split in the mitochondrial data set (Fig. 7). One reason for the lower efficiency of the nuclear data set at these nodes might be the large variation commonly found in the coalescence times of nuclear genes [29]. Indeed, the resolution of the phylogenetic branching between the superorders of Mammalia and the early evolution of Laurasiatheria are the most difficult problems in mammalian phylogenomics [16,17,30].
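The per-node comparison described here can be sketched as follows. This is only an illustration: the excerpt does not state how the magnitude of the difference was computed, so the relative-difference measure is an assumption, the function names are hypothetical, and the node labels and ages in the usage example are placeholders rather than values from the study.

```python
def relative_age_differences(concat_ages, part_ages):
    """Relative difference in node age between two schemes.

    Both arguments map a node label to an estimated age.
    Assumption: the difference is scaled by the mean of the two
    estimates; the source does not specify its measure.
    """
    diffs = {}
    for node in concat_ages.keys() & part_ages.keys():
        a, b = concat_ages[node], part_ages[node]
        diffs[node] = abs(a - b) / ((a + b) / 2)
    return diffs


def rank_nodes(diffs):
    """Node labels ordered from most to least discrepant."""
    return sorted(diffs, key=diffs.get, reverse=True)


if __name__ == "__main__":
    # Placeholder values only, not estimates from the paper.
    concatenated = {"node_A": 100.0, "node_B": 50.0}
    partitioned = {"node_A": 110.0, "node_B": 50.0}
    diffs = relative_age_differences(concatenated, partitioned)
    print(rank_nodes(diffs))
```

Ranking the nodes this way makes it easy to check whether the most discrepant ones cluster in one region of the tree, as the text reports for the nodes near the basal Laurasiatheria split.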

