Impact of the partitioning scheme on divergence times inferred from Mammalian genomic data sets.

Voloch CM, Schrago CG - Evol. Bioinform. Online (2012)

Bottom Line: However, the effect of the partitioning scheme on divergence time estimates has generally been ignored. After drawing divergence time inferences using the uncorrelated relaxed clock in BEAST, we have compared the age estimates between the partitioning schemes. Our results show that, in general, both schemes resulted in similar chronological estimates; however, the concatenated data sets were more efficient than the partitioned ones in attaining suitable effective sample sizes.


Affiliation: Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil.

ABSTRACT
Data partitioning has long been regarded as an important factor in phylogenetic inference. The division of heterogeneous multigene data sets into partitions with similar substitution patterns is known to increase the performance of probabilistic phylogenetic methods. However, the effect of the partitioning scheme on divergence time estimates has generally been ignored. To investigate the impact of data partitioning on the estimation of divergence times, we have constructed two genomic data sets. The first consisted of 15 nuclear genes, comprising 50,928 bp, selected from the OrthoMam database; the second was composed of complete mitochondrial genomes. We studied two partitioning schemes: concatenated supermatrices and partitioned gene analysis. We have also measured the impact of taxonomic sampling on the estimates. After drawing divergence time inferences using the uncorrelated relaxed clock in BEAST, we have compared the age estimates between the partitioning schemes. Our results show that, in general, both schemes resulted in similar chronological estimates; however, the concatenated data sets were more efficient than the partitioned ones in attaining suitable effective sample sizes.
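The effective sample size (ESS) used above as the efficiency criterion can be illustrated with a small sketch. This is not the estimator used by BEAST or its companion tools; it is a textbook autocorrelation-based approximation, ESS = N / (1 + 2 Σ ρ_k), with the sum truncated at the first non-positive autocorrelation. The function name `effective_sample_size` and the toy traces are assumptions made for illustration only.

```python
from typing import Sequence


def effective_sample_size(trace: Sequence[float]) -> float:
    """Approximate ESS of an MCMC trace as N / (1 + 2 * sum of autocorrelations).

    Illustrative estimator only: the sum over lags stops at the first
    non-positive autocorrelation. Real tools use more careful truncation rules.
    """
    n = len(trace)
    if n < 2:
        return float(n)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        # A constant trace carries no autocorrelation information.
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / var
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Toy traces (hypothetical values, not data from the study).
sticky = [0.0] * 50 + [1.0] * 50   # strongly autocorrelated chain
alternating = [1.0, -1.0] * 5      # negatively autocorrelated chain
ess_sticky = effective_sample_size(sticky)
ess_alternating = effective_sample_size(alternating)
```

A strongly autocorrelated chain, such as `sticky`, yields an ESS far below its length, which is the sense in which one analysis can be less efficient than another at reaching a suitable ESS for the same number of samples.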



Phylogenies used in this study. Topologies (A–C) refer to the taxonomic compositions 1 (A), 2 (B) and 3 (C) of the nuclear data set. Topologies (D–F) refer to the taxonomic compositions 1 (D), 2 (E) and 3 (F) of the mitochondrial data set. Phylogenies (C and F) were inferred in PhyML. Note: Black circles indicate nodes in which the aLRT value was lower than 0.9; otherwise, the aLRT statistic = 1.
© Copyright Policy - open-access


Mentions: We have constructed two phylogenomic data sets to investigate the impact of data partitioning on mammalian divergence times. In each data set, chronological inferences were obtained by concatenating all of the genes in a single supermatrix or by allowing the partitions to have independent evolutionary parameters. To evaluate the behavior of the chronological estimates with increasing taxonomic sampling, we have studied three species compositions with increasing numbers of terminals in each data set (Fig. 1).
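The difference between the two schemes can be sketched in a few lines. The example below is a minimal illustration, not the authors' pipeline: it joins per-gene alignments into a single supermatrix and records the column range of each gene, which is the information a partitioned analysis would use to assign independent evolutionary parameters. The function name `concatenate`, the use of `?` for taxa missing from a gene, and the toy sequences are all assumptions.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence


def concatenate(
    genes: Dict[str, Alignment],
) -> Tuple[Alignment, List[Tuple[str, int, int]]]:
    """Join per-gene alignments into one supermatrix and record gene boundaries.

    Returns the supermatrix and a list of (gene, start, end) column ranges,
    1-based and inclusive. Taxa absent from a gene are padded with '?'.
    """
    taxa = sorted({t for aln in genes.values() for t in aln})
    supermatrix = {t: "" for t in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for name, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {name!r} is not aligned")
        length = lengths.pop()
        for t in taxa:
            supermatrix[t] += aln.get(t, "?" * length)
        partitions.append((name, start, start + length - 1))
        start += length
    return supermatrix, partitions


# Toy input (hypothetical sequences, not data from the study).
toy = {
    "geneA": {"Homo": "ACGT", "Mus": "ACGA"},
    "geneB": {"Homo": "GG", "Mus": "GC", "Bos": "GT"},
}
matrix, parts = concatenate(toy)
```

In a concatenated analysis only `matrix` is used and all sites share one set of parameters; in a partitioned analysis the ranges in `parts` mark where each gene's own parameters apply.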

