Orthology inference in nonmodel organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics.

Yang Y, Smith SA - Mol. Biol. Evol. (2014)

Bottom Line: The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. The methods have been implemented in Python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines.


Affiliation: Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor yangya@umich.edu eebsmith@umich.edu.


Fig. 3. Maximum-likelihood analysis of the HYM data set. Taxon names were abbreviated to the first four letters of the genus names except the left-most tree. Orthology inference methods: MI, maximum inclusion; RT, extracting rooted ingroup clades; MO, monophyletic outgroups; 1to1, filtered one-to-one orthologs. All nodes received bootstrap and 30% jackknife support values of 100 and are not shown. Node labels are also not shown if all support values are 100. Arrows indicate nodes with relatively low support.

Mentions: Species trees reconstructed from the HYM data set were overall highly consistent among all four orthology inference methods in topology, branch lengths, and support values (fig. 3). They had identical topologies to those in the analysis by Johnson et al. (2013). All branches received a support value of 100% from both the bootstrap and 30% jackknife analyses. Branches that received less-than-perfect support values using STAR or PhyloNet in Johnson et al. (2013) similarly received less-than-perfect jackknife support values in our 10% and/or 20-gene jackknife analyses. The node uniting Formicidae and Apoidea (marked with an arrow in fig. 3) received 81–97% jackknife support with around 100 loci and around 60% with 20 loci. Given that five of the nine taxa in this clade were from annotated genomes and the entire tree was otherwise well supported, this Formicidae + Apoidea node warrants further investigation of the source of the conflict.
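The gene jackknife mentioned above can be illustrated with a minimal sketch. It is not the authors' script: the function names `jackknife_replicates` and `clade_support` are hypothetical, and it assumes that a percentage such as 30% denotes the fraction of loci retained in each replicate, which the text here does not state explicitly. Each replicate draws loci without replacement, either as a fixed proportion or as a fixed count (for example 20 genes); a tree would then be inferred from each subset by whatever phylogenetic software is in use, and support for a node is the percentage of replicate trees that contain it.

```python
import random


def jackknife_replicates(loci, n_reps, proportion=None, n_loci=None, seed=0):
    """Return n_reps subsets of loci, each sampled without replacement.

    Exactly one of `proportion` (e.g. 0.3) or `n_loci` (e.g. 20) must be
    given. Treating the proportion as the fraction of loci *kept* is an
    assumption made for this illustration.
    """
    if (proportion is None) == (n_loci is None):
        raise ValueError("give exactly one of proportion or n_loci")
    if n_loci is not None:
        k = n_loci
    else:
        k = max(1, round(len(loci) * proportion))
    if k > len(loci):
        raise ValueError("cannot sample more loci than are available")
    rng = random.Random(seed)  # fixed seed so replicates are reproducible
    return [sorted(rng.sample(loci, k)) for _ in range(n_reps)]


def clade_support(clade_present):
    """Percentage of replicate trees in which a clade was recovered.

    `clade_present` holds one boolean per replicate tree; how each tree is
    inferred and checked for the clade is outside this sketch.
    """
    return 100.0 * sum(clade_present) / len(clade_present)
```

Under these assumptions, `jackknife_replicates(loci, 100, proportion=0.3)` would give subsets for a 30% jackknife, and `jackknife_replicates(loci, 100, n_loci=20)` would give subsets for a 20-gene jackknife.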

