Orthology inference in nonmodel organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics.
Bottom Line: The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. The methods have been implemented in Python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines.
Affiliation: Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor email@example.com firstname.lastname@example.org.
Mentions: Homology and orthology inference were conducted using the methods described above (for more details see Materials and Methods). The resulting ortholog occupancy curves were convex for HYM and GRP (fig. 2), indicating a high number of orthologs containing a high percentage of taxa, whereas the almost straight curves for MIL indicate that relatively few orthologs contain a high percentage of taxa. The shapes of the curves were determined by the divergence time and the completeness of the sequences in individual taxa (annotated genomes vs. transcriptomes/low-coverage genomes), whereas the orthology inference methods shifted the height and the slope of the curves.
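As an illustration of what an ortholog occupancy curve summarizes, here is a minimal sketch that ranks orthologs by the fraction of taxa they contain. It is not taken from the authors' scripts: the function names `occupancy_curve` and `count_at_least`, the input layout (a mapping from ortholog ID to the set of taxa present), and the toy data are all assumptions made for this example, and the published figure may plot the curves on different axes.

```python
from typing import Dict, List, Set


def occupancy_curve(orthologs: Dict[str, Set[str]], n_taxa: int) -> List[float]:
    """Return the fraction of taxa present in each ortholog, sorted high to low.

    `orthologs` maps a (hypothetical) ortholog ID to the set of taxa it contains.
    """
    if n_taxa <= 0:
        raise ValueError("n_taxa must be positive")
    return sorted((len(taxa) / n_taxa for taxa in orthologs.values()), reverse=True)


def count_at_least(curve: List[float], min_fraction: float) -> int:
    """Count orthologs whose taxon occupancy is at least `min_fraction`."""
    return sum(1 for fraction in curve if fraction >= min_fraction)


# Toy data (invented for illustration): four orthologs across four taxa.
toy_orthologs = {
    "ortho1": {"A", "B", "C", "D"},
    "ortho2": {"A", "B", "C"},
    "ortho3": {"A", "B"},
    "ortho4": {"A"},
}

curve = occupancy_curve(toy_orthologs, n_taxa=4)
print(curve)                        # [1.0, 0.75, 0.5, 0.25]
print(count_at_least(curve, 0.75))  # 2
```

Plotting the ranked fractions against ortholog rank gives one curve per data set. A curve that stays high over many orthologs corresponds to the pattern described for HYM and GRP, while a curve that drops off steadily corresponds to the pattern described for MIL.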