Bidirectional best hits miss many orthologs in duplication-rich clades such as plants and animals.

Dalquen DA, Dessimoz C - Genome Biol Evol (2013)

Bottom Line: However, limited by their analysis setup, the previous study could not easily test the reverse question: which proportion of orthologs are BBH? In this follow-up study, we consider this question in theory and answer it based on conceptual arguments, simulated data, and real biological data from all three domains of life. Our analyses corroborate the findings of the previous study, but also show that because of the high rate of gene duplication in plants and animals, as much as 60% of orthologous relations are missed by the BBH criterion.


Affiliation: Computational Biochemistry Research Group, ETH Zurich, Zürich, Switzerland.

ABSTRACT
Bidirectional best hits (BBH), which entails identifying the pairs of genes in two different genomes that are more similar to each other than either is to any other gene in the other genome, is a simple and widely used method to infer orthology. A recent study has analyzed the link between BBH and orthology in bacteria and archaea and concluded that, given the very high consistency in BBH they observed among triplets of neighboring genes, a high proportion of BBH are likely to be bona fide orthologs. However, limited by their analysis setup, the previous study could not easily test the reverse question: which proportion of orthologs are BBH? In this follow-up study, we consider this question in theory and answer it based on conceptual arguments, simulated data, and real biological data from all three domains of life. Our analyses corroborate the findings of the previous study, but also show that because of the high rate of gene duplication in plants and animals, as much as 60% of orthologous relations are missed by the BBH criterion.
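The BBH criterion described in the abstract can be illustrated with a minimal Python sketch. The gene names and similarity scores below are hypothetical, and the plain dictionaries stand in for whatever similarity search output is actually used; this is not the authors' pipeline. The toy example assumes that gene a1 in genome A has two co-orthologs in genome B (b1 and b1p, the products of a duplication in the B lineage).

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring hit in the other genome.

    `scores` maps (query, target) pairs to a similarity score.
    Ties are resolved in favor of the first pair encountered.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return the pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


# Hypothetical scores: a1 is orthologous to both b1 and b1p.
a_vs_b = {("a1", "b1"): 200, ("a1", "b1p"): 180, ("a2", "b2"): 150}
b_vs_a = {("b1", "a1"): 200, ("b1p", "a1"): 180, ("b2", "a2"): 150}

bbh = bidirectional_best_hits(a_vs_b, b_vs_a)
print(sorted(bbh))
```

In this toy case the pair (a1, b1p) is orthologous by construction but is not reported, because a1 can have only one best hit in genome B. This is the kind of one-to-many relation that the BBH criterion cannot recover.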


Fig. 2. Relationship between the proportion of non-1-to-1 orthology and precision/recall for BBH (in red) on simulated data sets with different proportions of genes with a history of duplications. Results for Inparanoid (green) and OMA/GETHOGs (blue) are given for comparison. Each point corresponds to the mean value of five replicates. Error bars give the 95% confidence interval of the mean values in both dimensions.

Mentions: To quantify the effect of gene duplication on the proportion of orthologs that are BBH, we simulated datasets of 30 genomes with different duplication rates using the software package ALF (Dalquen et al. 2012; see also Materials and Methods). We then used the Basic Local Alignment Search Tool (BLAST) (Altschul et al. 1990) to identify BBH gene pairs and compared these with the true orthologs as given by the simulation program. For comparison, we also analyzed the predictions of Inparanoid (Ostlund et al. 2010) and OMA/GETHOGs (Altenhoff et al. 2013). We computed the trends of the precision (proportion of predicted orthologs that are true orthologs) and the recall (proportion of true orthologs that are correctly predicted) as a function of the true proportion of non-1-to-1 orthology relations, which increases with the gene duplication rate. In line with the other two methods, the precision of BBH remained very high with increasing duplication rate, indicating that almost all gene pairs forming BBH are bona fide orthologs (fig. 2a). This part of our analysis corroborates the results of Wolf and Koonin (2012). In contrast, and unlike the more sophisticated methods, the recall of BBH decreased rapidly with increasing duplication rate (fig. 2b). This behavior indicates that the proportion of orthologs that are BBH decreases as the number of non-1-to-1 orthology relations increases.
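The precision and recall definitions used in this passage can be written out as a short Python sketch. The function name and the toy gene pairs are hypothetical and serve only to illustrate the arithmetic; they are not taken from the study's data. Pairs are treated as unordered, which is an assumption of this sketch.

```python
def precision_recall(predicted, truth):
    """Compute precision and recall of predicted ortholog pairs.

    precision = true positives / number of predicted pairs
    recall    = true positives / number of true pairs
    Pairs are compared as unordered sets, so (a, b) equals (b, a).
    """
    predicted = {frozenset(pair) for pair in predicted}
    truth = {frozenset(pair) for pair in truth}
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: a1 has two co-orthologs (b1 and b1p) after a
# duplication, but only one of them is reported as a BBH.
true_orthologs = [("a1", "b1"), ("a1", "b1p"), ("a2", "b2")]
bbh_predictions = [("a1", "b1"), ("a2", "b2")]

precision, recall = precision_recall(bbh_predictions, true_orthologs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here every predicted pair is a true ortholog, so precision stays at 1, while recall drops to 2/3 because one of the three true relations is missed. The same pattern, high precision with falling recall, is what the passage reports for BBH as duplications become more frequent.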

