An approach of orthology detection from homologous sequences under minimum evolution.

Kim KM, Sung S, Caetano-Anollés G, Han JY, Kim H - Nucleic Acids Res. (2008)

Bottom Line: For this reason, several methods based on evolutionary distance, phylogeny and BLAST have tried to detect orthologs with more precision. Calculation of evolutionary cost requires the reconstruction of a neighbor-joining (NJ) tree, but calculations are unaffected by the topology of any given NJ tree. Sensitivity and specificity estimates indicate that the concept of minimum evolution could be valuable for the detection of orthologs.


Affiliation: Department of Agricultural Biotechnology, Laboratory of Bioinformatics and Population Genetics, Seoul National University, Seoul 151-742, Korea.

ABSTRACT
In phylogenetics and comparative genomics, it is important to establish orthologous relationships when comparing homologous sequences. Because orthologs and paralogs differ only slightly in sequence, paralogs are easily mistaken for orthologs. For this reason, several methods based on evolutionary distance, phylogeny and BLAST have tried to detect orthologs with more precision. Depending on their algorithmic implementations, each of these methods can suffer from elevated false negative or false positive rates. Here, we developed a novel algorithm for orthology detection that uses a distance method based on the phylogenetic criterion of minimum evolution. Our algorithm assumes that sets of sequences exhibiting orthologous relationships are evolutionarily less costly than sets that include one or more paralogous relationships. Calculation of evolutionary cost requires the reconstruction of a neighbor-joining (NJ) tree, but calculations are unaffected by the topology of any given NJ tree. Unlike tree reconciliation, our algorithm appears free from the problem of incorrect topologies of species and gene trees. The reliability of the algorithm was tested in a comparative analysis with two other orthology detection methods using 95 manually curated KOG datasets and 21 experimentally verified EXProt datasets. Sensitivity and specificity estimates indicate that the concept of minimum evolution could be valuable for the detection of orthologs.


Figure 1: A conceptual representation of the orthologous and paralogous relationships that arise from gene duplication. (a) A simple phyletic history of gene duplication. In the hypothetical phylogenetic tree, the primary ancestral gene X duplicated into two ancestral descendants, α (red) and β (blue), which then diverged along with speciation into three species. (b) Unfolded phylogenetic trees showing orthologous and paralogous relationships. There are two orthologous clusters, [(αA, αB), αC] and [(βA, βB), βC]. Within each orthologous cluster, the label beside each branch denotes the length of that branch. (c) Comparison of minimum evolution scores between orthologous and paralogous relationships.

Mentions: Among the many phylogenetic methods that are used to reconstruct evolutionary history, the maximum parsimony (MP) method selects phylogenetic trees with minimum character changes. The minimum evolution (ME) method, an analog of the MP method that is based on genetic distance, regards the tree with the smallest sum of branch lengths among all possible phylogenetic trees as the most reliable one (17). In this study, we developed an algorithm based on the ME method. The phylogenetic relationships that result from a gene duplication event are represented using a simple tree diagram (Figure 1a). In the tree, two descendants (α and β) have diverged from an ancestral gene along with speciation (Figure 1a), forming two orthologous clusters and some paralogous relationships (Figure 1b). If a subset of homologous sequences consists of orthologs (αA, αB and αC), the sum of branch lengths (SBL) is α1+α2+α3+α4, which is less than the SBL (α1+α2+α4+α5+β3+β5) of the set αA, αB and βC, which involves the paralogous relationship between αC and βC (Figure 1c). Therefore, under the ME criterion, it can be postulated that the evolutionary cost of a cluster composed purely of orthologous sequences is less than that of clusters that include paralogous relationships. In this study, we adopted the neighbor-joining (NJ) method, in which the SBL is referred to as the ‘minimum evolution score’ (MES); the SBL was therefore calculated as the MES of an NJ tree (18).
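The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it only computes the sum of branch lengths of a neighbor-joining tree for a candidate set of sequences and compares an all-ortholog set with a set containing one paralog. The gene labels (`aA` for αA, `bC` for βC, and so on), the function names and all distances are hypothetical toy values chosen to mimic the tree shape of Figure 1, not data from the paper.

```python
from itertools import combinations


def nj_tree_length(names, dist):
    """Return the sum of branch lengths of the NJ tree built on `names`.

    `dist` maps frozenset({a, b}) to the pairwise distance between a and b.
    """
    nodes = list(names)
    d = {frozenset(p): dist[frozenset(p)] for p in combinations(nodes, 2)}
    total = 0.0
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all other current nodes.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a)
             for a in nodes}
        # Join the pair that minimizes the standard NJ Q criterion.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))  # branch from i to new node
        lj = dij - li                                  # branch from j to new node
        total += li + lj
        u = ("internal", next_id)  # label for the new internal node
        next_id += 1
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    if len(nodes) == 2:
        total += d[frozenset(nodes)]  # last remaining branch
    return total


def toy_distance(x, y):
    """Hypothetical additive distances for a duplication followed by speciation."""
    if x[0] != y[0]:
        return 10.0  # different paralog families: path crosses the duplication
    if {x[1], y[1]} == {"A", "B"}:
        return 2.0   # sister species within one ortholog cluster
    return 4.0       # species C versus A or B within one cluster


genes = ["aA", "aB", "aC", "bA", "bB", "bC"]
dist = {frozenset(p): toy_distance(*p) for p in combinations(genes, 2)}

mes_orthologs = nj_tree_length(["aA", "aB", "aC"], dist)
mes_with_paralog = nj_tree_length(["aA", "aB", "bC"], dist)
print(mes_orthologs, mes_with_paralog)  # 5.0 11.0
```

With these toy distances the all-ortholog set has the smaller score, which is the behavior the ME criterion postulates. Real data are not perfectly additive, so the margin between the two scores would be expected to vary.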

