Limits...
A clique-based method for the edit distance between unordered trees and its application to analysis of glycan structures.

Fukagawa D, Tamura T, Takasu A, Tomita E, Akutsu T - BMC Bioinformatics (2011)

Bottom Line: The result suggests that our proposed method can efficiently compute the edit distance for moderate size unordered trees.The proposed method is simple but useful for computation of the edit distance between unordered trees.The object code is available upon request.

View Article: PubMed Central - HTML - PubMed

Affiliation: Faculty of Culture and Information Science, Doshisha University, Kyoto 610-0394, Japan. dfukagaw@mail.doshisha.ac.jp

ABSTRACT

Background: Measuring similarities between tree structured data is important for analysis of RNA secondary structures, phylogenetic trees, glycan structures, and vascular trees. The edit distance is one of the most widely used measures for comparison of tree structured data. However, it is known that computation of the edit distance for rooted unordered trees is NP-hard. Furthermore, there is almost no available software tool that can compute the exact edit distance for unordered trees.

Results: In this paper, we present a practical method for computing the edit distance between rooted unordered trees. In this method, the edit distance problem for unordered trees is transformed into the maximum clique problem and then efficient solvers for the maximum clique problem are applied. We applied the proposed method to similar structure search for glycan structures. The result suggests that our proposed method can efficiently compute the edit distance for moderate size unordered trees. It also suggests that the proposed method has the accuracy comparative to those by the edit distance for ordered trees and by an existing method for glycan search.

Conclusions: The proposed method is simple but useful for computation of the edit distance between unordered trees. The object code is available upon request.

Show MeSH

Related in: MedlinePlus

Comparison of unordered and ordered tree edit distances
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC3044267&req=5

Figure 5: Comparison of unordered and ordered tree edit distances

Mentions: Here, we briefly explain the methodological differences among measures. Figure 5 illustrates the difference between unordered tree edit distance and ordered tree edit distance. As shown in Figure 5, suppose that there exist two trees T1 and T2 with roots r1 and r2. Suppose further that r1 has two subtrees A and B, and r2 has two subtrees B′ and A′ in these orders, where A and A′ (B and B′, respectively) are similar to each other, and A is larger than B. If two trees are regarded as ordered, ordered tree edit mapping takes only matching between A and A′ into account. Otherwise, unordered tree mapping takes matchings between A and A′, and between B and B′ into account. Figure 6 illustrates the difference of the deletion operation between tree edit and glycan alignment. In tree edit, all children of the deleted node u become children of the parent v of u. However, in glycan alignment, only one child can be a child of v and the other children are deleted along with their descendants, where the surviving child is chosen so that the resulting score is maximized. It is seen from these figures that the tree edit distance for unordered trees provides the most flexible matching.


A clique-based method for the edit distance between unordered trees and its application to analysis of glycan structures.

Fukagawa D, Tamura T, Takasu A, Tomita E, Akutsu T - BMC Bioinformatics (2011)

Comparison of unordered and ordered tree edit distances
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC3044267&req=5

Figure 5: Comparison of unordered and ordered tree edit distances
Mentions: Here, we briefly explain the methodological differences among measures. Figure 5 illustrates the difference between unordered tree edit distance and ordered tree edit distance. As shown in Figure 5, suppose that there exist two trees T1 and T2 with roots r1 and r2. Suppose further that r1 has two subtrees A and B, and r2 has two subtrees B′ and A′ in these orders, where A and A′ (B and B′, respectively) are similar to each other, and A is larger than B. If two trees are regarded as ordered, ordered tree edit mapping takes only matching between A and A′ into account. Otherwise, unordered tree mapping takes matchings between A and A′, and between B and B′ into account. Figure 6 illustrates the difference of the deletion operation between tree edit and glycan alignment. In tree edit, all children of the deleted node u become children of the parent v of u. However, in glycan alignment, only one child can be a child of v and the other children are deleted along with their descendants, where the surviving child is chosen so that the resulting score is maximized. It is seen from these figures that the tree edit distance for unordered trees provides the most flexible matching.

Bottom Line: The result suggests that our proposed method can efficiently compute the edit distance for moderate size unordered trees.The proposed method is simple but useful for computation of the edit distance between unordered trees.The object code is available upon request.

View Article: PubMed Central - HTML - PubMed

Affiliation: Faculty of Culture and Information Science, Doshisha University, Kyoto 610-0394, Japan. dfukagaw@mail.doshisha.ac.jp

ABSTRACT

Background: Measuring similarities between tree structured data is important for analysis of RNA secondary structures, phylogenetic trees, glycan structures, and vascular trees. The edit distance is one of the most widely used measures for comparison of tree structured data. However, it is known that computation of the edit distance for rooted unordered trees is NP-hard. Furthermore, there is almost no available software tool that can compute the exact edit distance for unordered trees.

Results: In this paper, we present a practical method for computing the edit distance between rooted unordered trees. In this method, the edit distance problem for unordered trees is transformed into the maximum clique problem and then efficient solvers for the maximum clique problem are applied. We applied the proposed method to similar structure search for glycan structures. The result suggests that our proposed method can efficiently compute the edit distance for moderate size unordered trees. It also suggests that the proposed method has the accuracy comparative to those by the edit distance for ordered trees and by an existing method for glycan search.

Conclusions: The proposed method is simple but useful for computation of the edit distance between unordered trees. The object code is available upon request.

Show MeSH
Related in: MedlinePlus