CoMOGrad and PHOG: From Computer Vision to Fast and Accurate Protein Tertiary Structure Retrieval.

Karim R, Aziz MM, Shatabda S, Rahman MS, Mia MA, Zaman F, Rakin S - Sci Rep (2015)

Bottom Line: Our proposed methods borrow ideas from the field of computer vision. The speed and accuracy of our methods come from two newly introduced features, the co-occurrence matrix of the oriented gradient and the pyramid histogram of oriented gradient, and from the use of Euclidean distance as the distance measure. Experimental results clearly indicate the superiority of our approach in both running time and accuracy.


Affiliation: AlEDA Group, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Bangladesh.

ABSTRACT
The number of entries in structural databases of proteins is increasing day by day. Methods for retrieving protein tertiary structures from such large databases have turned out to be key to the comparative analysis of structures, which plays an important role in understanding proteins and their functions. In this paper, we present fast and accurate methods for retrieving, from a large database, proteins whose tertiary structures are similar to that of a query protein. Our proposed methods borrow ideas from the field of computer vision. The speed and accuracy of our methods come from two newly introduced features, the co-occurrence matrix of the oriented gradient and the pyramid histogram of oriented gradient, and from the use of Euclidean distance as the distance measure. Experimental results clearly indicate the superiority of our approach in both running time and accuracy. Our method is readily available for use from this website: http://research.buet.ac.bd:8080/Comograd/.




Figure 3: Level 1 quad tree of α carbon distance matrix image.

Mentions: CoMOGrad gives us an ultra-fast structure retrieval algorithm, but it achieves this speed at the cost of some accuracy. From the discussion in the previous sections, it is clear that accurately capturing the tertiary structure of a protein requires the gradient magnitude and the spatial orientation of the gradient in addition to its angular orientation. The CoMOGrad feature includes only the angular orientation of the gradient and the co-occurrence of that angular orientation. To incorporate the gradient magnitude and the spatial orientation as well, we extract a second feature, the pyramid histogram of oriented gradient (PHOG), and use it together with our CoMOGrad feature to improve accuracy. PHOG was first proposed and successfully used in object classification and pattern recognition by Bosch et al. [32]. We build a quad tree with the original image at the root. Each node of the quad tree has four children, namely the top-left, top-right, bottom-left and bottom-right quadrants, each one fourth the size of its parent image. Fig. 3 shows a quad tree up to level 1. In our experiments, we have taken the quad tree up to level 3 and have achieved excellent results. A quad tree up to level 3 has 1 + 4 + 4 × 4 + 4 × 4 × 4 = 85 nodes. For each node, we create a gradient orientation histogram with 9 bins, each 40 degrees wide. This yields 85 × 9 = 765 features, which we collect into a vector of size 765 and normalize by dividing it by the sum of its 765 components. This is our PHOG feature vector. PHOG combined with CoMOGrad gives a total of 256 + 765 = 1021 features.
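The PHOG construction described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `phog` and its parameters `levels` and `bins` are hypothetical, the gradient is assumed to be computed with simple finite differences (`np.gradient`), and each histogram is assumed to be weighted by gradient magnitude, which is the usual way PHOG brings the magnitude in but is not spelled out in the text. Orientations are taken over the full 0 to 360 degree range so that 9 bins are 40 degrees wide.

```python
import numpy as np


def phog(image, levels=3, bins=9):
    """Sketch of a PHOG descriptor for a 2-D array (hypothetical helper)."""
    image = np.asarray(image, dtype=float)

    # Assumption: finite-difference gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)

    # Orientation mapped to [0, 360) degrees; 9 bins give 40 degrees per bin.
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bin_index = np.minimum((angle / (360.0 / bins)).astype(int), bins - 1)

    rows, cols = image.shape
    histograms = []
    for level in range(levels + 1):
        cells = 2 ** level  # quad tree level L splits each side into 2**L parts
        row_edges = np.linspace(0, rows, cells + 1).astype(int)
        col_edges = np.linspace(0, cols, cells + 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                r = slice(row_edges[i], row_edges[i + 1])
                c = slice(col_edges[j], col_edges[j + 1])
                # Assumption: each pixel votes with its gradient magnitude.
                hist = np.bincount(
                    bin_index[r, c].ravel(),
                    weights=magnitude[r, c].ravel(),
                    minlength=bins,
                )
                histograms.append(hist)

    # 1 + 4 + 16 + 64 = 85 nodes, 9 bins each: 765 components for levels=3.
    vector = np.concatenate(histograms)
    total = vector.sum()
    # Normalize by the sum of all components, as described in the text.
    return vector / total if total > 0 else vector
```

Under these assumptions, calling `phog` on a distance matrix image returns a 765-component vector for the default three levels. Concatenating it with a 256-component CoMOGrad vector would give the 1021 features mentioned above, which can then be compared with Euclidean distance.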

