Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5).

Aspeborg H, Coutinho PM, Wang Y, Brumer H, Henrissat B - BMC Evol. Biol. (2012)

Bottom Line: About 80% of the current sequences were assigned into 51 subfamilies in a global analysis of all publicly available GH5 sequences and associated biochemical data. Examination of subfamilies with catalytically-active members revealed that one third are monospecific (containing a single enzyme activity), although new functions may be discovered with biochemical characterization in the future. Furthermore, twenty subfamilies presently have no characterization whatsoever and many others have only limited structural and biochemical data.


Affiliation: Division of Glycoscience, School of Biotechnology, KTH - Royal Institute of Technology, AlbaNova University Center, Stockholm SE-106 91, Sweden.

ABSTRACT

Background: The large Glycoside Hydrolase family 5 (GH5) groups together a wide range of enzymes acting on β-linked oligo- and polysaccharides, and glycoconjugates from a large spectrum of organisms. The long and complex evolution of this family of enzymes and its broad sequence diversity limit functional prediction. With the objective of improving the differentiation of enzyme specificities in a knowledge-based context, and to obtain new evolutionary insights, we present here a new, robust subfamily classification of family GH5.

Results: About 80% of the current sequences were assigned into 51 subfamilies in a global analysis of all publicly available GH5 sequences and associated biochemical data. Examination of subfamilies with catalytically-active members revealed that one third are monospecific (containing a single enzyme activity), although new functions may be discovered with biochemical characterization in the future. Furthermore, twenty subfamilies presently have no characterization whatsoever and many others have only limited structural and biochemical data. Mapping of functional knowledge onto the GH5 phylogenetic tree revealed that the sequence space of this historical and industrially important family is far from well dispersed, highlighting targets in need of further study. The analysis also uncovered a number of GH5 proteins which have lost their catalytic machinery, indicating evolution towards novel functions.

Conclusion: Overall, the subfamily division of GH5 provides an actively curated resource for large-scale protein sequence annotation for glycogenomics; the subfamily assignments are openly accessible via the Carbohydrate-Active Enzyme database at http://www.cazy.org/GH5.html.


Figure 1: Phylogenetic tree of family GH5. In this circular phylogram, the branches corresponding to subfamilies 1–53 are shown in color and the subfamily numbers are indicated next to the exterior color circle. The branches corresponding to sequences not included in subfamilies are in black. A detailed version of this tree is found in Additional file 1: Figure S1.

Mentions: Our bioinformatics approach allowed the division of close to 2300 GH5 catalytic modules into 51 distinct subfamilies, as shown in the global phylogenetic tree (Figure 1 and Additional file 1: Figure S1); subfamily information is summarized in Table 1. Subfamily naming follows the procedure devised for GH13, where the family number is followed by an Arabic numeral that reflects the order of creation [24]: GH5_1 to GH5_53. This series is essentially continuous, with a few exceptions due to historical reasons. All of the previously described subfamilies (A1-A10) have been re-identified in the current investigation, except for A3 and A4, which are merged into a single subfamily, GH5_4, and A5 and A6, which are unified in subfamily GH5_5 (Figure 1). To maintain consistency with earlier literature, the re-identified historical subfamilies have retained their original Arabic numerals. For example, the subfamily formerly known as A2 is hereby designated GH5_2. The absence of subfamilies GH5_3 and GH5_6 reflects the two fusion events involving the historical subfamilies described above [19].
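
The naming rules described above can be sketched in a few lines of Python. This is only an illustration of the convention as stated in the text, not code from the authors or from the CAZy database; the function and variable names (historical_to_gh5, current_subfamilies, MERGED) are hypothetical.

```python
# Illustrative sketch of the GH5 subfamily naming convention described in the text.
# All names here are hypothetical; only the mapping rules come from the text.

# Historical subfamilies whose numeral was not retained after a fusion event:
# A3 was merged with A4 into GH5_4, and A6 was unified with A5 into GH5_5.
MERGED = {3: 4, 6: 5}


def historical_to_gh5(label: str) -> str:
    """Map a historical label (A1-A10) to the current subfamily name."""
    if not label.startswith("A") or not label[1:].isdigit():
        raise ValueError(f"not a historical subfamily label: {label!r}")
    n = int(label[1:])
    if not 1 <= n <= 10:
        raise ValueError(f"historical subfamilies run from A1 to A10, got {label!r}")
    # Re-identified subfamilies keep their original numeral unless merged.
    return f"GH5_{MERGED.get(n, n)}"


def current_subfamilies(highest: int = 53) -> list[str]:
    """List subfamily names GH5_1..GH5_53, skipping the absent GH5_3 and GH5_6."""
    return [f"GH5_{n}" for n in range(1, highest + 1) if n not in MERGED]


if __name__ == "__main__":
    print(historical_to_gh5("A2"))
    print(historical_to_gh5("A3"))
    print(len(current_subfamilies()))
```

Skipping the two absent numerals in a series that runs to GH5_53 leaves 51 names, which agrees with the 51 distinct subfamilies reported.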

