Computational prediction of O-linked glycosylation sites that preferentially map on intrinsically disordered regions of extracellular proteins.

Nishikawa I, Nakajima Y, Ito M, Fukuchi S, Homma K, Nishikawa K - Int J Mol Sci (2010)

Bottom Line: O-glycosylated sites were often found clustered along the sequence, whereas other sites were located sporadically. The O-glycosylation sites were preferentially located within intrinsically disordered regions of extracellular proteins: particularly, more than 90% of the clustered O-GalNAc glycosylation sites were observed in intrinsically disordered regions. This feature could be the key for understanding the non-conservation property of O-glycosylation, and its role in functional diversity and structural stability.


Affiliation: College of Information Science and Engineering, Ritsumeikan University/Noji-higashi 1-1-1, Kusatsu, Shiga 525-8577, Japan; E-Mail: nakajima.yukiko@gmail.com.

ABSTRACT
O-glycosylation is an important posttranslational modification of mammalian proteins. To elucidate the O-glycosylation mechanism, we applied a support vector machine (SVM) to predict whether a given Ser or Thr is glycosylated. Some O-glycosylated sites were found clustered along the sequence, whereas others were located sporadically. Therefore, we developed two types of SVMs to predict clustered and isolated sites separately. We found that the amino acid composition was effective for predicting the clustered type, whereas the site-specific algorithm was effective for the isolated type. The highest prediction accuracy for the clustered type was 74%, while that for the isolated type was 79%. The occurrence frequencies of amino acids around the O-glycosylation sites differed between the two types: Pro, Val and Ala had high occurrence probabilities at specific positions relative to a glycosylation site, especially for the isolated type. Independent component analysis of the amino acid sequences around O-glycosylation sites showed the position-specific occurrences of the identified amino acids as independent components. The O-glycosylation sites were preferentially located within intrinsically disordered regions of extracellular proteins: in particular, more than 90% of the clustered O-GalNAc glycosylation sites were observed in intrinsically disordered regions. This feature could be key to understanding the non-conservation of O-glycosylation and its role in functional diversity and structural stability.
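The abstract contrasts two kinds of input for the SVMs: amino acid composition and a site-specific (position-aware) description of the residues around a Ser or Thr. The sketch below shows one plausible way to build both feature vectors from a sequence window. It is an illustration only: the window half-width, the '-' padding, the one-hot encoding, and all function names are assumptions, not details taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def candidate_sites(seq):
    """1-based positions of every Ser or Thr (candidate O-glycosylation sites)."""
    return [i + 1 for i, c in enumerate(seq) if c in "ST"]


def window(seq, pos, half_width):
    """Residues around 1-based position `pos`, padded with '-' past the termini."""
    i = pos - 1
    left = seq[max(0, i - half_width):i]
    right = seq[i + 1:i + 1 + half_width]
    return left.rjust(half_width, "-") + seq[i] + right.ljust(half_width, "-")


def composition_features(win):
    """20 values: fraction of each amino acid in the window (order is discarded)."""
    counts = Counter(c for c in win if c in AMINO_ACIDS)
    total = sum(counts.values()) or 1  # avoid division by zero on all-padding input
    return [counts[a] / total for a in AMINO_ACIDS]


def position_specific_features(win):
    """One-hot encoding: 20 values per window position (order is preserved)."""
    feats = []
    for c in win:
        feats.extend(1.0 if c == a else 0.0 for a in AMINO_ACIDS)
    return feats


if __name__ == "__main__":
    toy_seq = "APTSPVA"  # hypothetical sequence, for illustration only
    for pos in candidate_sites(toy_seq):
        win = window(toy_seq, pos, half_width=2)  # half-width 2 is an assumption
        print(pos, win, len(composition_features(win)),
              len(position_specific_features(win)))
```

Either vector could then be passed to any SVM implementation. The composition vector ignores where each residue sits in the window, while the one-hot vector keeps that information, which matches the distinction the abstract draws between the clustered and isolated predictors.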

Figure 5: Glycosylation sites plotted along with the distinction between structural domains and ID regions of human glycoproteins. The light blue and red regions correspond to structural domains and ID regions, respectively, and the blue and orange dots indicate mucin-type O-linked (GalNAc) and N-linked sites, respectively. (a) FA12_HUMAN: coagulation factor XII with O-linked (GalNAc) modifications at T299, T305, S308, T328, T329 and T337, and N-linked (GlcNAc) modifications at N249 and N433. (b) GLPA_HUMAN: glycophorin-A with O-linked sites at S21, T22, T23, T29, S30, T31, S32, T36, S38, S41, T44, T52, T56, S63, S66 and T69, and an N-linked site at N45. (c) IC1_HUMAN: plasma protease C1 inhibitor with O-linked sites at T48, S64, T71, T83, T88, T92 and T96, and N-linked sites at N25, N69, N81, N238, N253, N272 and N352. (d) IGHA1_HUMAN: Ig α-1 chain C region with O-linked sites at S105, S111, S113, S119 and S121, and N-linked sites at N144 and N340.


Mentions: Figure 5 shows examples of mucin-type O-glycoproteins. Coagulation factor XII (UniProt ID: FA12_HUMAN), a secreted protein [42], is modified by mucin-type O-glycosylation at six sites. In addition, glycophorin-A (UniProt ID: GLPA_HUMAN), a cell membrane protein [43,44], plasma protease C1 inhibitor (UniProt ID: IC1_HUMAN), a secreted protein [45], and the Ig α-1 chain C region (UniProt ID: IGHA1_HUMAN), an immunoglobulin, are O-glycosylated at 16, seven, and five sites, respectively. The results for 62 human proteins are shown in Supplemental Figure S3.
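The site counts quoted above can be checked against the positions listed in the Figure 5 caption, and the same positions give a feel for the clustered versus isolated distinction. The sketch below tallies the O-linked sites per protein and groups neighbouring sites by sequence distance. The 10-residue gap threshold and the function names are assumptions made for illustration; this excerpt does not state how the authors defined a cluster.

```python
# O-linked (GalNAc) sites transcribed from the Figure 5 caption.
O_LINKED_SITES = {
    "FA12_HUMAN": ["T299", "T305", "S308", "T328", "T329", "T337"],
    "GLPA_HUMAN": ["S21", "T22", "T23", "T29", "S30", "T31", "S32", "T36",
                   "S38", "S41", "T44", "T52", "T56", "S63", "S66", "T69"],
    "IC1_HUMAN": ["T48", "S64", "T71", "T83", "T88", "T92", "T96"],
    "IGHA1_HUMAN": ["S105", "S111", "S113", "S119", "S121"],
}

MAX_GAP = 10  # ASSUMPTION: illustrative threshold, not the paper's definition


def positions(sites):
    """Convert labels such as 'T299' to sorted integer positions."""
    return sorted(int(label[1:]) for label in sites)


def group_sites(pos, max_gap=MAX_GAP):
    """Split sorted positions into runs whose neighbours are <= max_gap apart."""
    groups = []
    for p in pos:
        if groups and p - groups[-1][-1] <= max_gap:
            groups[-1].append(p)
        else:
            groups.append([p])
    return groups


def summarize(sites, max_gap=MAX_GAP):
    """Count the sites and separate multi-site groups from lone sites."""
    groups = group_sites(positions(sites), max_gap)
    return {
        "total": len(sites),
        "clustered_groups": [g for g in groups if len(g) > 1],
        "isolated": [g[0] for g in groups if len(g) == 1],
    }


if __name__ == "__main__":
    for protein, sites in O_LINKED_SITES.items():
        print(protein, summarize(sites))
```

The totals come out as 6, 16, 7 and 5, matching the text. Which sites count as isolated depends entirely on the assumed threshold, so the grouping should be read as a demonstration of the idea, not as a reproduction of the paper's classification.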

