Collagen cross-linking: insights on the evolution of metazoan extracellular matrix

View Article: PubMed Central - PubMed

ABSTRACT

Collagens constitute a large family of extracellular matrix (ECM) proteins that play a fundamental role in supporting the structure of various tissues in multicellular animals. The mechanical strength of fibrillar collagens is highly dependent on the formation of covalent cross-links between individual fibrils, a process initiated by the enzymatic action of members of the lysyl oxidase (LOX) family. Fibrillar collagens are present in a wide variety of animals and are therefore often associated with metazoan evolution, where the emergence of an ancestral collagen chain has been proposed to have led to the formation of different clades. While LOX-generated collagen cross-linking metabolites have been detected in different metazoan families, there is limited information about when and how collagen acquired this particular modification. By analyzing telopeptide and helical sequences, we identified highly conserved potential cross-linking sites throughout the metazoan tree of life. Based on this analysis, we propose that these sites have contributed substantially to the formation and further expansion of fibrillar collagens.



Figure 1. Multiple sequence alignment of potential C-telopeptide cross-linking sites among several metazoan collagens. (A) A group of 19 clade A (or A-like) collagens representing different metazoan lineages was aligned using the ClustalW algorithm, and potential cross-linking sites within the C-telopeptide sequences were manually checked in the alignment [47]. The left panel shows the location of the aligned segments within the analyzed chains, with the fibrillar collagen C-terminal domain (COLFI, pfam01410) marked in red. The right panel shows the alignment, with the degree of identity displayed on a gray scale. The positions of the GPP triplets (the known motif marking the C-terminal end of the helical region) and of the potential cross-linking sites are indicated above the alignment. (B) Sequence logo of the position weight matrix of the putative cross-linking sites analyzed in the C-telopeptides of the collagen chains, obtained using WebLogo [48].
© Copyright Policy - open-access

To determine to what extent potential sites for LOX-mediated cross-linking are present and conserved among metazoan lineages, we inspected the sequences of fibrillar collagens involved in the cross-linking reaction, namely the N- and C-telopeptides and the corresponding C- and N-helical segments. C-propeptide sequences are the most conserved region of the fibrillar collagens, and the pattern of conservation of cross-linking sites can be analyzed by looking at the sequences downstream of the (GPP)n repeats, the motif marking the C-terminus of the triple helix [23]. We searched several metazoan clade A collagens for sequences similar to the human α1(I) and α1(II) C-telopeptide cross-linking sites, QEKAH and REKGP respectively (K being the lysine providing the ε-amino group), as homology within this clade had been previously reported [8]. Multiple sequence alignment shows the presence of homologous sequences among the groups analyzed, with the pattern XKX′X″, where X is any residue, K is the lysine involved in the reaction, X′ is mostly glycine or otherwise alanine (both small, non-polar amino acids), and X″ is proline in most cases, as shown in the weighted logo (Fig. 1A and B). N-propeptide sequences are the most variable within collagen families and include, in addition to the N-telopeptide, a cysteine-rich repeat, the von Willebrand factor-type C (VWC) module, a thrombospondin N-terminal-like domain (TSPN), or, as in some invertebrate α chains, whey acidic protein (WAP) or von Willebrand factor A (VWA) modules, among others [24]. In most cases, the presence of a short triple helix marks the beginning of the N-telopeptide. Sequence comparison shows a certain degree of homology around the N-telopeptide cross-linking site as well (Fig. 2A and B). As for the C-telopeptide sequences, a significant number of species display the pattern XKX′X″ with little variation in X′ and X″, for example human, abalone and hydra.
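The XKX′X″ pattern described above can be illustrated with a short scan over a peptide sequence. The following is only a minimal sketch, not the authors' procedure (the sites in the paper were checked manually in a ClustalW alignment). The names `STRICT`, `RELAXED` and `find_sites` are hypothetical, and the two regular expressions are one assumed reading of the pattern: the strict form requires proline at X″, while the relaxed form leaves X″ unconstrained because proline is only the most common residue there.

```python
import re

# Assumed encodings of the XKX'X'' pattern (illustrative, not from the paper):
# STRICT:  any residue, lysine, Gly or Ala, then Pro.
# RELAXED: any residue, lysine, Gly or Ala, then any residue.
# Lookahead is used so that overlapping candidate sites are all reported.
STRICT = re.compile(r"(?=(.K[GA]P))")
RELAXED = re.compile(r"(?=(.K[GA].))")


def find_sites(seq, pattern):
    """Return (0-based index of the lysine, matched 4-residue window) pairs."""
    return [(m.start(1) + 1, m.group(1)) for m in pattern.finditer(seq)]


# The two human C-telopeptide sites quoted in the text.
alpha1_II = "REKGP"
alpha1_I = "QEKAH"

strict_II = find_sites(alpha1_II, STRICT)
strict_I = find_sites(alpha1_I, STRICT)
relaxed_I = find_sites(alpha1_I, RELAXED)
```

Under these assumptions the α1(II) site REKGP satisfies the strict form, whereas the α1(I) site QEKAH is found only by the relaxed form, since its X″ position is histidine rather than proline.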
In addition to local sequence homology around the cross-linking telopeptide lysines, none of the collagens illustrated has any lysine residue between the C-terminal end of the short N-terminal helix, or of the main helix, and the cross-linking telopeptide lysine. Given a lysine occurrence of 7.2% in proteins, the probability of this occurring by chance is 1.2 × 10^-9 (0.928^275) across all 19 C-terminal telopeptide sequences and 1.9 × 10^-6 (0.928^176) for the 14 sequences between the (subsequently removed) N-terminal helix and the cross-link; the absence of lysine is therefore a conserved feature. The regions between the C-terminal cross-link lysine and the end of the molecule, and between the N-terminal cross-link lysine and the main helix, are also very lysine-poor, with no other lysine within five residues of the cross-linking one and the majority of sequences having no other lysine at all.
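The two probabilities quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch under two assumptions that the text implies but does not spell out: that the exponents 275 and 176 are the total numbers of residue positions in the two sets of sequences, and that each position is treated as an independent draw with a 7.2% chance of being lysine. The function name `prob_no_lysine` is hypothetical.

```python
def prob_no_lysine(n_residues, lysine_freq=0.072):
    """Chance that none of n_residues independent positions is a lysine."""
    return (1.0 - lysine_freq) ** n_residues


# Assumed reading of the exponents in the text: total residue counts.
p_c_telopeptides = prob_no_lysine(275)  # 19 C-terminal telopeptide sequences
p_n_telopeptides = prob_no_lysine(176)  # 14 N-terminal sequences

print(f"C-terminal: {p_c_telopeptides:.1e}")
print(f"N-terminal: {p_n_telopeptides:.1e}")
```

The independence assumption is a simplification, since neighbouring residues in real proteins are not independent and lysine frequency varies between protein families, so the values are best read as order-of-magnitude estimates.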

