Collagen cross-linking: insights on the evolution of metazoan extracellular matrix


ABSTRACT

Collagens constitute a large family of extracellular matrix (ECM) proteins that play a fundamental role in supporting the structure of various tissues in multicellular animals. The mechanical strength of fibrillar collagens is highly dependent on the formation of covalent cross-links between individual fibrils, a process initiated by the enzymatic action of members of the lysyl oxidase (LOX) family. Fibrillar collagens are present in a wide variety of animals and are therefore often associated with metazoan evolution, in which the emergence of an ancestral collagen chain has been proposed to have led to the formation of different clades. While LOX-generated collagen cross-linking metabolites have been detected in different metazoan families, there is limited information about when and how collagen acquired this particular modification. By analyzing telopeptide and helical sequences, we identified highly conserved potential cross-linking sites throughout the metazoan tree of life. Based on this analysis, we propose that these sites have contributed substantially to the formation and further expansion of fibrillar collagens.



Sequence alignment of N-telopeptide cross-linking sites. (A) Metazoan collagen chains used in Fig. 3 were aligned at the N-telopeptide sequences to search for potentially conserved cross-linking sites. The left panel shows the location of the aligned segments, with the N-propeptides marked in red. The right panel shows the alignment, which is delimited by the position of the short triple helix and the beginning of the central helical segment. Sequences omitted from the C-propeptide analysis were either incomplete (lancelet, polychaete, lugworm, hemichordate) or lacked significant homology (abalone α-chain b). (B) Sequence logo of the position weight matrix of the putative cross-linking sites analyzed in the N-telopeptides of the collagen chains.

To determine to what extent potential sites for LOX-mediated cross-linking are present and conserved among metazoan lineages, we inspected the sequences of fibrillar collagens involved in the cross-linking reaction, namely the N- and C-telopeptides and the corresponding C- and N-helical segments. C-propeptide sequences are the most conserved region of the fibrillar collagens, and the pattern of conservation of cross-linking sites can be analyzed by looking at the sequences downstream of the (GPP)n repeats, the motif marking the C-terminus of the triple helix [23]. We searched several metazoan clade A collagens for sequences similar to the human α1(I) and α1(II) C-telopeptide cross-linking sites, QEKAH and REKGP respectively (K being the lysine providing the ε-amino group), as homology within this clade had been previously reported [8]. Multiple sequence alignment shows the presence of homologous sequences among the groups analyzed, with the pattern XKX′X″, where X is any residue, K is the lysine involved in the reaction, X′ is mostly glycine or otherwise alanine (both small, non-polar amino acids), and X″ is proline in most cases, as shown in the weighted logo (Fig. 1A and B). N-propeptide sequences are the most variable within collagen families, including, in addition to the N-telopeptide, a cysteine-rich repeat, the von Willebrand factor-type C (VWC) module, a thrombospondin N-terminal-like domain (TSPN), or, as in some invertebrate α chains, whey acidic protein (WAP) or von Willebrand factor A (VWA) modules, among others [24]. In most cases, the presence of a short triple helix marks the beginning of the N-telopeptide. Sequence comparison shows a certain degree of homology around the N-telopeptide cross-linking site as well (Fig. 2A and B). As for the C-telopeptide sequences, a significant number of species display the pattern XKX′X″, with little variation in X′ and X″, such as in human, abalone and hydra.
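The XKX′X″ pattern described above can be expressed as a simple motif scan. The following is a minimal sketch, not the authors' alignment-based method: it assumes X′ is restricted to glycine or alanine and leaves X″ unconstrained, since proline is reported only as the most common residue at that position. The function names and the regular expression are illustrative assumptions.

```python
import re

# Hypothetical motif for X-K-X'-X'': any residue, lysine, then G or A, then any residue.
# The lookahead wrapper lets overlapping windows be reported as well.
MOTIF = re.compile(r"(?=(.K[GA].))")

def find_candidate_sites(seq):
    """Return (0-based index of the lysine, 4-residue window) for each motif match."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq.upper())]

def has_proline_at_x2(window):
    """True if X'' (the fourth residue of the window) is proline."""
    return window[3] == "P"

# The two human C-telopeptide sites quoted in the text.
for site in ("QEKAH", "REKGP"):
    for lys_index, window in find_candidate_sites(site):
        print(site, lys_index, window, has_proline_at_x2(window))
```

A plain pattern scan like this only flags candidate positions; deciding whether a site is homologous still requires the multiple sequence alignment and position weight matrix used in the figures.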
In addition to local sequence homology around the cross-linking telopeptide lysines, none of the collagens illustrated has any lysine residue between the C-terminal end of the short N-terminal helix or of the main helix and the cross-linking telopeptide lysine. Given a lysine occurrence of 7.2% in proteins, this has a probability of 1.2 × 10⁻⁹ (0.928²⁷⁵) of occurring randomly across all 19 C-terminal telopeptide sequences, and a probability of 1.9 × 10⁻⁶ (0.928¹⁷⁶) for the 14 sequences between the (later removed) N-terminal helix and the cross-link, and is therefore a conserved feature. The regions between the C-terminal cross-link lysine and the end of the molecule, and between the N-terminal cross-link lysine and the main helix, are also very lysine-poor, with no other lysine within five residues of the cross-linking one and the majority of sequences having no other lysine at all.
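The two probabilities quoted above can be reproduced with a short calculation. This sketch assumes that the exponents 275 and 176 are the total numbers of residue positions pooled across the 19 and 14 sequences respectively, and that each position is independently a lysine with probability 0.072; the text gives the exponents but does not spell out either assumption. The function name is illustrative.

```python
def prob_no_lysine(n_positions, lysine_freq=0.072):
    """Probability that none of n independent positions is a lysine."""
    return (1.0 - lysine_freq) ** n_positions

# Exponents as given in the text (assumed to be pooled residue counts).
p_c_telopeptides = prob_no_lysine(275)  # 19 C-terminal telopeptide sequences
p_n_telopeptides = prob_no_lysine(176)  # 14 N-terminal sequences

print(f"C-terminal: {p_c_telopeptides:.1e}")
print(f"N-terminal: {p_n_telopeptides:.1e}")
```

Because the model treats positions as independent and uses a proteome-wide lysine frequency, it gives only an order-of-magnitude indication of how unlikely the lysine-free stretches would be by chance.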

