A combinatorial approach to detect coevolved amino acid networks in protein families of variable divergence.

Baussand J, Carbone A - PLoS Comput. Biol. (2009)

Bottom Line: We propose a combinatorial method for mapping conserved networks of amino acid interactions in a protein which is based on the analysis of a set of aligned sequences, the associated distance tree and the combinatorics of its subtrees. The method drops the constraints on high sequence divergence limiting the range of applicability of the statistical approaches previously proposed. We apply the method to four protein families where we show an accurate detection of functional networks and the possibility to treat sets of protein sequences of variable divergence.


Affiliation: Génomique Analytique, Université Pierre et Marie Curie, Paris, France.

ABSTRACT
Communication between distant sites often defines the biological role of a protein: amino acid long-range interactions are as important in binding specificity, allosteric regulation and conformational change as residues directly contacting the substrate. The maintaining of functional and structural coupling of long-range interacting residues requires coevolution of these residues. Networks of interaction between coevolved residues can be reconstructed, and from the networks, one can possibly derive insights into functional mechanisms for the protein family. We propose a combinatorial method for mapping conserved networks of amino acid interactions in a protein which is based on the analysis of a set of aligned sequences, the associated distance tree and the combinatorics of its subtrees. The degree of coevolution of all pairs of coevolved residues is identified numerically, and networks are reconstructed with a dedicated clustering algorithm. The method drops the constraints on high sequence divergence limiting the range of applicability of the statistical approaches previously proposed. We apply the method to four protein families where we show an accurate detection of functional networks and the possibility to treat sets of protein sequences of variable divergence.

Figure 5 (pcbi-1000488-g005): Inner trees. A tree T (left) and three “inner” trees, labeled (1), (2) and (3), each specific to a pair of residues at positions i and j respectively; inner tree (1) is specific to residues A and E. Only the residues at positions i and j of the aligned sequences are taken into consideration. Branches of T labeled with A at position i and with E at position j are colored in blue and green respectively and determine inner tree (1). Inner trees (2) and (3) are determined in a similar way. In the three inner trees, blue circles identify roots of MSTs associated with residue A at position i, and green circles identify roots of MSTs associated with residues E, D and F at position j.

Mentions: Consider a residue at seed position i. For each pair of residues, one at seed position i and one at seed position j, we consider the “inner” tree of T in which only the leaves of T labelled by the first residue at position i or by the second residue at position j are retained (see examples in Figure 5). The inner tree is used to evaluate the overlap of the MSTs associated with the two residues. We also consider the set of all MSTs associated with a residue at seed position i.
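
The inner-tree construction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nested-tuple tree encoding, the `alignment` dictionary, the function name `inner_tree`, the 0-based column indices, and the choice to collapse internal nodes left with a single child are all hypothetical and introduced only for this example. The toy tree and sequences are invented data.

```python
from typing import Dict, Optional, Tuple, Union

# Assumed encoding: a tree is a leaf (sequence name) or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def inner_tree(tree: Tree, alignment: Dict[str, str],
               i: int, res_i: str, j: int, res_j: str) -> Optional[Tree]:
    """Keep only leaves carrying res_i at column i or res_j at column j.

    Returns None if no leaf qualifies. Internal nodes left with a single
    child are collapsed (an assumption of this sketch).
    """
    if isinstance(tree, str):  # leaf: test the two columns of its sequence
        seq = alignment[tree]
        return tree if seq[i] == res_i or seq[j] == res_j else None

    # Internal node: prune each child, then drop the children that vanished.
    pruned = (inner_tree(c, alignment, i, res_i, j, res_j) for c in tree)
    kept = [t for t in pruned if t is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


# Toy alignment reduced to two columns: position i (index 0), position j (index 1).
alignment = {
    "s1": "AE", "s2": "AE", "s3": "AD",
    "s4": "GE", "s5": "GF", "s6": "AF",
}
tree = ((("s1", "s2"), "s3"), (("s4", "s5"), "s6"))

# Inner tree for residue A at position i and residue E at position j.
tree_AE = inner_tree(tree, alignment, 0, "A", 1, "E")
print(tree_AE)
```

In this toy case only `s5` carries neither A at position i nor E at position j, so it is the only leaf dropped, and its sibling `s4` is attached directly to the parent node. Repeating the call with other residue pairs gives one inner tree per pair, which is the object the text uses to compare the MSTs of the two residues.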

