Identifying cognate binding pairs among a large set of paralogs: the case of PE/PPE proteins of Mycobacterium tuberculosis.

Riley R, Pellegrini M, Eisenberg D - PLoS Comput. Biol. (2008)

Bottom Line: Thirty-five of these predicted complexes were also found to have correlated mRNA expression, providing additional evidence for these interactions. We show that our method is applicable to other protein families, by analyzing interactions of the Esx family of proteins. Our resulting set of predictions is a starting point for genomewide experimental interaction screens of the PE and PPE families, and our method may be generally useful for detecting interactions of proteins within families having many paralogs.


Affiliation: Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America.

ABSTRACT
We consider the problem of how to detect cognate pairs of proteins that bind when each belongs to a large family of paralogs. To illustrate the problem, we have undertaken a genomewide analysis of interactions of members of the PE and PPE protein families of Mycobacterium tuberculosis. Our computational method uses structural information, operon organization, and protein coevolution to infer the interaction of PE and PPE proteins. Some 289 PE/PPE complexes were predicted out of a possible 5,590 PE/PPE pairs genomewide. Thirty-five of these predicted complexes were also found to have correlated mRNA expression, providing additional evidence for these interactions. We show that our method is applicable to other protein families, by analyzing interactions of the Esx family of proteins. Our resulting set of predictions is a starting point for genomewide experimental interaction screens of the PE and PPE families, and our method may be generally useful for detecting interactions of proteins within families having many paralogs.


Figure 1: Overview of method for prediction of PE/PPE complexes. (A) PE/PPE operon pairs are identified. (B) Protein sequences of PE/PPE operon pairs are aligned to the known PE/PPE structure [13]. (C) Phylogenetic distance matrices for operon pairs are generated from the multiple alignments. (D) Coevolution of all genomewide PE/PPE pairs is evaluated by comparing distance vectors of length 14, consisting of the sequence distances between each protein and its 14 homologs in the PE or PPE reference matrix. (E) Coevolution correlations are further processed to generate predicted PE/PPE complexes.
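
Step (D) of the caption can be illustrated with a minimal sketch. It assumes that the distance vectors are compared by Pearson correlation, which the caption does not state, and that position i in a PE vector and position i in a PPE vector refer to the same reference operon pair. The protein names and distances below are hypothetical toy values (length 4 rather than 14), not data from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant vector has no defined correlation")
    return cov / sqrt(var_x * var_y)


def coevolution_scores(pe_vectors, ppe_vectors):
    """Correlate every PE distance vector with every PPE distance vector.

    Each argument maps a protein name to its list of sequence distances
    to the reference homologs (assumed to be in the same order for both).
    """
    return {
        (pe, ppe): pearson(u, v)
        for pe, u in pe_vectors.items()
        for ppe, v in ppe_vectors.items()
    }


# Hypothetical toy input: two PE and two PPE proteins, four reference homologs.
pe_vectors = {"PE_a": [0.1, 0.4, 0.7, 0.9], "PE_b": [0.8, 0.2, 0.5, 0.3]}
ppe_vectors = {"PPE_x": [0.2, 0.5, 0.8, 1.0], "PPE_y": [0.6, 0.6, 0.1, 0.4]}

scores = coevolution_scores(pe_vectors, ppe_vectors)
for pair, r in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(r, 3))
```

A high correlation for a pair would suggest that the two proteins have diverged from their reference homologs in a similar pattern; how such correlations are thresholded or further processed in step (E) is not specified in the caption and is left out here.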

Mentions: We assumed that PE/PPE gene pairs adjacent on the genome, and in the same orientation, are in expression operons, as has been shown for Rv2431c/Rv2430c [13]. The components of protein complexes and metabolic pathways in prokaryotes are often located together on the genome in operons [19]. These operons are transcribed as a single, polycistronic mRNA. Genes located on an operon usually function together, and often form protein complexes. We predict that thirteen other PE/PPE gene pairs lie in operons (Figure 1A), based on their short intergenic distance (<100 bp) and shared transcription direction. These pairs have a high degree of coexpression (average mRNA correlation of 0.59 for operon-paired PE/PPE gene pairs versus 0.05 for genomewide PE/PPE gene pairs; see Materials and methods), suggesting that these PE/PPE pairs are indeed in operons.
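
The operon criteria described above (adjacent genes, same orientation, intergenic distance under 100 bp) can be sketched as a simple filter. This is an illustration under stated assumptions rather than the authors' implementation: the `Gene` record, the function name, the 1-based inclusive coordinates, and the toy genes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based, inclusive (assumed convention)
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"
    family: str  # "PE", "PPE", or "other"


def candidate_operon_pairs(genes, max_gap=100):
    """Return adjacent PE/PPE gene pairs on the same strand with a short gap.

    Two genes qualify when they are neighbours in genome order, one is PE
    and the other PPE, they share a strand, and fewer than `max_gap` bases
    separate them (overlapping genes give a negative gap and also qualify).
    """
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right.start - left.end - 1
        is_pe_ppe = {left.family, right.family} == {"PE", "PPE"}
        if is_pe_ppe and left.strand == right.strand and gap < max_gap:
            pairs.append((left.name, right.name))
    return pairs


# Hypothetical toy genome, not real M. tuberculosis coordinates.
genes = [
    Gene("pe1", 100, 400, "-", "PE"),
    Gene("ppe1", 450, 1600, "-", "PPE"),    # 49 bp gap, same strand: kept
    Gene("pe2", 5000, 5300, "+", "PE"),
    Gene("ppe2", 5350, 6500, "-", "PPE"),   # opposite strands: rejected
    Gene("pe3", 9000, 9300, "+", "PE"),
    Gene("ppe3", 9800, 11000, "+", "PPE"),  # 499 bp gap: rejected
]

print(candidate_operon_pairs(genes))
```

Pairs passing this filter would then be candidates for the coexpression check mentioned in the text, where operon-paired genes showed a much higher average mRNA correlation than genomewide PE/PPE pairs.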

