Coevolutionary analyses require phylogenetically deep alignments and better models to accurately detect inter-protein contacts within and between species.

Avila-Herrera A, Pollard KS - BMC Bioinformatics (2015)

Bottom Line: When biomolecules physically interact, natural selection operates on them jointly. Two commonly used distributions are anti-conservative and have high false positive rates in some scenarios, although the empirical distribution of scores performs reasonably well with deep alignments. We conclude that coevolutionary analysis of cross-species protein interactions holds great promise but requires sequencing many more species pairs.


Affiliation: Bioinformatics Graduate Program, University of California, San Francisco, USA. aram.avilaherrera@ucsf.edu.

ABSTRACT

Background: When biomolecules physically interact, natural selection operates on them jointly. Contacting positions in protein and RNA structures exhibit correlated patterns of sequence evolution due to constraints imposed by the interaction, and molecular arms races can develop between interacting proteins in pathogens and their hosts. To evaluate how well methods developed to detect coevolving residues within proteins can be adapted for cross-species, inter-protein analysis, we used statistical criteria to quantify the performance of these methods in detecting inter-protein residues within 8 angstroms of each other in the co-crystal structures of 33 bacterial protein interactions. We also evaluated their performance for detecting known residues at the interface of a host-virus protein complex with a partially solved structure.
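The 8-angstrom contact criterion above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each residue is represented by a single 3D coordinate (which atom to use is not stated in this abstract), treats "within" as less than or equal to the cutoff, and uses the hypothetical function name `inter_protein_contacts`.

```python
import math

def inter_protein_contacts(coords_a, coords_b, cutoff=8.0):
    """Return (i, j) index pairs of residues from protein A and protein B
    whose representative coordinates lie within `cutoff` angstroms.

    Assumption: one (x, y, z) tuple per residue; the choice of atom
    (e.g. C-alpha, C-beta, closest heavy atom) is left to the caller.
    """
    contacts = []
    for i, a in enumerate(coords_a):
        for j, b in enumerate(coords_b):
            # Euclidean distance between the two residue coordinates
            if math.dist(a, b) <= cutoff:
                contacts.append((i, j))
    return contacts

# Toy coordinates for illustration only (not taken from any structure).
protein_a = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
protein_b = [(3.0, 4.0, 0.0), (20.0, 0.0, 7.9)]
contacts = inter_protein_contacts(protein_a, protein_b)
```

Pairs returned this way would serve as the positive set against which coevolution scores are benchmarked.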

Results: Our quantitative benchmarking showed that all coevolutionary methods clearly benefit from alignments with many sequences. Methods that aim to detect direct correlations generally outperform other approaches. However, faster mutual-information-based methods are occasionally competitive in small alignments and with relaxed false positive rates. Two commonly used distributions are anti-conservative and have high false positive rates in some scenarios, although the empirical distribution of scores performs reasonably well with deep alignments.
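To make the mutual-information approach and the empirical score distribution concrete, here is a small sketch. It assumes plain (uncorrected) mutual information between two alignment columns and assumes the empirical P-value is the fraction of background scores at least as large as the observed one; the paper's exact variants and background definition are not given in this abstract, and the function names are hypothetical.

```python
import math
from collections import Counter

def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns,
    given as equal-length sequences of residue characters."""
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi

def empirical_p_value(score, background_scores):
    """Fraction of background scores greater than or equal to `score`.

    Assumption: the background is the set of scores for all column pairs.
    """
    at_least = sum(1 for s in background_scores if s >= score)
    return at_least / len(background_scores)

# Toy columns: perfectly covarying versus independent.
mi_covarying = mutual_information("AAGG", "CCTT")
mi_independent = mutual_information("AAGG", "CTCT")
p_emp = empirical_p_value(0.9, [0.1, 0.2, 0.9, 1.0])
```

With only four sequences the estimate is very noisy, which illustrates why deep alignments matter for these methods.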

Conclusions: We conclude that coevolutionary analysis of cross-species protein interactions holds great promise but requires sequencing many more species pairs.


Fig. 4: Power (TPR), precision (PPV), and false positive rate (FPR) for predicting antiviral protein A3G residues (not pairs) essential for interacting with its viral antagonist Vif at P_empirical < α thresholds that maximize PPV for each coevolution method. Residues defined as positive are taken from previous functional mutation studies in Table 3. See Abbreviations and Table 1 for abbreviations.
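The three quantities in this caption follow from a standard confusion matrix. The sketch below assumes a residue is predicted positive when its empirical P-value is below α; the function name, residue labels, and P-values are hypothetical toy inputs, not results from the paper.

```python
def classification_rates(p_values, positives, alpha):
    """Compute TPR (power), PPV (precision), and FPR for residue predictions.

    p_values:  dict mapping residue label -> empirical P-value
    positives: set of residue labels known to be functional
    alpha:     significance threshold; predicted positive if p < alpha
    Returns a dict; a rate is None when its denominator is zero.
    """
    residues = set(p_values)
    predicted = {r for r, p in p_values.items() if p < alpha}
    known = residues & positives

    tp = len(predicted & known)          # predicted and truly functional
    fp = len(predicted - known)          # predicted but not functional
    fn = len(known - predicted)          # functional but missed
    tn = len(residues - known - predicted)

    def ratio(num, den):
        return num / den if den else None

    return {
        "TPR": ratio(tp, tp + fn),
        "PPV": ratio(tp, tp + fp),
        "FPR": ratio(fp, fp + tn),
    }

# Toy example with made-up labels and P-values.
toy_p = {"r1": 0.50, "r2": 0.01, "r3": 0.20, "r4": 0.02, "r5": 0.03}
toy_positives = {"r2", "r3", "r4"}
rates = classification_rates(toy_p, toy_positives, alpha=0.05)
```

Choosing the α that maximizes PPV, as in the caption, would amount to evaluating this function over a grid of thresholds and keeping the one with the highest "PPV" value.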

Mentions: We observed similarly low performance on A3G (Fig. 4). Encouragingly, we note that positions 128-130 are correctly identified by multiple methods (Fig. 5). Residues at position 130 (e.g., D vs A) are highly likely to be adaptations that conferred species-specific resistance to Vif-induced degradation in Old World monkeys 5-6 MYA [54, 55]. Position 128, which also provides species-specific resistance, is thought to be more recent [54, 55, 62]. While these coevolution methods alone may not yet be accurate enough to identify functional residues, they potentially enhance other evolutionary analyses. For example, of the many Apobec sites under positive selection [55], it is reasonable that lentiviruses are more likely shaping the evolution of those sites that coevolve with Vif than sites that coevolve with other viral or virus-like agents.

