Detecting functional divergence after gene duplication through evolutionary changes in posttranslational regulatory sequences.

Nguyen Ba AN, Strome B, Hua JJ, Desmond J, Gagnon-Arsenault I, Weiss EL, Landry CR, Moses AM - PLoS Comput. Biol. (2014)

Bottom Line: We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication.


Affiliation: Department of Cell & Systems Biology, University of Toronto, Toronto, Canada; Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Canada.

ABSTRACT
Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, and Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1-regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication.

Figure 1 (pcbi-1003977-g001): Likelihood-ratio test on short linear motifs after gene duplication on simulated data. A) Schematic of the motif-specific likelihood-ratio test applied to all motifs. Rates of evolution are computed for each motif before (αpre-WGD) and after (αWGD) gene duplication and compared with the rates that were observed for the whole protein (see Methods). Red double arrow illustrates the duplication event. Bolded clades are clades with significant changes in constraints. Striped patterned boxes indicate short linear motifs with a significantly different rate of evolution. DKL indicates the expected deviation of the likelihood-ratio test from the whole protein. B) Alignment of the N-terminus of the Dbp1/Ded1 homologs illustrates the rate heterogeneity amongst columns and highlights the short length of a putative motif (black rectangle zoom). Blue shade represents the percentage identity. C) Alignment of the N-terminus of a simulated protein based on Dbp1/Ded1 using our ‘realistic’ simulation of evolution (see Methods). D) Histogram shows the p-value distribution obtained from a set of protein sequences that were evolved as in C). Grey shaded area indicates the expected proportion of tests. Circles indicate the distribution of p-values obtained from the likelihood-ratio test described in A) when the test statistic is assumed to be chi-squared distributed (black circles) or non-central chi-squared distributed (white circles, “corrected”).
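The contrast drawn in panel D, between p-values computed from a chi-squared reference distribution and from a non-central chi-squared one, can be illustrated with a minimal Python sketch. It assumes one degree of freedom (two rates versus one), and the noncentrality value used in the example call is purely hypothetical; how the deviation is actually estimated from the whole protein is described in the paper's Methods, not here. The function names are illustrative.

```python
import math


def chi2_sf_df1(stat):
    """P(X > stat) for a central chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def ncx2_sf_df1(stat, noncentrality):
    """P(X > stat) for a non-central chi-squared variable with 1 degree of freedom.

    Uses X = (Z + sqrt(noncentrality))**2 with Z standard normal, so the
    upper tail splits into two normal tail probabilities.
    """
    root = math.sqrt(stat)
    shift = math.sqrt(noncentrality)
    upper = 0.5 * math.erfc((root - shift) / math.sqrt(2.0))
    lower = 0.5 * math.erfc((root + shift) / math.sqrt(2.0))
    return upper + lower


# Hypothetical test statistic and noncentrality, for illustration only.
statistic = 6.0
print("uncorrected p:", chi2_sf_df1(statistic))
print("corrected p:  ", ncx2_sf_df1(statistic, noncentrality=1.0))
```

With a positive noncentrality the same statistic yields a larger p-value, which is the sense in which the corrected test is more conservative when the null distribution is shifted by rate heterogeneity.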

Mentions: We have previously shown that short linear motifs can be predicted based on their conservation relative to their surrounding regions [23]. We sought to detect regulatory divergence in proteins by looking for statistical signals of lineage-specific evolutionary rate changes in predicted short linear motifs in multiple sequence alignments. Likelihood-ratio tests have previously been used to detect differences in rate of evolution of full-length yeast proteins after the whole-genome duplication [16]. We sought to perform essentially the same test to identify short linear motifs whose rate of evolution changed significantly after gene duplication. To do so, we first predicted short linear motifs within proteins of species that have diverged prior to the yeast whole-genome duplication (see Methods) and mapped the location of the predicted short linear motifs to the genes post-duplication (Fig. 1A). Using a likelihood-ratio test [38], we tested whether two rates of evolution (one for the post-duplication clade and one for the remainder of the phylogenetic tree) explain the data significantly better than one single rate of evolution common to the whole tree (see Methods). This test is performed once for genes that reverted to single-copy, and twice in retained duplicates (one for each post-WGD protein).
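The one-rate versus two-rate comparison described above can be sketched in Python under a deliberately simplified model. This is not the substitution model used in the paper: it assumes that substitution counts in a motif are Poisson distributed with mean equal to rate times total branch length, and that the statistic is referred to a chi-squared distribution with one degree of freedom. All function names and input values are hypothetical.

```python
import math


def poisson_loglik(count, rate, length):
    """Log-likelihood of `count` substitutions over total branch `length` at `rate`."""
    mean = rate * length
    if mean == 0.0:
        return 0.0 if count == 0 else float("-inf")
    return count * math.log(mean) - mean - math.lgamma(count + 1)


def rate_change_lrt(subs_clade, len_clade, subs_rest, len_rest):
    """Test whether the post-duplication clade evolves at its own rate.

    Returns (statistic, p_value, clade_rate, rest_rate).
    """
    # Null model: one rate shared by the whole tree (pooled maximum likelihood).
    shared = (subs_clade + subs_rest) / (len_clade + len_rest)
    ll_null = (poisson_loglik(subs_clade, shared, len_clade)
               + poisson_loglik(subs_rest, shared, len_rest))

    # Alternative model: one rate for the clade, one for the rest of the tree.
    clade_rate = subs_clade / len_clade
    rest_rate = subs_rest / len_rest
    ll_alt = (poisson_loglik(subs_clade, clade_rate, len_clade)
              + poisson_loglik(subs_rest, rest_rate, len_rest))

    statistic = max(0.0, 2.0 * (ll_alt - ll_null))
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value, clade_rate, rest_rate


# Hypothetical motif: 12 substitutions on 1.0 units of post-duplication
# branch length, 2 substitutions on 2.0 units elsewhere in the tree.
print(rate_change_lrt(12, 1.0, 2, 2.0))
```

For a retained duplicate pair, such a function would be called twice, once with each post-WGD copy's lineage as the clade, mirroring the "once for single-copy genes, twice for retained duplicates" scheme in the text.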

