Identification of RNA recognition elements in the Saccharomyces cerevisiae transcriptome.

Riordan DP, Herschlag D, Brown PO - Nucleic Acids Res. (2010)

Bottom Line: We computationally analyzed the sequences of Saccharomyces cerevisiae mRNAs bound in vivo by 29 specific RBPs, identifying eight novel candidate motifs and confirming or extending six earlier reported recognition elements. Biochemical selections for RNA sequences selectively recognized by 12 yeast RBPs yielded novel motifs bound by Pin4, Nsr1, Hrb1, Gbp2, Sgn1 and Mrn1, and recovered the known recognition elements for Puf3, She2, Vts1 and Whi3. Most of the RNA elements we uncovered were associated with coherent mRNA expression changes and were significantly conserved in related yeasts, supporting their functional importance and suggesting that the corresponding RNA-protein interactions are evolutionarily conserved.


Affiliation: Department of Biochemistry, Stanford University School of Medicine, Stanford, California, USA. driordan@stanford.edu

ABSTRACT
Post-transcriptional regulation of gene expression, including mRNA localization, translation and decay, is ubiquitous yet still largely unexplored. How is the post-transcriptional regulatory program of each mRNA encoded in its sequence? Hundreds of specific RNA-binding proteins (RBPs) appear to play roles in mediating the post-transcriptional regulatory program, akin to the roles of specific DNA-binding proteins in transcription. As a step toward decoding the regulatory programs encoded in each mRNA, we focused on specific mRNA-protein interactions. We computationally analyzed the sequences of Saccharomyces cerevisiae mRNAs bound in vivo by 29 specific RBPs, identifying eight novel candidate motifs and confirming or extending six earlier reported recognition elements. Biochemical selections for RNA sequences selectively recognized by 12 yeast RBPs yielded novel motifs bound by Pin4, Nsr1, Hrb1, Gbp2, Sgn1 and Mrn1, and recovered the known recognition elements for Puf3, She2, Vts1 and Whi3. Most of the RNA elements we uncovered were associated with coherent mRNA expression changes and were significantly conserved in related yeasts, supporting their functional importance and suggesting that the corresponding RNA-protein interactions are evolutionarily conserved.



Figure 1: Computationally identified sequence motifs enriched in mRNAs bound by specific RBPs. RNA motifs identified from analysis of mRNA target sequences are displayed in decreasing order of significance based on P-values for genome-wide enrichment. A pictogram (http://genes.mit.edu/pictogram.html) represents the regular expression patterns defined for FIRE motifs or the preferred base composition of the position-specific scoring matrices used for REFINE motifs. For each motif, the −log10 P-value of the significance of genome-wide enrichment for motif sites in targets is shown in a red color scale for separate regions of its mRNA targets (5′ = 200 bases upstream of start codon, CDS = protein coding sequence, 3′ = 200 bases downstream of stop codon). Arrows are shown for motifs with a forward strand bias, i.e. the reverse complement of the motif is not significantly enriched in targets (P > 0.01). All relevant P-values were calculated based on the hyper-geometric distribution. Asterisks denote motifs that correspond to previously reported binding sites for the associated RBP. Exact data values and supporting details are presented in Supplementary Data S1.
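The region definitions in the caption (200 bases upstream of the start codon, the CDS, 200 bases downstream of the stop codon) and the −log10 transformation can be sketched as follows. This is only an illustration: the function names, the input layout and the example regular expression are assumptions made here, not the FIRE or REFINE implementation.

```python
import math
import re

def split_regions(upstream, cds, downstream, flank=200):
    """Return the three regions scored per mRNA in the figure.

    `upstream` ends at the start codon and `downstream` begins after the
    stop codon; only the `flank` bases adjacent to the CDS are kept.
    """
    start = max(0, len(upstream) - flank)
    return {
        "5'": upstream[start:],
        "CDS": cds,
        "3'": downstream[:flank],
    }

def has_site(sequence, pattern):
    """True if the regular-expression motif occurs at least once."""
    return re.search(pattern, sequence) is not None

def neg_log10(p_value):
    """Significance on the scale used for the color bar."""
    return -math.log10(p_value)

# Hypothetical pattern and sequences, for illustration only.
pattern = r"UGUA.AUA"
regions = split_regions("A" * 300, "AUGGCCUAA", "CCUGUAAAUACC" + "G" * 300)
sites = {name: has_site(seq, pattern) for name, seq in regions.items()}
```

With these inputs only the 3′ region contains a match, and a P-value of 0.001 maps to a value of 3 on the color scale.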

Mentions: Although REFINE is not guaranteed to completely avoid the problem of predicting potentially non-specific motifs, we nevertheless found it to be a useful step toward addressing this issue, as supported by the subsequent analyses. We also applied another motif-finding program, FIRE, to the same dataset (9). The overall concordance between the results of FIRE and REFINE (Supplementary Data S1) provided additional confidence in the significance and robustness of our results. All non-palindromic RNA motifs exhibited a strand bias, in that the reverse-complement motifs were not significantly enriched in the corresponding RBP target mRNAs (hyper-geometric P > 0.01) (Figure 1). This strand-specific enrichment is expected for regulatory elements that function as RNA, but not necessarily for DNA sequence motifs. Fourteen distinct RNA motifs, six of which (Puf3-1, Puf4-1, Puf5-1, Pub1-1, Nab2-1 and Nrd1-1) matched previously known RBP binding sites (Figure 1), passed strict criteria in this integrated analysis. The putative recognition motifs that we found for three of the RBPs (Pab1-1, Khd1-1 and Vts1-1) differed from the reported specificities of these RBPs (13–15), suggesting that these motifs may be false positives, perhaps representing sequences recognized by other factors with similar sets of target genes. The remaining motifs are strong candidates for specific RNA elements bound by Puf2, Ssd1, Nsr1, YLL032C and Pin4, respectively. Several of the motifs predicted by REFINE (including Puf5-1, Puf2-1, Nsr1-1, YLL032C-1, Vts1-1, Pin4-1 and Nrd1-1) were not identified by standard MEME analysis of the same original input target sequences, suggesting that analyses using MEME alone may be unlikely to recover some of these elements (Supplementary Data S1).
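The strand-bias check described above can be sketched as a one-sided hypergeometric test applied to a motif and to its reverse complement. This is a minimal sketch under stated assumptions: the motif is treated as an exact string rather than a FIRE regular expression or REFINE scoring matrix, and the function names, the dictionary layout and the toy sequences are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_total, n_with_site, n_targets, n_targets_with_site):
    """Upper-tail hypergeometric P-value: the chance of drawing at least
    `n_targets_with_site` site-containing mRNAs among `n_targets` mRNAs
    sampled without replacement from `n_total`, of which `n_with_site`
    contain the site."""
    upper = min(n_with_site, n_targets)
    favorable = sum(
        comb(n_with_site, i) * comb(n_total - n_with_site, n_targets - i)
        for i in range(n_targets_with_site, upper + 1)
    )
    return favorable / comb(n_total, n_targets)

def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence written with A, C, G, U."""
    return seq.translate(str.maketrans("ACGU", "UGCA"))[::-1]

def strand_bias(mrnas, targets, motif, alpha=0.01):
    """Return (forward P, reverse-complement P, bias flag).

    `mrnas` maps mRNA name to sequence; `targets` is the set of names
    bound by the RBP. The flag is True when the reverse complement is
    not significantly enriched (P > alpha), the criterion quoted above.
    """
    def p_for(site):
        with_site = {name for name, seq in mrnas.items() if site in seq}
        return hypergeom_enrichment_p(
            len(mrnas), len(with_site), len(targets), len(with_site & targets)
        )
    p_forward = p_for(motif)
    p_reverse = p_for(reverse_complement_rna(motif))
    return p_forward, p_reverse, p_reverse > alpha

# Toy data (hypothetical): two targets carry the motif, four non-targets do not.
toy_mrnas = {
    "a": "GGUGUAAAUAGG", "b": "CCUGUAAAUACC", "c": "GGGGCCCC",
    "d": "ACACACAC", "e": "GGCCGGCC", "f": "CACACACA",
}
p_fwd, p_rev, biased = strand_bias(toy_mrnas, {"a", "b"}, "UGUAAAUA")
```

A toy set of six sequences is far too small to reach significance; the example only shows the forward and reverse-complement P-values diverging, which is the pattern the strand-bias criterion looks for.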