Discovering structural cis-regulatory elements by modeling the behaviors of mRNAs.

Foat BC, Stormo GD - Mol. Syst. Biol. (2009)

Bottom Line: In addition, we discovered six putative SCREs in flies and three in humans. We characterized the SCREs based on their condition-specific regulatory influences, the annotation of the transcripts that contain them, and their locations within transcripts. Overall, we show that modeling functional genomics data in terms of combined RNA structure and sequence motifs is an effective method for discovering the specificities and regulatory roles of RNA-binding proteins.


Affiliation: Department of Genetics, Center for Genome Sciences, Washington University School of Medicine, St Louis, MO 63108, USA.

ABSTRACT
Gene expression is regulated at each step from chromatin remodeling through translation and degradation. Several known RNA-binding regulatory proteins interact with specific RNA secondary structures in addition to specific nucleotides. To provide a more comprehensive understanding of the regulation of gene expression, we developed an integrative computational approach that leverages functional genomics data and nucleotide sequences to discover RNA secondary structure-defined cis-regulatory elements (SCREs). We applied our structural cis-regulatory element detector (StructRED) to microarray and mRNA sequence data from Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. We recovered the known specificities of Vts1p in yeast and Smaug in flies. In addition, we discovered six putative SCREs in flies and three in humans. We characterized the SCREs based on their condition-specific regulatory influences, the annotation of the transcripts that contain them, and their locations within transcripts. Overall, we show that modeling functional genomics data in terms of combined RNA structure and sequence motifs is an effective method for discovering the specificities and regulatory roles of RNA-binding proteins.

Figure 3: Vts1p and Smaug activities. Each square represents the strength of the correlation between genome-wide occurrences of a SCRE and genome-wide mRNA measurements for a particular microarray experiment. Yellow represents a positive correlation and blue represents a negative correlation. An absolute t-value of about 6.7 corresponds to a P-value of 0.01, when strictly correcting for the number of motifs tested. (A) The Vts1p specificities for the length four loop (Vts1–4) and length five loop (Vts1–5) were discovered using microarray data that measured mRNA association with Vts1p in a pull-down experiment in four trials (Aviv et al, 2006b). (B) The Smaug specificities for the length four (Smg-4) and length five (Smg-5) loops were discovered using mRNA expression microarray data collected over Drosophila melanogaster embryonic development. The first two time courses measured the first 6 h of development in Δsmg and wild-type (WT) activated eggs (Tadros et al, 2007). The third time course (Pilot et al, 2006) compares the slow phase (T1), fast phase (T2), cellularization and beginning of gastrulation (T3), and end of gastrulation (T4) in wild-type (WT) embryos to embryos before zygotic transcription begins. (C) Occurrences of the Smg-4 and Smg-5 specificities also had strong negative correlations (corrected P-value <0.001) with ribosome association in the first 2 h of development (Qin et al, 2007). Triangles represent increasing density of sucrose gradient fractions, corresponding to increasing numbers of ribosomes.
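
The t-values and corrected P-values in the caption can be illustrated with a small calculation. The following is a minimal sketch, not the StructRED implementation: it assumes a plain Pearson correlation between per-transcript motif counts and expression values, a normal approximation for the P-value (reasonable only when thousands of transcripts are measured), and a Bonferroni correction as one way of "strictly correcting" for the number of motifs tested. All function names are hypothetical.

```python
import math

def correlation_t(counts, expression):
    """Pearson r between motif counts and expression, plus its t-statistic.

    Assumes the two lists are the same length (n > 2) and not perfectly
    correlated (|r| < 1), otherwise the t-statistic is undefined.
    """
    n = len(counts)
    mean_x = sum(counts) / n
    mean_y = sum(expression) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(counts, expression))
    sxx = sum((x - mean_x) ** 2 for x in counts)
    syy = sum((y - mean_y) ** 2 for y in expression)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return r, t

def two_sided_p_normal(t):
    """Two-sided P-value under a standard normal approximation (large n)."""
    return math.erfc(abs(t) / math.sqrt(2.0))

def bonferroni(p, n_tests):
    """Bonferroni-corrected P-value, capped at 1 (assumed correction scheme)."""
    return min(1.0, p * n_tests)
```

Under these assumptions, a raw P-value is multiplied by the number of motifs tested, so a very large search space pushes the significance threshold to large absolute t-values such as the one quoted in the caption.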

Mentions: We applied the StructRED algorithm to search for any stem–loop SCREs in the wild-type versus Δvts1 (Oberstrass et al, 2006) and the Vts1p pull-down (Aviv et al, 2006b) microarray data in addition to approximately 6500 other microarray experiments retrieved from the NCBI GEO (Barrett et al, 2007). We confirmed the specificity of Vts1p (Figure 2) using the pull-down microarray data (Aviv et al, 2006b; Figure 3A). This Vts1p specificity is in good agreement with the Vts1p specificity shown in earlier work (Aviv et al, 2006a, 2006b; Edwards et al, 2006; Johnson and Donaldson, 2006; Oberstrass et al, 2006). Thus, StructRED successfully performs the task for which it was designed—to detect SCREs based on genome-wide measurements of the effects that their occurrences exert on mRNAs. Those mRNAs that we predict are most likely to contain Vts1p SCREs are enriched for functional categories involving carbohydrate metabolism and transmembrane transport (Supplementary Table 2). However, too little is known about the biological role of Vts1p to draw conclusions from these observations.
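
To make the idea of a stem–loop SCRE concrete, here is a minimal sketch of a naive hairpin scan. It is not the StructRED algorithm, which fits motifs to microarray data; it only enumerates candidate hairpins with a fixed loop length closed by consecutive base pairs, then filters loops against a consensus. The function names, the minimum stem length, the inclusion of G-U wobble pairs, and the example consensus `CNGG` are all assumptions made for illustration.

```python
# Watson-Crick pairs plus G-U wobble pairs (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def find_stem_loops(seq, loop_len, min_stem=3):
    """Return (loop_start, loop_seq, stem_len) for every window of loop_len
    nucleotides that is flanked by at least min_stem consecutive base pairs."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    hits = []
    for start in range(len(seq) - loop_len + 1):
        end = start + loop_len           # first position after the loop
        i, j = start - 1, end            # walk outward from the loop
        stem = 0
        while i >= 0 and j < len(seq) and (seq[i], seq[j]) in PAIRS:
            stem += 1
            i -= 1
            j += 1
        if stem >= min_stem:
            hits.append((start, seq[start:end], stem))
    return hits

def matches_consensus(loop, consensus):
    """True if loop matches consensus, where 'N' stands for any nucleotide."""
    return len(loop) == len(consensus) and all(
        c == "N" or c == b for b, c in zip(loop, consensus)
    )

def find_motif_hairpins(seq, consensus, min_stem=3):
    """Hairpins whose loop sequence matches the given consensus."""
    return [
        hit for hit in find_stem_loops(seq, len(consensus), min_stem)
        if matches_consensus(hit[1], consensus)
    ]
```

For example, `find_motif_hairpins("AAGCGCCUGGGCGCAA", "CNGG")` reports the four-nucleotide loop at position 6 closed by a four-base-pair stem. Counting such hits per transcript would give the occurrence values that are then correlated with mRNA measurements.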

