Discovering structural cis-regulatory elements by modeling the behaviors of mRNAs.

Foat BC, Stormo GD - Mol. Syst. Biol. (2009)

Bottom Line: In addition, we discovered six putative SCREs in flies and three in humans. We characterized the SCREs based on their condition-specific regulatory influences, the annotation of the transcripts that contain them, and their locations within transcripts. Overall, we show that modeling functional genomics data in terms of combined RNA structure and sequence motifs is an effective method for discovering the specificities and regulatory roles of RNA-binding proteins.


Affiliation: Department of Genetics, Center for Genome Sciences, Washington University School of Medicine, St Louis, MO 63108, USA.

ABSTRACT
Gene expression is regulated at each step from chromatin remodeling through translation and degradation. Several known RNA-binding regulatory proteins interact with specific RNA secondary structures in addition to specific nucleotides. To provide a more comprehensive understanding of the regulation of gene expression, we developed an integrative computational approach that leverages functional genomics data and nucleotide sequences to discover RNA secondary structure-defined cis-regulatory elements (SCREs). We applied our structural cis-regulatory element detector (StructRED) to microarray and mRNA sequence data from Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. We recovered the known specificities of Vts1p in yeast and Smaug in flies. In addition, we discovered six putative SCREs in flies and three in humans. We characterized the SCREs based on their condition-specific regulatory influences, the annotation of the transcripts that contain them, and their locations within transcripts. Overall, we show that modeling functional genomics data in terms of combined RNA structure and sequence motifs is an effective method for discovering the specificities and regulatory roles of RNA-binding proteins.

Figure 5: Explanatory structural cis-regulatory element content of mRNA regions. These trans-factor activity profiles (TFAPs) are for all of the Drosophila SCREs over all of the same conditions shown in Figures 3 and 4. However, these TFAPs display how well each SCRE explained the measured RNA levels when occurrences of the SCREs are only scored in the 5′ untranslated regions (UTRs), 3′ UTRs, coding sequences (CDS), or full-length mRNAs. Thus, by comparing each subsequence TFAP to the full-length mRNA TFAP, one can see in which region of mRNAs functional instances of the SCRE tend to exist. Most of the SCREs have their strongest signal in the CDSs, followed by the 3′ UTRs.
© Copyright Policy - open-access

The SCRE discovery in this work was always performed on approximated full-length mRNAs. However, to answer the question of where the discovered SCREs commonly occur in the mRNAs, we scored the occurrences of each SCRE in the 5′ UTRs, 3′ UTRs, and coding sequences separately and then checked which of these mRNA subsequences performed best at explaining the microarray data. If most of the functional SCREs are in the 3′ UTRs, as is commonly assumed, then the TFAPs for the 3′ UTRs alone should be strongly significant and appear similar to the TFAPs when the full-length mRNA sequences are used. For most of the Drosophila SCREs, especially the Smaug SCREs, the occurrences that appear in the coding sequences perform best at explaining the microarray measurements of gene expression and polysome association (Figure 5). Thus, most of the functional sites for Dm2, Dm3, Dm4, Dm6, and the Smaug SCREs reside in coding sequences. A recent characterization of how Smaug regulates the stability of the Hsp83 transcript showed that all eight predicted binding sites for Smaug do indeed reside in the coding sequence (Semotok et al, 2008). Dm1, Dm5, and Dm6 still have appreciable signal in the 3′ UTRs, and Dm5 has signal in the 5′ UTRs. We also calculated the length-normalized scores for the UTRs and the coding sequences for each SCRE. Dm3, Dm4, Dm5, Dm6, and the Smaug SCREs had the highest concentration of binding sites in the same regions that strongly predicted expression (Supplementary Figure 4). Only Dm1 and Dm2 were inconsistent: Dm1 had a higher density of sites in coding sequences while the scores in the 3′ UTRs were more predictive, and Dm2 had a higher density of sites in the 3′ UTRs while the scores in the coding sequences were more predictive. The frequent appearance of SCREs in coding sequences provides a strong argument for including whole transcripts when searching for cis-regulatory elements.
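The region-by-region comparison described above can be illustrated with a small sketch. This is not the StructRED implementation: it replaces the combined structure-and-sequence scoring with exact string matching, and it uses a plain Pearson correlation in place of the TFAP statistics. The motif string, the toy transcripts, the expression values, and every function name below are hypothetical and chosen only for illustration.

```python
from math import sqrt

REGIONS = ("utr5", "cds", "utr3")


def count_sites(seq, motif):
    """Count exact (possibly overlapping) matches of motif in seq."""
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def region_report(transcripts, expression, motif):
    """Score the motif per region and relate the scores to expression.

    For each region (and the concatenated full-length transcript) this returns
    the correlation between per-transcript site counts and expression, plus a
    length-normalized site density in sites per kilobase.
    """
    report = {}
    for region in REGIONS + ("full",):
        counts, total_len = [], 0
        for t in transcripts:
            seq = "".join(t[r] for r in REGIONS) if region == "full" else t[region]
            counts.append(count_sites(seq, motif))
            total_len += len(seq)
        report[region] = {
            "r": pearson(counts, expression),
            "sites_per_kb": 1000 * sum(counts) / total_len if total_len else 0.0,
        }
    return report


# Toy data (hypothetical): expression drops with the number of CDS sites.
TRANSCRIPTS = [
    {"utr5": "AAAA", "cds": "AUGCUGGCAAACUGGCUAA", "utr3": "UUUU"},
    {"utr5": "AAAA", "cds": "AUGCUGGCAAAUAA", "utr3": "UUCUGGCUU"},
    {"utr5": "AAAA", "cds": "AUGAAAUAA", "utr3": "UUCUGGCUU"},
    {"utr5": "AAAA", "cds": "AUGAAAUAA", "utr3": "UUUU"},
]
EXPRESSION = [-2.0, -1.0, 0.0, 0.0]

report = region_report(TRANSCRIPTS, EXPRESSION, "CUGGC")
best_region = max(REGIONS, key=lambda r: abs(report[r]["r"]))
for name, stats in report.items():
    print(name, round(stats["r"], 3), round(stats["sites_per_kb"], 1))
print("most explanatory region:", best_region)
```

In this toy data the coding-sequence counts track expression more closely than the 3′ UTR counts or the full-length counts, which mirrors the kind of comparison made between each subsequence TFAP and the full-length TFAP. Reporting the density next to the correlation also shows how the two measures can disagree, as the text notes for Dm1 and Dm2.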

