Identification of yeast transcriptional regulation networks using multivariate random forests.

Xiao Y, Segal MR - PLoS Comput. Biol. (2009)

Bottom Line: In addition, we present evidence of the existence of an alternative MCB-binding pathway, which we confirm using data from two independent cell cycle studies and two other physiological processes. Finally, we have uncovered elaborate transcription regulation refinement mechanisms involving PAC and mRRPE motifs that govern essential rRNA processing. These include intriguing instances of differing motif dosages and differing combinatorial motif control that promote regulatory specificity in rRNA metabolism under differing physiological processes.


Affiliation: Department of Epidemiology and Biostatistics, Center for Bioinformatics and Molecular Biostatistics, University of California, San Francisco, California, USA. Yuanyuan.Xiao@ucsf.edu

ABSTRACT
The recent availability of whole-genome scale data sets that investigate complementary and diverse aspects of transcriptional regulation has spawned an increased need for new and effective computational approaches to analyze and integrate these large-scale assays. Here, we propose a novel algorithm, based on random forest methodology, to relate gene expression (as derived from expression microarrays) to sequence features residing in gene promoters (as derived from DNA motif data) and transcription factor binding to gene promoters (as derived from tiling microarrays). We extend the random forest approach to model a multivariate response as represented, for example, by time-course gene expression measures. An analysis of the multivariate random forest output reveals complex regulatory networks, which consist of cohesive, condition-dependent regulatory cliques. Each regulatory clique features homogeneous gene expression profiles and common motifs or synergistic motif groups. We apply our method to several yeast physiological processes: cell cycle, sporulation, and various stress conditions. Our technique displays excellent performance with regard to identifying known regulatory motifs, including high-order interactions. In addition, we present evidence of the existence of an alternative MCB-binding pathway, which we confirm using data from two independent cell cycle studies and two other physiological processes. Finally, we have uncovered elaborate transcription regulation refinement mechanisms involving PAC and mRRPE motifs that govern essential rRNA processing. These include intriguing instances of differing motif dosages and differing combinatorial motif control that promote regulatory specificity in rRNA metabolism under differing physiological processes.
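The idea of a tree that models a multivariate response can be illustrated with a single split. The sketch below is not the authors' algorithm: this excerpt does not state their split criterion, so the code assumes a simple stand-in (the squared error summed over all time points). The motif names, gene profiles, and function names are all hypothetical. A forest would repeat this kind of split recursively on bootstrap samples of genes with random subsets of features.

```python
# Toy sketch: choose the binary promoter feature that best splits a
# multivariate (time-course) response. All data below are invented.

def node_cost(profiles):
    """Squared deviation from the node mean, summed over genes and time points."""
    if not profiles:
        return 0.0
    n_time = len(profiles[0])
    means = [sum(p[t] for p in profiles) / len(profiles) for t in range(n_time)]
    return sum((p[t] - means[t]) ** 2 for p in profiles for t in range(n_time))

def best_split(features, profiles):
    """features: motif name -> list of 0/1 flags per gene.
    profiles: one expression vector per gene, in the same gene order.
    Returns (motif, gain) for the split with the largest cost reduction,
    or None if no feature separates the genes into two non-empty groups."""
    parent = node_cost(profiles)
    best = None
    for motif, present in features.items():
        absent_group = [p for p, x in zip(profiles, present) if x == 0]
        present_group = [p for p, x in zip(profiles, present) if x == 1]
        if not absent_group or not present_group:
            continue  # this feature does not split the node
        gain = parent - node_cost(absent_group) - node_cost(present_group)
        if best is None or gain > best[1]:
            best = (motif, gain)
    return best

# Hypothetical log2-ratio time courses for four genes (three time points each).
profiles = [
    [-1.0, -1.5, -2.0],
    [-0.8, -1.4, -1.9],
    [0.9, 1.6, 2.1],
    [1.1, 1.4, 1.8],
]
# Hypothetical motif presence flags for the same four genes.
features = {
    "motif_A": [1, 1, 0, 0],
    "motif_B": [1, 0, 1, 0],
}

split_motif, split_gain = best_split(features, profiles)
print(split_motif, round(split_gain, 3))
```

Because the cost is summed across time points, a motif is preferred when it separates whole expression profiles rather than values at a single time point.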


pcbi-1000414-g010: Regulatory clique (RC) diagrams for (A) sporulation, (B) heat shock, and (C) nitrogen depletion. The top section shows the dendrogram from hierarchical clustering of the average expression profiles (in log2 ratios) within each RC, based on Pearson correlation and average linkage. The bottom section depicts the signature motifs in the corresponding RC. Red indicates enrichment by a chi-square test of association; blue indicates depletion. The color bar at the lower right-hand side is in scale and the color signals the direction of the test.
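The clustering named in the caption (Pearson correlation with average linkage) can be sketched in a few lines. This is an illustrative version only, assuming a distance of 1 minus the Pearson correlation; the caption does not say how correlation was converted to a distance. The RC labels and profiles are invented, and `average_linkage` is a hypothetical helper, not code from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (fails on constant profiles)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles):
    """Agglomerative clustering with average linkage.
    profiles: name -> average expression profile.
    Returns the merge history as (members_a, members_b, distance) tuples."""
    names = list(profiles)
    # Assumed distance: 1 - Pearson correlation.
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean of all pairwise distances between clusters.
                pairs = [dist[(a, b)] for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Invented average log2-ratio profiles for four hypothetical RCs.
rc_profiles = {
    "RC1": [-0.5, -1.2, -2.0, -1.8],
    "RC2": [-0.4, -1.0, -1.7, -1.9],
    "RC3": [0.3, 1.1, 1.9, 2.2],
    "RC4": [0.2, 0.9, 2.1, 2.0],
}

merges = average_linkage(rc_profiles)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

The merge history is what a dendrogram draws: early merges join RCs with similar profile shapes, and the last merge joins the most dissimilar groups.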

Mentions: The RC diagram of the sporulation data set [23] is clustered into two distinct groups that exhibit increased and decreased expression, respectively, upon entering sporulation (Figure 10A). The direction of the transcriptional response to sporulation is clearly associated with the presence of the mRRPE motif, which is the rRNA processing element. The expression of genes that possess the mRRPE motif, or combinations of the rRNA synthesis/processing motifs (PAC, mRRPE, mRRSE3 and mRRSE10), is repressed throughout the sporulation process. Such repression is also seen in genes that have the RAP1 motif. This corroborating evidence of a decline in the expression of genes involved in producing the ribosomal machinery may be the result of a growth respite caused by the nitrogen starvation that triggers the sporulation process. Interestingly, we have identified RCs with different combinations of this group of rRNA-related motifs: mRRPE, PAC-mRRPE, PAC-mRRSE3 and mRRPE-PAC-mRRSE3. The composition of these combinations has differing consequences for the profiles and magnitudes of expression changes, further highlighting the combinatorial transcriptional control of rRNA processing. Among the genes that are induced upon entering sporulation, three distinctive RCs emerge: URS1-SCB, MCB, and RPN4-mPROTEOL18. This is consistent with previous studies that suggest the involvement of cell cycle (MCB and SCB; [6],[33]) and stress (RPN4 and mPROTEOL18; [6]) motifs in sporulation. URS1 is the binding site of the Ume6/Ime1 complex, which is the major transcriptional regulator of genes involved in early-phase meiosis [23].
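An association between motif presence and a gene group, with a direction (enriched or depleted), can be checked with a chi-square test on a 2x2 table. The sketch below is a generic textbook version without continuity correction, not necessarily the exact test configuration used in the paper. The counts are invented for illustration and `chi_square_2x2` is a hypothetical helper.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-square test of association on a 2x2 table (no continuity correction).
    a: genes in the group with the motif     b: genes in the group without it
    c: genes outside the group with it       d: genes outside the group without it
    Returns (statistic, p_value, direction). Requires all margins to be non-zero."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    # Direction: compare the observed count with the count expected under independence.
    expected_a = (a + b) * (a + c) / n
    direction = "enriched" if a > expected_a else "depleted"
    return stat, p_value, direction

# Invented counts: 50 genes in a group, 30 of which carry the motif;
# 450 genes outside the group, 70 of which carry it.
stat, p_value, direction = chi_square_2x2(30, 20, 70, 380)
print(round(stat, 2), direction)
```

Reporting the direction alongside the p-value is what allows a two-color display, where one color marks over-representation of a motif in a group and the other marks under-representation.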

