A cis-regulatory logic simulator.

Zeigler RD, Gertz J, Cohen BA - BMC Bioinformatics (2007)

Bottom Line: We found good agreement between sets of simulated promoters and predicted regulatory modules from real expression data. The source code is available online and as additional material. The test sets are available as additional material.


Affiliation: Department of Genetics, Washington University School of Medicine, St. Louis, MO 63108, USA. rdzeigle@wustl.edu

ABSTRACT

Background: A major goal of computational studies of gene regulation is to accurately predict the expression of genes based on the cis-regulatory content of their promoters. The development of computational methods to decode the interactions among cis-regulatory elements has been slow, in part, because it is difficult to know, without extensive experimental validation, whether a particular method identifies the correct cis-regulatory interactions that underlie a given set of expression data. There is an urgent need for test expression data in which the interactions among cis-regulatory sites that produce the data are known. The ability to rapidly generate such data sets would facilitate the development and comparison of computational methods that predict gene expression patterns from promoter sequence.

Results: We developed a gene expression simulator that generates expression data using user-defined interactions between cis-regulatory sites. The simulator can incorporate additive, cooperative, competitive, and synergistic interactions between regulatory elements. Constraints on the spacing, distance, and orientation of regulatory elements and their interactions may also be defined, and Gaussian noise can be added to the expression values. The simulator allows for a data transformation that simulates the sigmoid shape of expression levels from real promoters. We found good agreement between sets of simulated promoters and predicted regulatory modules from real expression data. We present several data sets that may be useful for testing new methodologies for predicting gene expression from promoter sequence.

Conclusion: We developed a flexible gene expression simulator that rapidly generates large numbers of simulated promoters and their corresponding transcriptional output based on specified interactions between cis-regulatory sites. When appropriate rule sets are used, the data generated by our simulator faithfully reproduce experimentally derived data sets. We anticipate that using simulated gene expression data sets will facilitate the direct comparison of computational strategies to predict gene expression from promoter sequence. The source code is available online and as additional material. The test sets are available as additional material.



Figure 2: Sample Relos outputs. Relos was used to generate and analyze promoters using four different models. Five thousand promoters were generated in each Figure 2 simulation. A. A simulation that depicts a single activator, modeled as an additive rule. B. A simulation that depicts an activator and a repressor, modeled as additive rules. C. A simulation that depicts a synergistic rule between two regulatory elements. Each element has a small additive contribution to expression, but promoters with at least one of each element have enhanced expression. Gaussian noise was added to the output of the simulation at 5% of the level of expression of individual promoters. D. A simulation that depicts a cooperative interaction between two regulatory elements, modeled with a Hill function. Noise was added to the simulation as in C.
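The four models in the caption can be illustrated with a minimal Python sketch. This is not the Relos source code or its rule syntax. The function names, the promoter length of 10, the element weights, the synergy bonus, the element frequencies in panel C, and the choice to feed the number of complete element pairs into the Hill function are all assumptions made for illustration.

```python
import random


def random_promoter(elements, length, rng):
    """Draw a fixed-length promoter; each list entry is equally probable."""
    return [rng.choice(elements) for _ in range(length)]


def additive(promoter, weights):
    """Additive rule: sum per-element contributions (unlisted elements add 0)."""
    return sum(weights.get(e, 0.0) for e in promoter)


def synergistic(promoter, weights, pair, bonus):
    """Small additive effects plus a bonus when both elements of `pair` occur."""
    base = additive(promoter, weights)
    return base + (bonus if all(p in promoter for p in pair) else 0.0)


def hill(x, vmax, k, n):
    """Standard Hill function vmax * x^n / (k^n + x^n)."""
    if x <= 0:
        return 0.0
    return vmax * x**n / (k**n + x**n)


def cooperative(promoter, pair, vmax, k, n):
    """Hypothetical cooperative rule: Hill function of the number of complete pairs."""
    pairs = min(promoter.count(pair[0]), promoter.count(pair[1]))
    return hill(pairs, vmax, k, n)


def add_noise(value, fraction, rng):
    """Gaussian noise whose standard deviation is `fraction` of the value."""
    return rng.gauss(value, fraction * abs(value))


def simulate(rule, elements, n_promoters=5000, length=10, noise=0.0, seed=0):
    """Generate promoters, score each with `rule`, and optionally add noise."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_promoters):
        expr = rule(random_promoter(elements, length, rng))
        out.append(add_noise(expr, noise, rng) if noise else expr)
    return out


# Panel A: one activator (A) and neutral spacers (N), equally probable.
panel_a = simulate(lambda p: additive(p, {"A": 1.0}), ["A", "N"])

# Panel B: an activator and a repressor (R), both additive.
panel_b = simulate(lambda p: additive(p, {"A": 1.0, "R": -1.0}), ["A", "R", "N"])

# Panel C: synergy between X and Y with 5% noise. Spacers are made more
# frequent (assumed) so that many promoters lack one of the two elements.
panel_c = simulate(
    lambda p: synergistic(p, {"X": 0.2, "Y": 0.2}, ("X", "Y"), 5.0),
    ["X", "Y"] + ["N"] * 8,
    noise=0.05,
)

# Panel D: cooperative interaction through a Hill function, with 5% noise.
panel_d = simulate(
    lambda p: cooperative(p, ("X", "Y"), 10.0, 2.0, 3),
    ["X", "Y", "N"],
    noise=0.05,
)
```

Plotting a histogram of each list gives a rough counterpart to the corresponding panel; the exact shapes depend on the assumed parameters.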

Mentions: Examples of simulated datasets are shown in Figure 2. As a visual aid to interpret the output of the simulations, histograms illustrating the distribution of expression values are shown. Figure 2a shows the distribution of expression values for 5000 fixed-length random promoters consisting of variable numbers of a single type of cis-regulatory activator site and neutral spacer elements, where all elements are equally probable. The expression is therefore a reflection of the distribution of the activator element. Relos outputs the expected Poisson distribution for expression. Figure 2b shows the results from an activator-repressor combination. Because expression is now a function of two inputs, it follows the expected Gaussian distribution. Figure 2c shows the results from a synergistic rule set, with noise at 5% of the expression level. In this simulation, each element has a small additive effect on expression individually, but when both regulatory elements are present in the same promoter, a large expression effect is observed. As expected, the result of the simulation is a bimodal distribution, where the second peak represents promoters containing both regulatory elements. Figure 2d shows the output of a cooperative interaction, modeled by a Hill function. A Hill function is a transition function of the form:
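The equation itself is missing from this copy of the text. A commonly used general form of the Hill function is given below as an assumption; the authors' exact parameterization may differ. Here x is the input (for example, the amount of bound regulator), V_max is the maximal output, K is the input at which the output is half-maximal, and n is the Hill coefficient that sets the steepness of the transition. These symbol names are illustrative and are not taken from the paper.

```latex
f(x) = \frac{V_{\max}\, x^{n}}{K^{n} + x^{n}}
```

For n > 1 the curve is sigmoidal: the output stays low for x well below K, rises steeply near x = K, and saturates at V_max.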

