Transcription factor target prediction using multiple short expression time series from Arabidopsis thaliana.

Redestig H, Weicht D, Selbig J, Hannah MA - BMC Bioinformatics (2007)

Bottom Line: The central role of transcription factors (TFs) in higher eukaryotes has led to much interest in deciphering transcriptional regulatory interactions. We applied our method to published TF-target gene relationships determined using expression profiling on TF mutants and show that in most cases we obtain significant target gene enrichment and in half of the cases this is sufficient to deliver a usable list of high-confidence target genes. In the future, we believe its incorporation with other forms of evidence may improve integrative genome-wide predictions of transcriptional networks.


Affiliation: Max Planck Institute for Molecular Plant Physiology, Am Mühlenberg 1, D-14476 Potsdam-Golm, Germany. redestig@mpimp-golm.mpg.de

ABSTRACT

Background: The central role of transcription factors (TFs) in higher eukaryotes has led to much interest in deciphering transcriptional regulatory interactions. Even in the best case, experimental identification of TF target genes is error-prone, and has been shown to be improved by considering additional forms of evidence such as expression data. Previous expression-based methods have not explicitly tried to associate TFs with their targets and have therefore largely ignored the treatment-specific and time-dependent nature of transcription regulation.

Results: In this study we introduce CERMT, Covariance based Extraction of Regulatory targets using Multiple Time series. Using simulated and real data, we show that using multiple expression time series, selecting treatments in which the TF responds, allowing time shifts between TFs and their targets, and using covariance to identify highly responding genes together appear to be a good strategy. We applied our method to published TF-target gene relationships determined using expression profiling on TF mutants and show that in most cases we obtain significant target gene enrichment and in half of the cases this is sufficient to deliver a usable list of high-confidence target genes.

Conclusion: CERMT could be immediately useful in refining possible target genes of candidate TFs using publicly available data, particularly for organisms lacking comprehensive TF binding data. In the future, we believe its incorporation with other forms of evidence may improve integrative genome-wide predictions of transcriptional networks.


Figure 1: A flowchart of the CERMT algorithm. The input is a set of microarray time series for several treatments and a transcription factor (TF) of interest. First, the treatments in which the TF does not respond are removed. Then a pair of treatments is selected for which the same genes are highly covariant with the TF. The remaining treatments are then searched and added or discarded depending on a goodness-of-fit test. Finally, the genes are ordered by their covariance with the (possibly time-shifted) TF in the selected treatments, and a cut-off for this list is estimated via the Gap statistic.

Mentions: Given a set of gene expression time series and a TF of interest, the output of the proposed method is a cluster of co-expressed genes that, given the assumptions above, appear to be controlled by the TF of interest. Because the cluster is directly associated with a known TF, we will instead refer to it as a predicted regulon. Figure 1 shows a flow scheme of the proposed algorithm, and below we outline the main strategies.
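
To make the core of the flowchart concrete, the following is a minimal sketch of two of its steps: discarding treatments in which the TF does not respond, and ranking genes by their covariance with the possibly time-shifted TF. It is not the authors' implementation. The function names, the range-based response criterion (`min_tf_range`), the summation of covariances over treatments, the use of a single best shift per gene and the toy data are all assumptions made for illustration. The pairwise treatment selection, the goodness-of-fit test and the Gap statistic cut-off are left out.

```python
import numpy as np


def shifted_cov(tf, gene, shift):
    """Sample covariance between tf[t] and gene[t + shift] in one time series."""
    if shift > 0:
        tf, gene = tf[:-shift], gene[shift:]
    return float(np.cov(tf, gene)[0, 1])


def rank_targets(tf_series, gene_series, max_shift=1, min_tf_range=1.0):
    """Rank genes by covariance with the TF over the treatments it responds in.

    tf_series:   dict treatment -> 1-D array, TF expression over time
    gene_series: dict treatment -> 2-D array, genes x time points
    """
    # Assumed response criterion: the TF must vary by at least min_tf_range.
    kept = [name for name, tf in tf_series.items()
            if tf.max() - tf.min() >= min_tf_range]
    n_genes = next(iter(gene_series.values())).shape[0]
    # totals[g, s]: covariance of gene g at shift s, summed over kept treatments
    totals = np.zeros((n_genes, max_shift + 1))
    for name in kept:
        for g in range(n_genes):
            for s in range(max_shift + 1):
                totals[g, s] += shifted_cov(tf_series[name],
                                            gene_series[name][g], s)
    best_shift = totals.argmax(axis=1)   # assumed: one shift per gene
    scores = totals.max(axis=1)
    order = np.argsort(-scores, kind="stable")  # highest covariance first
    return order, scores, best_shift, kept


# Toy data (hypothetical): gene 0 follows the TF one time point later,
# gene 1 is flat, gene 2 moves against the TF.
tf_series = {
    "cold": np.array([0.0, 1.0, 2.0, 3.0, 2.0]),
    "control": np.array([1.0, 1.0, 1.0, 1.0, 1.0]),  # TF does not respond
}
gene_series = {
    "cold": np.array([[0.0, 0.0, 1.0, 2.0, 3.0],
                      [1.0, 1.0, 1.0, 1.0, 1.0],
                      [3.0, 2.0, 1.0, 0.0, 1.0]]),
    "control": np.array([[0.0, 1.0, 0.0, 1.0, 0.0],
                         [1.0, 1.0, 1.0, 1.0, 1.0],
                         [2.0, 1.0, 2.0, 1.0, 2.0]]),
}
order, scores, best_shift, kept = rank_targets(tf_series, gene_series)
print(kept, order, best_shift)
```

In this sketch the ranking uses signed covariance, so only genes that rise and fall with the TF reach the top of the list. Because covariance, unlike correlation, grows with the amplitude of the response, strongly responding genes are favoured over weakly responding ones with the same profile shape.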

