Inferring activity changes of transcription factors by binding association with sorted expression profiles.

Cheng C, Yan X, Sun F, Li LM - BMC Bioinformatics (2007)

Bottom Line: To show the effectiveness of this method, we apply it to three typical examples using different kinds of binding affinity data, namely, ChIP-chip data, motif discovery data, and positional weighted matrix scanning data, respectively. The implications obtained from all three examples are consistent with established biological results. The method does not require a linear assumption, and has the desirable property of scale-invariance with respect to TF-specific binding affinity.


Affiliation: Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089-2910, USA. chaochen@usc.edu

ABSTRACT

Background: The identification of transcription factors (TFs) associated with a biological process is fundamental to understanding its regulatory mechanisms. From microarray data, however, the activity changes of TFs often cannot be directly observed due to their relatively low expression levels, post-transcriptional modifications, and other complications. Several approaches have been proposed to infer TF activity changes from microarray data. Some models assume a linear relationship between gene expression and TF-gene binding strength. Other models first determine the target genes of a TF by applying a significance cutoff to binding affinity scores, and then check for expression differentiation between the target genes and all other genes.

Results: We propose a novel method, referred to as BASE (binding association with sorted expression), to infer TF activity changes from microarray expression profiles with the help of binding affinity data. It searches for the maximum association between the binding affinity profile of a TF and the expression change profile along the direction of sorted differentiation. The method does not make a hard selection of target genes; rather, the significance of TF activity changes is evaluated by permutation tests of binding association at the end. To show the effectiveness of this method, we apply it to three typical examples using different kinds of binding affinity data, namely, ChIP-chip data, motif discovery data, and positional weighted matrix scanning data, respectively. The implications obtained from all three examples are consistent with established biological results. Moreover, the inferences suggest new and biologically meaningful hypotheses for further investigation.

Conclusion: The proposed method makes transcriptional inferences from profiles of expression and binding affinity. The same machinery can be used to deal with various kinds of binding affinity data. The method does not require a linear assumption, and has the desirable property of scale-invariance with respect to TF-specific binding affinity. The method is easy to implement and can be routinely applied for transcriptional inference in microarray studies.
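The abstract describes the procedure only in outline, so the following is a minimal sketch of the general idea and not the statistic defined in the paper. It assumes non-negative binding scores, sorts genes by decreasing expression change, compares the cumulative share of binding affinity against a uniform background, and takes the largest signed deviation as the association score. The function names `association_score` and `permutation_p_value`, the uniform background, and the toy numbers are all illustrative assumptions.

```python
import random


def association_score(expr_change, binding):
    """Largest signed gap between the cumulative binding share and a uniform
    background, walking genes from most up- to most down-regulated.

    Assumes binding scores are non-negative with a positive total
    (an assumption of this sketch, not a statement about the paper).
    """
    # Gene indices sorted by decreasing expression change.
    order = sorted(range(len(expr_change)),
                   key=lambda i: expr_change[i], reverse=True)
    total = sum(binding)
    n = len(order)
    running = 0.0
    best = 0.0
    for rank, i in enumerate(order, start=1):
        running += binding[i]
        # Dividing by the total makes the score invariant to rescaling
        # of the TF-specific binding affinities.
        deviation = running / total - rank / n
        if abs(deviation) > abs(best):
            best = deviation
    return best


def permutation_p_value(expr_change, binding, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling binding scores
    across genes; no hard target-gene cutoff is used."""
    rng = random.Random(seed)
    observed = association_score(expr_change, binding)
    shuffled = list(binding)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(association_score(expr_change, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: strongly bound genes are the most up-regulated ones.
expr = [3.0, 2.5, 2.0, 0.1, -0.2, -1.0, -2.0, -3.0]
bind = [5.0, 4.0, 6.0, 0.5, 0.2, 0.1, 0.3, 0.1]
score, p_value = permutation_p_value(expr, bind)
```

In this sketch a positive score means that binding affinity is concentrated among up-regulated genes and a negative score means it is concentrated among down-regulated genes. Multiplying every binding score by the same positive constant leaves the score unchanged, which illustrates the kind of scale-invariance the Conclusion refers to.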



Figure 3: Consistency of inferred AC scores for TFs in the corresponding over-expression TFPEs. The upper image shows the AC scores for Msn2, Msn4, and Yap1 inferred from two independent over-expression TFPE data in combination with ChIP-chip data from different conditions. The lower table shows the AC scores as well as the significance levels (in bracket). Rows are different TFPEs for the corresponding TF and columns are different conditions under which the ChIP-chip data for the corresponding TF is measured. The superscript in the first column distinguishes the two independent gene expression profiles. N/A means not available, which is due to the unavailability of the ChIP-chip data for the TF under the condition.

Mentions: Both deletion and over-expression TFPE data are available for six TFs (Gcn4, Hsf1, Mbp1, Ste12, Swi4, and Yap1), so we examine the consistency of activity inferences for these TFs between the deletion and over-expression TFPEs. As shown in Figure 2, our method achieves consistent results for TF activity inference in all except two cases. For example, Gcn4, the transcriptional activator of amino acid biosynthetic genes, is inferred to be activated in the Gcn4 over-expressing yeast strain (AC scores of 16.3, 25.6, and 26.2 under the YPD, RAPA, and SM conditions, respectively) and repressed in the gcn4Δ strain (AC scores of -15.7, -25.8, and -25.7 under the YPD, RAPA, and SM conditions, respectively). Moreover, over-expression TFPEs for Msn2, Msn4, and Yap1 have been performed independently by two research groups [11,12]. We examine the consistency of the activity inferences from both data sets. As shown in Figure 3, our method achieves similar results for the two independent microarray expression data sets. It should be noted that the expression profiles from the two Yap1 over-expression microarray experiments are not significantly similar to each other (the Spearman correlation coefficient is 0.02), perhaps due to high noise introduced during the microarray experiments. Nevertheless, the transcriptional inferences for Yap1 from both data sets remain consistent, suggesting that our method is robust to noise in gene expression data.
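For readers who want to reproduce the kind of profile comparison mentioned above, the following is a minimal sketch of a Spearman rank correlation in plain Python. It is the textbook definition (the Pearson correlation of average ranks), not code from the paper, and the two short profiles are hypothetical values chosen only to exercise the functions.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression-change profiles for the same five genes.
profile_a = [0.8, -1.2, 0.3, 2.1, -0.4]
profile_b = [-0.5, 0.9, 1.4, -0.1, 0.2]
rho = spearman(profile_a, profile_b)
```

A coefficient near zero, such as the 0.02 reported for the two Yap1 experiments, indicates that the gene-level rankings of the two profiles are essentially unrelated.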

