Predicting gene expression in T cell differentiation from histone modifications and transcription factor binding affinities by linear mixture models.

Costa IG, Roider HG, do Rego TG, de Carvalho Fde A - BMC Bioinformatics (2011)

Bottom Line: We apply the approach to identify the underlying histone modifications and transcription factors guiding gene expression of differentiated CD4+ T cells. The method improves the gene expression prediction in relation to the use of a single linear model, as often used by previous approaches. Moreover, it recovered the known role of the modifications H3K4me3 and H3K27me3 in activating cell-specific genes and of some transcription factors related to CD4+ T differentiation.


Affiliation: Center of Informatics, Federal University of Pernambuco, Recife, Brazil. igcf@cin.ufpe.br

ABSTRACT

Background: The differentiation process from stem cells to fully differentiated cell types is controlled by the interplay of chromatin modifications and transcription factor activity. Histone modifications or transcription factors frequently act in a multi-functional manner, with a given DNA motif or histone modification conveying both transcriptional repression and activation depending on its location in the promoter and other regulatory signals surrounding it.

Results: To account for the possible multifunctionality of regulatory signals, we model the observed gene expression patterns by a mixture of linear regression models. We apply the approach to identify the underlying histone modifications and transcription factors guiding gene expression of differentiated CD4+ T cells. The method improves the gene expression prediction in relation to the use of a single linear model, as often used by previous approaches. Moreover, it recovered the known role of the modifications H3K4me3 and H3K27me3 in activating cell-specific genes and of some transcription factors related to CD4+ T differentiation.
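
For orientation, a mixture of linear regression models is commonly written as below. This is the textbook formulation, not necessarily the exact parameterization used in the article's Methods. The symbols are assumptions introduced here: $y_g$ is the expression of gene $g$, $x_g$ its vector of regulatory features (histone modification signals and/or transcription factor affinities), $K$ the number of linear models, $\pi_k$ the mixing weights, and $\beta_k$, $\sigma_k^2$ the coefficients and noise variance of model $k$.

```latex
p(y_g \mid x_g) \;=\; \sum_{k=1}^{K} \pi_k \,
  \mathcal{N}\!\left(y_g \,\middle|\, x_g^{\top}\beta_k,\; \sigma_k^{2}\right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .
```

With $K = 1$ this reduces to a single linear model. With $K \ge 2$ the same feature can carry a positive coefficient in one component and a negative one in another, which is how such a model can represent a signal that activates some genes and represses others.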


Figure 2: Regression Prediction Error. We depict the MSE for 1 to 6 models for the prediction of expression for Th1, Th2, Th17 and iTreg. Bars marked with * indicate the number of linear models chosen by the model selection. The MSE with TF is higher than with either HM/TF or HM for all combinations of expression data and number of models. For all combinations of expression data and number of models, there was no significant difference between the MSE from HM and HM/TF.

Mentions: Predicting gene expression in the four T-cell types from histone modification data alone, using only a single regression model, yields MSEs of about 0.5 for HM and HM+TF on all data sets (see red bars in Fig. 2). A mixture of two regression models reduces the MSEs to an average value of 0.25 across all cell types. In all scenarios, the difference in MSE between one and two models was statistically significant (t-test p-value < 0.01), indicating the advantage of using mixtures to predict expression. The model selection procedure (see Methods) indicates that the data are optimally explained by a combination of 2-4 regression models (see Fig. 2) and that gene expression can be predicted well from histone modification data alone.
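
The gap between one model and a mixture of two can be illustrated with a minimal sketch on synthetic data. It is not the authors' implementation: the EM fitting routine, the function names (`fit_single`, `fit_mixture`), the toy data, and the in-sample hard-assignment MSE are all assumptions made for illustration, and the article's own evaluation and model selection procedure are described in its Methods. The toy data use one feature that activates half of the "genes" and represses the other half, so a single line cannot fit both groups.

```python
import numpy as np

def fit_single(X, y):
    """Ordinary least squares; X already contains an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fit_mixture(X, y, k=2, n_iter=100, seed=0):
    """EM for a k-component mixture of linear regressions (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    resp = rng.dirichlet(np.ones(k), size=n)   # random soft assignments, shape (n, k)
    betas = np.zeros((k, d))
    sigma2 = np.ones(k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # M-step: weighted least squares, noise variance and mixing weight per component
        for j in range(k):
            w = resp[:, j]
            Xw = X * w[:, None]
            betas[j] = np.linalg.solve(Xw.T @ X + 1e-8 * np.eye(d), Xw.T @ y)
            resid = y - X @ betas[j]
            sigma2[j] = max((w @ resid**2) / (w.sum() + 1e-12), 1e-6)
            weights[j] = w.mean()
        # E-step: posterior probability that each point belongs to each component
        resid = y[:, None] - X @ betas.T       # shape (n, k)
        log_p = (np.log(weights + 1e-12)
                 - 0.5 * np.log(2 * np.pi * sigma2)
                 - 0.5 * resid**2 / sigma2)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
    return betas, resp

# Synthetic data (an assumption, not the article's data): the same feature has
# slope +2 for one half of the points and slope -2 for the other half.
rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
sign = np.where(np.arange(n) < n // 2, 1.0, -1.0)
y = 2.0 * sign * x + rng.normal(scale=0.1, size=n)
X = np.column_stack([np.ones(n), x])           # intercept + one feature

# One linear model for all points
beta_single = fit_single(X, y)
mse_single = np.mean((y - X @ beta_single) ** 2)

# Two linear models; each point is predicted by its most responsible component.
# This is an in-sample fit error and therefore optimistic.
betas, resp = fit_mixture(X, y, k=2)
assigned = resp.argmax(axis=1)
y_hat = np.einsum("nd,nd->n", X, betas[assigned])
mse_mixture = np.mean((y - y_hat) ** 2)

print(f"MSE, single model:        {mse_single:.3f}")
print(f"MSE, two-model mixture:   {mse_mixture:.3f}")
print("Fitted slopes per component:", np.sort(betas[:, 1]))
```

Because the component assignment here uses the observed expression values, the mixture error is a fit error rather than a held-out prediction error. A comparison such as the one reported for Fig. 2 would need an evaluation protocol that assigns components without access to the target values.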

