Genome-wide polycomb target gene prediction in Drosophila melanogaster.

Zeng J, Kirk BD, Gou Y, Wang Q, Ma J - Nucleic Acids Res. (2012)

Bottom Line: Our data suggest that multiple transcription factor networking at the cis-regulatory elements is critical for PcG recruitment, while high GC content and high conservation level are also important features of PcG target genes. EpiPredictor should substantially expedite experimental discovery of PcG target genes by providing an effective initial screening tool. From a computational standpoint, our strategy of modelling transcription factor interaction with a non-linear kernel is original, effective and transferable to many other applications.

Affiliation: Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.

ABSTRACT
As key epigenetic regulators, polycomb group (PcG) proteins are responsible for the control of cell proliferation and differentiation as well as stem cell pluripotency and self-renewal. Aberrant epigenetic modification by PcG is strongly correlated with the severity and invasiveness of many types of cancers. Unfortunately, the molecular mechanism of PcG-mediated epigenetic regulation remained elusive, partly due to the extremely limited pool of experimentally confirmed PcG target genes. In order to facilitate experimental identification of PcG target genes, here we propose a novel computational method, EpiPredictor, that achieved significantly higher matching ratios with several recent chromatin immunoprecipitation studies than jPREdictor, an existing computational method. We further validated a subset of genes that were uniquely predicted by EpiPredictor by cross-referencing existing literature and by experimental means. Our data suggest that multiple transcription factor networking at the cis-regulatory elements is critical for PcG recruitment, while high GC content and high conservation level are also important features of PcG target genes. EpiPredictor should substantially expedite experimental discovery of PcG target genes by providing an effective initial screening tool. From a computational standpoint, our strategy of modelling transcription factor interaction with a non-linear kernel is original, effective and transferable to many other applications.

Figure 2. ROC curves of the PRE genes predicted by EpiPredictor and jPREdictor. Shown are overlaps with the genes predicted by Schwartz et al. (A), Tolhuis et al. (B), Schuettengruber et al. (C) and the genes intersected by all three sets (D). The AUCs on the four validation sets are, respectively, 0.61, 0.61, 0.58 and 0.60 for EpiPredictor-Basic; 0.62, 0.57, 0.62 and 0.53 for EpiPredictor-CG; 0.64, 0.56, 0.59 and 0.67 for jPREdictor (static); and 0.56, 0.49, 0.55 and 0.59 for jPREdictor (dynamic).
Mentions: We conducted a comparative analysis of EpiPredictor and jPREdictor (Table 4), using the matching ratio and the receiver operating characteristic (ROC) curve as our evaluation metrics. The former indicates the overall accuracy of prediction, while the latter depicts the trade-off between sensitivity and specificity and therefore evaluates the ranking scheme. In terms of the matching ratio, EpiPredictor-Basic outperformed jPREdictor (static) by 6.25, 2.67, 6.05 and 5.27%, respectively, against the three validation sets and their intersection set, and the improvement was statistically significant (P < 0.05 in a one-tailed Student's t-test). In addition, EpiPredictor-CG surpassed jPREdictor (dynamic) by 7.96, 2.67, 10.23 and 18.42%, respectively (P < 0.05). In terms of the area under the curve (AUC) of the ROC curve, EpiPredictor-Basic achieved results comparable to those of jPREdictor (static), whereas EpiPredictor-CG outperformed jPREdictor (dynamic) in three of the four cases (Figure 2). It is worth noting that the AUCs of EpiPredictor-Basic, EpiPredictor-CG and jPREdictor (static) were all significantly larger than 0.5 (random guess) (P < 0.05), which was not the case for jPREdictor (dynamic). Furthermore, using the AUC as a measure, neither the advanced version of EpiPredictor nor that of jPREdictor significantly outperformed its basic counterpart.
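
To make the two evaluation metrics concrete, the following is a minimal Python sketch rather than the authors' implementation. It assumes that each gene receives a prediction score and a binary label saying whether it appears in a validation set, and it assumes that the matching ratio is the fraction of predicted genes found in the validation set; the source does not spell out that definition. The function names and the toy gene identifiers are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive is scored above a randomly chosen negative
    (ties count one half). A value of 0.5 corresponds to random guessing."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def matching_ratio(predicted_genes, validated_genes):
    """ASSUMED definition: fraction of predicted genes that also occur
    in the validation set."""
    predicted = set(predicted_genes)
    if not predicted:
        raise ValueError("no predicted genes")
    return len(predicted & set(validated_genes)) / len(predicted)


# Toy data only; these are not values from the paper.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_labels = [1, 0, 1, 1, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
toy_ratio = matching_ratio(["g1", "g2", "g3", "g4"], ["g2", "g4", "g5"])
```

The matching ratio depends only on which genes are called, whereas the AUC depends on how all genes are ordered by score, which is why the paragraph above treats the latter as an evaluation of the ranking scheme.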

