A probabilistic approach to learn chromatin architecture and accurate inference of the NF-κB/RelA regulatory network using ChIP-Seq.

Yang J, Mitra A, Dojer N, Fu S, Rowicka M, Brasier AR - Nucleic Acids Res. (2013)

Bottom Line: Sixteen novel NF-κB/RelA-regulated genes and TFBSs were experimentally validated, including TANK, a negative feedback gene whose expression is NF-κB/RelA dependent and requires a functional interaction with the AP1 TFBSs. Our probabilistic method yields more accurate NF-κB/RelA-regulated networks than a traditional, distance-based approach, confirmed by both analysis of gene expression and increased informativity of Genome Ontology annotations. Our analysis provides new insights into how co-occurring TFBSs and local chromatin context orchestrate activation of NF-κB/RelA sub-pathways differing in biological function and temporal expression patterns.


Affiliation: Department of Internal Medicine, The University of Texas Medical Branch, 301 University Boulevard, Galveston, TX 77555-1060, USA, Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch, 301 University Boulevard, Galveston, TX 77555-1060, USA, Institute for Translational Sciences, The University of Texas Medical Branch, 301 University Boulevard, Galveston, TX 77555-1060, USA, Institute of Informatics, University of Warsaw, Banacha 2, 02-097, Warsaw, Poland and Sealy Center for Molecular Medicine, The University of Texas Medical Branch, 301 University Boulevard, Galveston, TX 77555-1060, USA.

ABSTRACT
Using nuclear factor-κB (NF-κB) ChIP-Seq data, we present a framework for iterative learning of regulatory networks. For every possible transcription factor-binding site (TFBS)-putatively regulated gene pair, the relative distance and orientation are calculated to learn which TFBSs are most likely to regulate a given gene. Weighted TFBS contributions to putative gene regulation are integrated to derive an NF-κB gene network. A de novo motif enrichment analysis uncovers secondary TFBSs (AP1, SP1) at characteristic distances from NF-κB/RelA TFBSs. Comparison with experimental ENCODE ChIP-Seq data indicates that experimental TFBSs highly correlate with predicted sites. We observe that RelA-SP1-enriched promoters have expression profiles distinct from those of RelA-AP1-enriched promoters and are enriched in introns, CpG islands and DNase accessible sites. Sixteen novel NF-κB/RelA-regulated genes and TFBSs were experimentally validated, including TANK, a negative feedback gene whose expression is NF-κB/RelA dependent and requires a functional interaction with the AP1 TFBSs. Our probabilistic method yields more accurate NF-κB/RelA-regulated networks than a traditional, distance-based approach, confirmed by both analysis of gene expression and increased informativity of Genome Ontology annotations. Our analysis provides new insights into how co-occurring TFBSs and local chromatin context orchestrate activation of NF-κB/RelA sub-pathways differing in biological function and temporal expression patterns.

Figure 4: Motif enrichment. Shown are the weblogos for four primary cis-regulatory motifs of 12-nt length identified in the MACS peaks (a–d: motifs of NF-κB/RelA, AP1, gapped SP1 and break point, respectively, each with its validation histogram). The blue line depicts the distribution of the likelihood scores (based on the log-transformed position weight matrix) for a given motif among the top 20% of the RelA ChIP-Seq peaks, and the red line depicts the same distribution for the whole genome. The validation histogram plotted for each motif shows that the motif is genuinely enriched in RelA peaks as compared with the whole human genome. The motifs denoted as depleted in Table 2 do not pass this test.
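The validation histograms compare motif likelihood scores in the peaks with scores in the genome. As a rough illustration of how such a score can be computed from a log-transformed position weight matrix, here is a minimal Python sketch. It is not the authors' implementation: the 4-column `TOY_PWM`, the uniform background, the pseudocount and all function names are assumptions made for illustration only.

```python
import math

BASES = "ACGT"

# Hypothetical 4-column matrix of base probabilities (NOT a motif from the paper).
TOY_PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]


def log_odds_matrix(pwm, background=0.25, pseudocount=1e-3):
    """Log2-transform each probability relative to an assumed uniform background."""
    return [
        {b: math.log2((col[b] + pseudocount) / background) for b in BASES}
        for col in pwm
    ]


def window_score(lom, window):
    """Sum the per-position log-odds for one window of the motif's width."""
    return sum(col[base] for col, base in zip(lom, window))


def best_score(lom, seq):
    """Return the highest window score in seq, or None if no full window fits."""
    width = len(lom)
    seq = seq.upper()
    scores = [
        window_score(lom, seq[i:i + width])
        for i in range(len(seq) - width + 1)
        if all(c in BASES for c in seq[i:i + width])  # skip windows containing N etc.
    ]
    return max(scores) if scores else None


lom = log_odds_matrix(TOY_PWM)
peak_like = best_score(lom, "TTGGAATT")    # contains the toy motif GGAA
background_like = best_score(lom, "TTTTTTTT")
```

Collecting `best_score` over peak sequences and over genomic windows gives two score distributions that can be plotted against each other in the manner of the blue and red lines described in the caption.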

Mentions: First, we searched for 12-nt motifs among the 500-nt sequences centered on the NF-κB/RelA peak summit in the top 20% of the rank-ordered MACS peaks. Strikingly, MEME consistently found a statistically significant occurrence of the canonical NF-κB/RelA motif (Figure 4A) for all 4195 analyzed peaks. Using P-values corrected for the actual genomic frequency of each motif (Table 2), high-confidence motifs were identified in 3202 peaks. Because we have analyzed in detail only ∼20% of the peaks, this result indicates there are up to ∼16 000 high-quality NF-κB/RelA-binding sites among the regions revealed as differentially bound in our XChIP-Seq data set. The most significantly enriched sequence motif is 5′-KGGRNTTTCCM-3′ (Figure 4A, top panel), a sequence that corresponds closely with the consensus NF-κB/RelA sequence of 5′-GGGRNTTTCC-3′ identified by an in vitro selection technique from a degenerate oligonucleotide library (19). This motif is truly enriched in the RelA ChIP-Seq peaks (as opposed to merely having a significant theoretical MEME E-value), as confirmed by comparing the likelihoods of this motif's occurrences in these peaks (blue histogram, Figure 4A) and in the human genome (red histogram, Figure 4A).
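To make the numbers in this passage concrete, the following Python sketch expands the reported IUPAC motif into a regular expression, computes a 500-nt window around a peak summit, and reproduces the ∼16 000 extrapolation. It is an illustration under stated assumptions, not the MEME/MACS pipeline used in the paper: the function names are hypothetical, only the forward strand is scanned, and the extrapolation assumes every fifth of the ranked peak list contains as many high-confidence sites as the top fifth, which is why the text gives it as an upper bound.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[c] for c in motif.upper()))


def find_sites(motif, seq):
    """Return 0-based start positions of forward-strand matches in seq."""
    return [m.start() for m in iupac_to_regex(motif).finditer(seq.upper())]


def summit_window(summit, width=500):
    """Return (start, end) of a window of the given width centered on the summit."""
    half = width // 2
    return max(0, summit - half), summit + half


# Motif reported in the text for the RelA peaks.
RELA_MOTIF = "KGGRNTTTCCM"

# Extrapolation from the text: 3202 high-confidence peaks in the top 20%.
high_confidence_top_fifth = 3202
analyzed_fraction = 0.20
upper_bound_sites = round(high_confidence_top_fifth / analyzed_fraction)
```

For example, `find_sites(RELA_MOTIF, "AAGGGAATTTCCAAA")` locates the single match that starts at the first G of `GGGAATTTCCA`, an instance of the 10-nt consensus 5′-GGGRNTTTCC-3′ followed by one extra base.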

