Finding subtypes of transcription factor motif pairs with distinct regulatory roles.

Bais AS, Kaminski N, Benos PV - Nucleic Acids Res. (2011)

Bottom Line: DNA sequences bound by a transcription factor (TF) are presumed to contain sequence elements that reflect its DNA binding preferences and its downstream-regulatory effects. We present DiSCo (Discovery of Subtypes and Cofactors), a novel approach for identifying variants of dyad motifs (and their respective target sequence sets) that are instrumental for differential downstream regulation. Using both simulated and experimental datasets, we demonstrate how current motif discovery can be successfully leveraged to address this question.

Affiliation: Department of Computational and Systems Biology, Dorothy P. and Richard P. Simmons Center for Interstitial Lung Disease, Division of Pulmonary, Allergy and Critical Care Medicine and Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15260, USA.

ABSTRACT
DNA sequences bound by a transcription factor (TF) are presumed to contain sequence elements that reflect its DNA binding preferences and its downstream-regulatory effects. Experimentally identified TF binding sites (TFBSs) are usually similar enough to be summarized by a 'consensus' motif, representative of the TF DNA binding specificity. Studies have shown that groups of nucleotide TFBS variants (subtypes) can contribute to distinct modes of downstream regulation by the TF via differential recruitment of cofactors. A TF(A) may bind to TFBS subtypes a(1) or a(2) depending on whether it associates with cofactors TF(B) or TF(C), respectively. While some approaches can discover motif pairs (dyads), none address the problem of identifying 'variants' of dyads. TFs are key components of multiple regulatory pathways targeting different sets of genes perhaps with different binding preferences. Identifying the discriminating TF-DNA associations that lead to the differential downstream regulation is thus essential. We present DiSCo (Discovery of Subtypes and Cofactors), a novel approach for identifying variants of dyad motifs (and their respective target sequence sets) that are instrumental for differential downstream regulation. Using both simulated and experimental datasets, we demonstrate how current motif discovery can be successfully leveraged to address this question.



Figure 5: Motifs found on CRP-N and CRP-S sequences of H. influenzae. The dyads discovered in the majority-polled run after SDD (top row) and DiSCo (bottom two rows) are shown. SDD yields a dyad whose components cluster together the half-sites of both types of H. influenzae CRP motifs (Figure 3A and B). DiSCo, however, identifies two clusters, C1 and C2: C1 is enriched with a dyad similar to the H. influenzae CRP-N motif (Figure 3A), and C2 with one that resembles the H. influenzae CRP-S motif (Figure 3B). Average misclassification error = 0.23.
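The caption reports an average misclassification error without defining it in this excerpt. As a rough illustration only, the sketch below shows one common way to score a two-cluster partition against known labels: count mismatches under both possible cluster-to-label mappings and keep the smaller. The function name `misclassification_error` and the toy labels are hypothetical; this is not taken from the DiSCo paper and does not reproduce the 0.23 figure.

```python
from typing import Sequence


def misclassification_error(true_labels: Sequence[str],
                            clusters: Sequence[int]) -> float:
    """Fraction of sequences in the 'wrong' cluster for a two-class,
    two-cluster partition, under the better of the two possible
    cluster-to-label mappings (cluster ids are assumed to be 0 and 1)."""
    if len(true_labels) != len(clusters):
        raise ValueError("true_labels and clusters must have equal length")
    classes = sorted(set(true_labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    n = len(true_labels)
    # Mismatches when the first class is mapped to cluster 0.
    mismatches = sum(
        (label == classes[0]) != (cluster == 0)
        for label, cluster in zip(true_labels, clusters)
    )
    # The opposite mapping has n - mismatches errors; keep the better one.
    return min(mismatches, n - mismatches) / n


# Toy data (invented for illustration): four "N" and two "S" sequences.
toy_labels = ["N", "N", "N", "N", "S", "S"]
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_error = misclassification_error(toy_labels, toy_clusters)
```

Taking the minimum over both mappings makes the score independent of which cluster happens to be called C1 or C2. An average over repeated runs would simply be the mean of such per-run values.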

Mentions: Next, we analyzed the pooled CRP-N and CRP-S related sequences in H. influenzae; the resulting motifs are shown in Figure 5 (both strands were searched for motif pairs of width 8 each, at a distance of 6 bp). Here, besides the difference in the cores, the differences in the surrounding regions of the two half-sites in CRP-S and CRP-N are also prominent. The dyad motif discovered after SDD matches the CRP-N motif of H. influenzae, which might be due to the greater number of CRP-N related sequences (41) compared with CRP-S related sequences (13). The two dyads that DiSCo yields clearly resemble the two motifs. The CRP-S sequences are again clearly clustered together, yielding a strong CRP-S-like motif. Hence, despite the two half-sites differing in both sets, DiSCo successfully partitions the sequences into those enriched with the specific subtype of CRP hetero-dimers.
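To make the search geometry described above concrete (two motifs of width 8 separated by 6 bp, on both strands), here is a minimal sketch that enumerates the candidate half-site pairs in a sequence. It only illustrates the window layout; it is not the DiSCo or SDD algorithm, and the names `reverse_complement` and `dyad_windows` are hypothetical.

```python
from typing import Iterator, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dyad_windows(seq: str, width: int = 8,
                 gap: int = 6) -> Iterator[Tuple[str, int, str, str]]:
    """Yield (strand, start, left_half, right_half) for every dyad window.

    A window covers two half-sites of `width` bases separated by `gap`
    bases. For the '-' strand, `start` is an offset into the
    reverse-complemented sequence, not into the original one.
    """
    span = 2 * width + gap
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for start in range(len(s) - span + 1):
            left = s[start:start + width]
            right = s[start + width + gap:start + span]
            yield strand, start, left, right


# A 22 bp toy sequence holds exactly one window per strand.
example_windows = list(dyad_windows("AAAAAAAACCCCCCGGGGGGGG"))
```

With the stated widths and spacing, each window spans 22 bp, so a sequence of length n contributes n - 21 windows per strand (none if it is shorter than 22 bp).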

