Evolutionary Conserved Motif Finder (ECMFinder) for genome-wide identification of clustered YY1- and CTCF-binding sites.

Kang K, Chung JH, Kim J - Nucleic Acids Res. (2009)

Bottom Line: This program successfully identified many clustered YY1- and CTCF-binding sites that are conserved among these species but were previously undetected. In particular, this program identified CTCF-binding sites that are located close to the Dlk1, Magel2 and Cdkn1c imprinted genes. Individual ChIP experiments confirmed the in vivo binding of the YY1 and CTCF proteins to most of these newly discovered binding sites, demonstrating the feasibility and usefulness of ECMFinder.


Affiliation: Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.

ABSTRACT
We have developed a new bioinformatics approach called ECMFinder (Evolutionary Conserved Motif Finder). This program searches for a given DNA motif within the entire genome of one species and uses the gene association information of a potential transcription factor-binding site (TFBS) to screen the homologous regions of a second and third species. If multiple species have this potential TFBS in homologous positions, this program recognizes the identified TFBS as an evolutionary conserved motif (ECM). This program outputs a list of ECMs, which can be uploaded as a Custom Track in the UCSC genome browser and can be visualized along with other available data. The feasibility of this approach was tested by searching the genomes of three mammals (human, mouse and cow) with the DNA-binding motifs of YY1 and CTCF. This program successfully identified many clustered YY1- and CTCF-binding sites that are conserved among these species but were previously undetected. In particular, this program identified CTCF-binding sites that are located close to the Dlk1, Magel2 and Cdkn1c imprinted genes. Individual ChIP experiments confirmed the in vivo binding of the YY1 and CTCF proteins to most of these newly discovered binding sites, demonstrating the feasibility and usefulness of ECMFinder.
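The conservation step described in the abstract can be sketched as follows. This is a minimal illustration, not the ECMFinder implementation: it assumes that homologous regions are paired by case-insensitive gene symbol, and the function names `conserved_genes` and `to_bed`, the gene names and the coordinates are all hypothetical placeholders. The output follows the UCSC BED custom-track layout (a `track` line followed by tab-separated chrom, start, end and name fields).

```python
def conserved_genes(hits_by_species):
    """Return gene symbols (upper-cased) that have at least one hit in every species.

    hits_by_species: {species: {gene_symbol: [(chrom, start, end), ...]}}
    Assumption: homologs share a gene symbol up to letter case.
    """
    gene_sets = [
        {gene.upper() for gene, spans in genes.items() if spans}
        for genes in hits_by_species.values()
    ]
    return set.intersection(*gene_sets) if gene_sets else set()


def to_bed(hits, conserved, track_name="ECM_sketch"):
    """Format one species' conserved hits as UCSC custom-track BED text."""
    lines = [f'track name="{track_name}" description="Conserved motif clusters (sketch)"']
    for gene, spans in sorted(hits.items()):
        if gene.upper() in conserved:
            for chrom, start, end in spans:
                lines.append(f"{chrom}\t{start}\t{end}\t{gene}")
    return "\n".join(lines)


# Placeholder data: gene names and coordinates are invented for illustration.
example_hits = {
    "mouse": {"GeneA": [("chr1", 100, 400)], "GeneB": [("chr2", 10, 200)]},
    "human": {"GENEA": [("chr1", 500, 900)]},
    "cow": {"GENEA": [("chr3", 50, 300)], "GENEB": [("chr4", 1, 100)]},
}
example_conserved = conserved_genes(example_hits)
example_bed = to_bed(example_hits["mouse"], example_conserved)
```

Here only `GeneA` is retained, because `GeneB` has no hit in the human set; the resulting text could be pasted into the browser's custom-track form.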

© Copyright Policy - creative-commons
Figure 2: Visualization and in vivo confirmation of the clustered YY1-binding sites predicted by ECMFinder. (A) Clusters of YY1-binding sites located within the 1st intron of Peg3 were visualized along with other data using the Custom Track. Each cluster of conserved YY1-binding sites detected by ECMFinder is indicated by a thick black line in the top track. The following tracks are provided by the UCSC genome browser: RefSeq for gene annotations, PhastCons for conservation, RepeatMasker for repeat elements, and HMR Conserved Transcription Factor Binding Sites for predicted TFBSs. In the human box, the HMR (human, mouse and rat) conserved TFBS method, which uses position weight matrices (PWMs), failed to predict the YY1-binding sites because of the low conservation level of this region. However, all three species have clustered YY1-binding sites within the 1st intron of Peg3. (B) YY1–ChIP results of candidate genes. This series of YY1–ChIP analyses was performed to confirm the in vivo binding of YY1 to each locus predicted by ECMFinder. The amplified PCR products from each locus are shown in the following order: the Input (lane 1), the IgG lane with rabbit normal serum (lane 2) and the YY1 Ab lane with YY1 antibody (lane 3). The two previously known YY1-binding sites were used as positive controls (Nr3c1 and Peg3, blue), whereas three YY1-unrelated loci were used as negative controls (H19-ICR, the promoter region of Rcor3, and the exon region of Ppil2, red). We tested seven randomly chosen loci out of the 31 predicted YY1 clustered regions: Akt1s1, Fiz1, Prkcsh, Psmb5, Rsrc2, Sfrs10 and Sox4.

Mentions: We used the following criteria to perform a genome-wide search for clustered YY1-binding sites: the input motif for YY1-binding sites was ‘cgCCATntt’ with one allowable mismatch within the bases indicated in lowercase; one cluster was defined as the presence of three YY1 motifs in a 500-bp window; the search was performed within the genomic region spanning 5 kb upstream and downstream of each gene's TSS; and three species (human, mouse and cow) were used to test evolutionary conservation. With these criteria, ECMFinder identified a total of 31 candidate genes that have at least one ECM in all three species (Supplementary Table 1). These ECMs were uploaded to the UCSC Genome Browser as a Custom Track, and the Paternally Expressed Gene 3 (Peg3) locus is shown as a representative locus in Figure 2A (http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm9&hgt.customText=http://jookimlab.lsu.edu/sites/default/files/yy1_data.txt). The 1st intron of mouse Peg3 is known to contain at least 10 YY1-binding sites (14), which were indeed detected by ECMFinder. The thick black bars represent the YY1 ECMs identified within the 1st intron of Peg3 (Figure 2A). Similar YY1 ECMs are also found within the 1st introns of human and cow PEG3. Yet the 1st intron of mammalian Peg3 shows almost no sequence conservation, as seen in the graphs derived from PhastCons analysis. This lack of sequence conservation reflects the fact that, although the individual YY1 motifs are conserved, they differ in number and spacing among mammals.
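The motif and cluster criteria above can be illustrated with a short sketch. It is a hedged reading of the stated rules, not the authors' code: it assumes that `n` matches any base, that the single allowed mismatch may fall on any lowercase position other than `n`, that uppercase positions must match exactly, and that only the forward strand is scanned. The names `matches`, `find_motifs` and `find_clusters` and the test sequence are hypothetical.

```python
MOTIF = "cgCCATntt"  # uppercase = required bases, lowercase = mismatch allowed


def matches(window, motif=MOTIF, max_mismatch=1):
    """True if window fits motif with at most max_mismatch lowercase mismatches."""
    if len(window) != len(motif):
        return False
    mismatches = 0
    for base, m in zip(window.upper(), motif):
        if m in "nN":            # assumed wildcard: any base is accepted
            continue
        if base == m.upper():
            continue
        if m.isupper():          # core bases must match exactly
            return False
        mismatches += 1
        if mismatches > max_mismatch:
            return False
    return True


def find_motifs(seq, motif=MOTIF, max_mismatch=1):
    """Return 0-based start positions of motif matches on the forward strand."""
    k = len(motif)
    return [i for i in range(len(seq) - k + 1)
            if matches(seq[i:i + k], motif, max_mismatch)]


def find_clusters(starts, motif_len, min_sites=3, window=500):
    """Return merged (start, end) spans holding >= min_sites motifs within window bp."""
    starts = sorted(starts)
    merged = []
    for i in range(len(starts) - min_sites + 1):
        span_start = starts[i]
        span_end = starts[i + min_sites - 1] + motif_len
        if span_end - span_start > window:
            continue
        if merged and span_start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], span_end))
        else:
            merged.append((span_start, span_end))
    return merged


# Invented test sequence with three motif instances (one exact, two with one mismatch).
demo_seq = "AAAA" + "CGCCATATT" + "GGGG" + "AGCCATCTT" + "TTTT" + "CGCCATGTA" + "AAAA"
demo_sites = find_motifs(demo_seq)
demo_clusters = find_clusters(demo_sites, len(MOTIF))
```

In a full pipeline the scanned sequence would be restricted to the 5 kb on either side of each TSS, and the reverse strand would presumably be searched as well; both steps are omitted here for brevity.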

