CG dinucleotide clustering is a species-specific property of the genome.

Glass JL, Thompson RF, Khulan B, Figueroa ME, Olivier EN, Oakley EJ, Van Zant G, Bouhassira EE, Melnick A, Golden A, Fazzari MJ, Greally JM - Nucleic Acids Res. (2007)

Bottom Line: We also show that the CG clusters co-localize in the human genome with hypomethylated loci and annotated transcription start sites to a greater extent than annotations produced by prior CpG island definitions. Moreover, this new approach allows CG clusters to be identified in a species-specific manner, revealing a degree of orthologous conservation that is not revealed by current base compositional approaches. Finally, our approach is able to identify methylating genomes (such as Takifugu rubripes) that lack CG clustering entirely, in which it is inappropriate to annotate CpG islands or CG clusters.

Affiliation: Department of Molecular Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA.

ABSTRACT
Cytosines at cytosine-guanine (CG) dinucleotides are the near-exclusive target of DNA methyltransferases in mammalian genomes. Spontaneous deamination of methylcytosine to thymine makes methylated cytosines unusually susceptible to mutation and consequent depletion. The loci where CG dinucleotides remain relatively enriched, presumably due to their unmethylated status during the germ cell cycle, have been referred to as CpG islands. Currently, CpG islands are solely defined by base compositional criteria, allowing annotation of any sequenced genome. Using a novel bioinformatic approach, we show that CG clusters can be identified as an inherent property of genomic sequence without imposing a base compositional a priori assumption. We also show that the CG clusters co-localize in the human genome with hypomethylated loci and annotated transcription start sites to a greater extent than annotations produced by prior CpG island definitions. Moreover, this new approach allows CG clusters to be identified in a species-specific manner, revealing a degree of orthologous conservation that is not revealed by current base compositional approaches. Finally, our approach is able to identify methylating genomes (such as Takifugu rubripes) that lack CG clustering entirely, in which it is inappropriate to annotate CpG islands or CG clusters.


Figure 7: CG cluster analysis of 10 different species. These CG fragment length frequency plots were generated using 30 CGs per fragment for each species. Genomes containing CG clusters are defined by the distinct peak of short, uniquely CG-dense fragments. While the three non-methylating organisms on the left (Saccharomyces cerevisiae, Caenorhabditis elegans and D. melanogaster) show no uniquely CG-dense peak, it was surprising to find that fugu has similar characteristics despite the fact that it methylates its genome (25). Zebrafish, on the other hand, which also methylates its genome, has a distinct CG-dense peak, as do the other vertebrate genomes on the right. There is more CG decay in zebrafish than fugu (O/E CG ratios of 0.525 and 0.571, respectively), but this marginal difference does not appear sufficient to account for the emergence of CG-dense clusters in zebrafish. Methylation of the genome is not, therefore, always accompanied by the presence of CG-dense loci that avoid mutational decay. For a more detailed illustration of the CG cluster analysis of these genomes, see the Supplementary Data section.
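The fragment-length measurement behind these plots can be illustrated with a minimal sketch. It assumes that a "fragment" is the span from the first to the last of a fixed number of consecutive CG dinucleotides (30 in the figure) and that the window slides one CG at a time; the source does not state whether windows overlap, so that choice, the function names, and the bin size are assumptions made for illustration only.

```python
from collections import Counter


def cg_positions(seq):
    """Return the 0-based start position of every CG dinucleotide in seq."""
    s = seq.upper()
    positions = []
    i = s.find("CG")
    while i != -1:
        positions.append(i)
        i = s.find("CG", i + 1)
    return positions


def cg_fragment_lengths(seq, cgs_per_fragment=30):
    """Length (bp) of each fragment spanning `cgs_per_fragment` consecutive CGs.

    Assumption: windows slide by one CG, and a fragment runs from the C of
    its first CG to the G of its last CG, inclusive.
    """
    pos = cg_positions(seq)
    k = cgs_per_fragment
    return [pos[i + k - 1] + 2 - pos[i] for i in range(len(pos) - k + 1)]


def length_histogram(lengths, bin_size=100):
    """Count fragments per length bin; keys are the lower edge of each bin."""
    return Counter((length // bin_size) * bin_size for length in lengths)
```

Under these assumptions, a genome with CG clusters would show an excess of fragments in the shortest bins of the histogram, which is the peak the caption describes.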

Mentions: We extended the CG clustering histogram analysis to eight more genomes, including other organisms that are known to methylate their genomes, those that do so only transiently (Drosophila melanogaster) (23), and those that do not methylate at all. The surprising result of these analyses is that the fugu (Tiger Blowfish, Takifugu rubripes) genome, which has been reported to methylate its DNA (24), does not exhibit uniquely CG-dense regions. One possible explanation is that the degree of decay of CG dinucleotide content in the fugu genome is less than that of most genomes in which unique CG-dense regions emerge (Figure 7). The zebrafish (Danio rerio) genome, on the other hand, does display uniquely CG-dense regions with only marginally greater CG dinucleotide decay (O/E CG 0.53 as opposed to 0.57 in fugu). The remaining major difference between these genomes is size: the fugu genome is substantially smaller than the other methylating genomes, at only 365 Mb total (25), and genome size has already been suggested to be related to the evolution of cytosine methylation (26). Our data demonstrate that while cytosine methylation appears to be necessary for CG decay, it is not sufficient to cause local preservation of clustered CG dinucleotides. Furthermore, we can conclude that any annotation of the fugu genome to indicate the presence of CpG islands or CG clusters is inappropriate.
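The O/E CG values quoted above compare the observed number of CG dinucleotides with the number expected from base composition alone. A minimal sketch follows, assuming the commonly used formula O/E = (number of CG × N) / (number of C × number of G), where N is the number of unambiguous bases; the source does not spell out the formula it used, so this formula and the function name are assumptions.

```python
def oe_cg_ratio(seq):
    """Observed/expected CG ratio of a DNA sequence.

    Assumption: expected CG count is (#C * #G) / N, with N counting only
    A, C, G and T so that ambiguous bases such as N are ignored.
    """
    s = seq.upper()
    c = s.count("C")
    g = s.count("G")
    n = c + g + s.count("A") + s.count("T")
    cg = s.count("CG")  # CG cannot overlap itself, so count() finds every one
    if c == 0 or g == 0:
        return 0.0
    return cg * n / (c * g)
```

A ratio near 1 would indicate no CG depletion, while values well below 1, such as those reported for zebrafish and fugu, indicate CG decay.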

