Highly Iterated Palindromic Sequences (HIPs) and Their Relationship to DNA Methyltransferases.

Elhai J - Life (Basel) (2015)

Bottom Line: The sequence GCGATCGC (Highly Iterated Palindrome, HIP1) is commonly found at high frequency in cyanobacterial genomes. Taken together, the results point to a role of DNA methylation in the creation or functioning of HIP sites. A model is presented that postulates the existence of a GmeC-dependent mismatch repair system whose activity creates and maintains HIP sequences.


Affiliation: Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284, USA. ElhaiJ@vcu.edu.

ABSTRACT
The sequence GCGATCGC (Highly Iterated Palindrome, HIP1) is commonly found at high frequency in cyanobacterial genomes. An important clue to its function may be the presence of two orphan DNA methyltransferases that recognize the internal sequences GATC and CGATCG. An examination of genomes from 97 cyanobacteria, both free-living and obligate symbionts, showed that there are exceptional cases in which HIP1 is at a low frequency or nearly absent. In some of these cases, it appears to have been replaced by a different GC-rich palindromic sequence, an alternate HIP. When HIP1 is at a high frequency, GATC- and CGATCG-specific methyltransferases are generally present in the genome. When an alternate HIP is at high frequency, a methyltransferase specific for that sequence is present. The pattern of 1-nt deviations from HIP1 sequences is biased towards the first and last nucleotides, i.e., those that distinguish CGATCG from HIP1. Taken together, the results point to a role of DNA methylation in the creation or functioning of HIP sites. A model is presented that postulates the existence of a GmeC-dependent mismatch repair system whose activity creates and maintains HIP sequences.
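The bias in 1-nt deviations mentioned in the abstract can be illustrated with a small tally. The following is only a sketch, not the author's procedure: the function name `one_nt_deviation_positions` and the choice to scan every overlapping window of the forward strand are assumptions made for illustration.

```python
from collections import Counter

HIP1 = "GCGATCGC"

def one_nt_deviation_positions(genome: str, motif: str = HIP1) -> Counter:
    """Tally windows that differ from `motif` at exactly one nucleotide.

    Keys are 1-based positions within the motif; values are the number of
    windows whose single mismatch falls at that position.
    (Hypothetical helper; the paper's own method may differ.)
    """
    genome = genome.upper()
    k = len(motif)
    tally = Counter()
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        mismatches = [p for p in range(k) if window[p] != motif[p]]
        if len(mismatches) == 1:
            tally[mismatches[0] + 1] += 1
    return tally

# Toy sequence: one site deviating at position 1, one at position 8.
print(one_nt_deviation_positions("ACGATCGCTTGCGATCGA"))
```

Because HIP1 is its own reverse complement, a single-mismatch site seen on the reverse strand is the same site seen on the forward strand with the mismatch at the mirrored position, so scanning one strand counts each site once. A tally concentrated at positions 1 and 8 would correspond to the bias described above, since those are the nucleotides that flank the internal CGATCG.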



Figure 3. Most overrepresented 8-mers in selected genomes. Each panel shows the 12 most overrepresented 8-mers in genomes chosen to illustrate different classes. The calculations of the frequency of a given 8-mer per million nucleotides (count/M) and of the ratio of observed to expected counts (O/E) are described in the Methods section. Complete and partial HIP1 sequences are highlighted in green, and an overrepresented derivative of HIP1, TCGATCGA, is shown with differences from HIP1 in red font. Other, more sporadic differences from HIP1 are highlighted in red. GGCGCC sequences are highlighted in cyan and TGATCA in pink. 8-mers composed of a triplet repeat are shown in gray, with different shadings used to make the triplet repeat clearer. Palindromic sequences are marked with an asterisk. Nonpalindromic sequences represent themselves and their complement (e.g., CGATCGCC/GGCGATCG), and their frequencies are an average of the two.
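The labeling conventions in this caption (asterisk for palindromes, paired labels and averaged frequencies for nonpalindromic 8-mers) can be sketched as follows. This is an illustration under stated assumptions: the Methods section is not reproduced here, so normalizing by total sequence length to obtain count/M, and the helper names `revcomp`, `is_palindrome`, and `kmer_counts_per_million`, are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]

def is_palindrome(seq: str) -> bool:
    """A DNA palindrome equals its own reverse complement."""
    return seq == revcomp(seq)

def kmer_counts_per_million(genome: str, k: int = 8) -> dict:
    """Frequency of each k-mer per million nucleotides (assumed normalization).

    Palindromes are labeled 'KMER*'. A nonpalindromic k-mer is merged with
    its reverse complement under the label 'KMER/REVCOMP', and the value is
    the average of the two counts, as the caption describes.
    """
    genome = genome.upper()
    raw = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    result = {}
    for kmer, count in raw.items():
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        rc = revcomp(kmer)
        if kmer == rc:
            label, value = kmer + "*", count
        else:
            first, second = sorted((kmer, rc))
            label = f"{first}/{second}"
            value = (raw[first] + raw[second]) / 2
        result[label] = value * 1_000_000 / len(genome)
    return result

print(is_palindrome("GCGATCGC"))   # HIP1 is palindromic
print(revcomp("CGATCGCC"))         # the caption's example pair
```

Merging each 8-mer with its reverse complement means a sequence is counted the same way regardless of which strand was deposited in the genome file.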

Mentions: To assess whether the cyanobacterial genomes with low HIP1 frequencies exhibit a different high-frequency 8-mer, I examined the 8-mer frequencies of all the genomes. The patterns of results fall into different classes, representative samples of which are shown in Figure 3. The list of top 8-mers in Anabaena PCC 7120 (Figure 3A) is typical of genomes with highly overrepresented HIP1 sequences. After the HIP1 sequence itself, the next most overrepresented 8-mers are those that overlap HIP1. At low O/E ratios, 8-mers appear that are triplet repeats. At least in the case of Anabaena, they occur almost exclusively in coding regions and are associated with a specific reading frame, and may therefore be determined by amino acid and codon preferences.
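A ranking of 8-mers by O/E ratio, like the one described here, might be sketched as below. The expected-count model is an assumption: this example derives expected counts from mononucleotide composition alone, whereas the article's Methods section (not reproduced here) defines the calculation actually used and may differ. The function name `top_overrepresented` is hypothetical.

```python
from collections import Counter
from math import prod

def top_overrepresented(genome: str, k: int = 8, top: int = 12) -> list:
    """Return up to `top` tuples (kmer, observed, O/E), highest O/E first.

    Expected counts assume independent bases drawn at the genome's own
    mononucleotide frequencies (an illustrative model, not the paper's).
    """
    genome = genome.upper()
    n_windows = len(genome) - k + 1
    base_counts = Counter(b for b in genome if b in "ACGT")
    total = sum(base_counts.values())
    if total == 0 or n_windows <= 0:
        return []
    freq = {b: base_counts[b] / total for b in "ACGT"}
    observed = Counter(genome[i:i + k] for i in range(n_windows))
    rows = []
    for kmer, obs in observed.items():
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing ambiguous bases
        expected = n_windows * prod(freq[b] for b in kmer)
        rows.append((kmer, obs, obs / expected))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top]

# Toy genome: HIP1 embedded five times in an AT-rich background.
toy = ("AT" * 20 + "GCGATCGC") * 5
for kmer, obs, ratio in top_overrepresented(toy, top=3):
    print(kmer, obs, round(ratio, 1))
```

With a composition-only model, 8-mers that overlap a highly overrepresented sequence tend to score high as well, which is consistent with the observation above that the 8-mers ranked just after HIP1 are those overlapping it.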

