Complexity of a small non-protein coding sequence in chromosomal region 22q11.2: presence of specialized DNA secondary structures and RNA exon/intron motifs.

Delihas N - BMC Genomics (2015)

Bottom Line: The breakpoint Type A sequence seems to be a major player in the proliferation of these RNA motifs, as well as the proliferation of Variable Regions in the 10,000 bp segment and other regions within 22q.11.2. The data indicate that a non-coding region of the chromosome may be reserved for highly biased mutations that lead to formation of specialized sequences and DNA secondary structures. On the other hand, the highly conserved nucleotide sequence of the non-coding region may form storage sites for RNA motifs.


Affiliation: Department of Molecular Genetics and Microbiology, School of Medicine, Stony Brook University, Stony Brook, NY, 11794, USA. Nicholas.delihas@stonybrook.edu.

ABSTRACT

Background: DiGeorge Syndrome is a genetic abnormality involving a ~3 Mb deletion in human chromosome 22, in the region termed 22q.11.2. To better understand the non-coding regions of 22q.11.2, a small 10,000 bp non-protein-coding sequence close to the DiGeorge Critical Region 6 gene (DGCR6) was chosen for analysis of its sequence and functional entities, because the homologous sequence in the chimpanzee genome could be aligned and used for comparison.

Methods: The GenBank database provided genomic sequences. In silico computer programs were used to find homologous DNA sequences in human and chimpanzee genomes, generate random sequences, determine DNA sequence alignments, sequence comparisons and nucleotide repeat copies, and predict DNA secondary structures.
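
As an illustration of the "generate random sequences" step, a minimal sketch follows. The abstract does not name the program or the base composition used for the random controls, so the function name `random_dna` and its `at_fraction` and `seed` parameters are assumptions introduced here, not the author's method.

```python
import random


def random_dna(length, at_fraction=0.5, seed=None):
    """Return a random DNA string whose expected A+T fraction is at_fraction.

    Hypothetical helper: the composition of the random control sequences
    is not stated in the abstract, so at_fraction is an assumption.
    """
    if not 0.0 <= at_fraction <= 1.0:
        raise ValueError("at_fraction must be between 0 and 1")
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    # Split the A+T probability evenly between A and T, the rest between G and C.
    gc_fraction = 1.0 - at_fraction
    weights = [at_fraction / 2, at_fraction / 2, gc_fraction / 2, gc_fraction / 2]
    return "".join(rng.choices("ATGC", weights=weights, k=length))


# Example: a 200-base control sequence with a high expected A+T fraction.
control = random_dna(200, at_fraction=0.8, seed=1)
```

A fixed seed makes a control sequence reproducible, which helps when the same random sequence has to be folded or aligned more than once.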

Results: At its 5' half, the 10,000 bp sequence has three distinct sections that represent phylogenetically variable sequences. These Variable Regions contain biased mutations with a very high A + T content, multiple copies of the motif TATAATATA and sequences that fold into long A:T-base-paired stem loops. The 3' half of the 10,000 bp unit, highly conserved between human and chimpanzee, has sequences representing exons of lncRNA genes and segments of introns of protein genes. Central to the 10,000 bp unit are the multiple copies of a sequence that originates from the flanking 5' end of the translocation breakpoint Type A sequence. This breakpoint flanking sequence carries the exon and intron motifs. The breakpoint Type A sequence seems to be a major player in the proliferation of these RNA motifs, as well as the proliferation of Variable Regions in the 10,000 bp segment and other regions within 22q.11.2.
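
The two sequence features named for the Variable Regions, high A + T content and multiple copies of TATAATATA, can be measured with a few lines of code. The sketch below is only illustrative: the function names and the short `example` string are made up for this purpose and are not data from the paper. Overlapping copies are counted, because consecutive copies of TATAATATA can share bases.

```python
def at_content(seq):
    """Return the fraction of bases in seq that are A or T (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def count_motif(seq, motif="TATAATATA"):
    """Count occurrences of motif in seq, allowing overlapping copies."""
    seq, motif = seq.upper(), motif.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        # Restart one base further on so overlapping copies are also found.
        start = seq.find(motif, start + 1)
    return count


# Invented toy sequence (not from the paper) with two overlapping motif copies.
example = "GGTATAATATAATATACC"
motif_copies = count_motif(example)
at_fraction = at_content(example)
```

Note that Python's built-in `str.count` skips overlapping matches, which is why the loop with `str.find` is used instead.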

Conclusions: The data indicate that a non-coding region of the chromosome may be reserved for highly biased mutations that lead to formation of specialized sequences and DNA secondary structures. On the other hand, the highly conserved nucleotide sequence of the non-coding region may form storage sites for RNA motifs.

Fig. 4: Predicted DNA secondary structures generated by mfold (http://mfold.rna.albany.edu/?q=mfold/dna-folding-form). (a) Sequence from human Variable Region #3; (b) Sequence from chimpanzee Variable Region #3

Mentions: Variable regions were also analyzed for DNA secondary structure features. The structure of Chr22 TYPE C Accession:AB538237.2 PATRR (Fig. 3a), which displays a very high frequency of translocation [10], is used here as a model for DNA secondary structure and high translocation frequency. In addition, the predicted structure for a typical sample random sequence is shown (Fig. 3b). All three Variable Regions of the human 10,000 bp segment show at least one long A:T base pair-rich stem loop structure, although the human Variable Region #1 sequence folds into a poorly formed stem loop. We use the sequence from Variable Region #3, which displays the best-formed stem loops, as a model. Human and chimpanzee predicted secondary structures from this region are shown in Fig. 4a and b, respectively. The human structure shows two long stem loops that are fairly well formed; the chimpanzee has one. The lengths of the human stem loops are 120 bp (stem loop 2) and 109 bp (stem loop 1); the chimpanzee stem loop is 106 bp (Table 2). A comparison of the long stem loops between human and chimpanzee shows that the human structures have fewer looped-out regions and smaller protruding “mini-stem loops” (Fig. 4). The entire Variable Region #3 is also more thermodynamically stable in human than in chimpanzee, 257 kcal/mol vs 164 kcal/mol, respectively (Table 3), but judged by the Gibbs free energies of the stem loops alone, the human stem loops are only moderately more stable than the chimpanzee stem loop. Overall, the human stem loop structures are much closer to the model translocation hot spot structure (Fig. 3a) than that of the chimpanzee.
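
The structures discussed here were predicted with mfold, which performs a full thermodynamic fold. As a much cruder illustration of what a stem loop means at the sequence level, the sketch below scans for the longest perfectly base-paired inverted repeat separated by a short loop. It is not mfold and computes no free energies; the function name, the `min_loop` and `max_loop` parameters, and the toy hairpin are assumptions made for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def longest_perfect_stem(seq, min_loop=3, max_loop=10):
    """Find the longest perfectly paired hairpin stem in seq.

    Returns (stem_length, start, end), where start and end are the 0-based
    inclusive positions of the outermost paired bases. Mismatches and bulges
    are not allowed, so this is far stricter than a thermodynamic fold.
    """
    seq = seq.upper()
    n = len(seq)
    best = (0, 0, 0)
    for i in range(n):  # i: innermost base of the left arm
        for loop in range(min_loop, max_loop + 1):
            j = i + loop + 1  # j: innermost base of the right arm
            if j >= n:
                break
            k = 0
            # Extend the stem outward while the two bases are complementary.
            while i - k >= 0 and j + k < n and COMPLEMENT.get(seq[i - k]) == seq[j + k]:
                k += 1
            if k > best[0]:
                best = (k, i - k + 1, j + k - 1)
    return best


# Invented toy hairpin: an 8-base A:T stem closed by a 4-base loop.
hairpin = "GG" + "ATATATAT" + "CCCC" + "ATATATAT" + "GG"
stem_length, stem_start, stem_end = longest_perfect_stem(hairpin)
```

A scan like this only flags candidate inverted repeats; comparing stability between species, as in Table 3, still requires an energy-based folding program.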