Hidden chromosome symmetry: in silico transformation reveals symmetry in 2D DNA walk trajectories of 671 chromosomes.

Poptsova MS, Larionov SA, Ryadchenko EV, Rybalko SD, Zakharov IA, Loskutov A - PLoS ONE (2009)

Bottom Line: This is also true for human coding sequences (CDS), which comprise only several percent of the entire chromosome length. We found that frequency distributions of the length of gene clusters, continuously located on the same strand, have close values for both strands. Contribution of different subsystems to the noted symmetries and distributions, and evolutionary aspects of symmetry are discussed.


Affiliation: University of Connecticut, Storrs, Connecticut, United States of America. maria.poptsova@gmail.com

ABSTRACT
Maps of the 2D DNA walk for 671 examined chromosomes show that compositional complexity changes from a symmetrical half-turn in bacteria to pseudo-random trajectories in archaea, fungi and humans. In silico transformation of gene order and strand position returns most of the analyzed chromosomes to a symmetrical bacterial-like state with one transition point. The transformed chromosomal sequences also reveal remarkable segmental compositional symmetry between regions from different strands located equidistantly from the transition point. Despite extensive chromosome rearrangement, the ratio of gene numbers on opposite strands for chromosomes of different taxa varies within narrow limits around unity, with Pearson coefficient r = 0.98. A similar relation is observed for total gene length (r = 0.86) and for the cumulative GC (r = 0.95) and AT (r = 0.97) skews. This is also true for human coding sequences (CDS), which comprise only several percent of the entire chromosome length. We found that the frequency distributions of the length of gene clusters located continuously on the same strand have close values for both strands. Eukaryotic gene distribution is believed to be non-random. The contribution of different subsystems to the noted symmetries and distributions, and evolutionary aspects of symmetry, are discussed.


pone-0006396-g009: Sequences made up of genes only. Illustration of how genes from different strands are included into the sequence: (a) a piece of DNA with genes and intergenic areas; (b) the sequence made up of genes only.

Mentions: The 2D DNA walk method works as follows. We map the DNA sequence onto a square lattice in the plane with G−C and A−T axes, where the origin (0,0) coincides with the first nucleotide of the sequence. For every consecutive nucleotide we make a step: G, one step up; C, one step down; A, one step to the right; T, one step to the left [1]. The original DNA sequence is thus mapped to a trajectory on the plane with G−C and A−T axes (Figure 8). For mapping and drawing 2D DNA sequences we used homemade scripts run under MATLAB (MathWorks, http://www.mathworks.com). 2D DNA walks of sequences made up of genes only are built from gene sequences concatenated in the same order as they are located on the chromosome, without intergenic areas. Note that when applied to chromosomes, the 2D DNA walk method draws the trajectory for only one strand, usually the “+” strand. Thus, if a gene is located on the “+” strand, the corresponding gene sequence is cut from the “+” strand, from the start position to the end position. For genes located on the “−” strand, the corresponding sequence is also cut from the “+” strand, but now from the end position to the start position, because genes on the “−” strand have the opposite direction (Figure 9). For example, if gene 1, ATGAAATTT…TGA, is located on the “+” strand and gene 2, ATGCCCGGG…TGA, is on the “−” strand (Figure 9a), then the resulting concatenated sequence made up of the two genes is ATGAAATTT…TGATCA…CCCGGGCAT (Figure 9b): gene 2 appears as its reverse complement, TCA…CCCGGGCAT, as read on the “+” strand. The GSS transformation consists of merging the genes from the “+” strand, without changing their order, with the concatenated genes from the “−” strand, excluding intergenic regions [29]. Sequences for genes from different strands are taken as described above (see also Figure 9).
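The procedure described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. All function names are hypothetical, gene coordinates are assumed to be 0-based half-open intervals on the “+” strand, and ambiguous bases such as N are assumed to make no step. The walk is assumed to start at the origin, with the first nucleotide taking the first step. The toy genes drop the elided middle parts of the example sequences. The GSS function further assumes that the “+”-strand block comes first and that the “−”-strand genes keep their chromosomal order, which the text does not state.

```python
# Minimal sketch of the 2D DNA walk and of the genes-only / GSS sequences.
# Names and coordinate conventions are assumptions, not the authors' code.

# Step per nucleotide: x is the A-T axis, y is the G-C axis.
STEPS = {"G": (0, 1), "C": (0, -1), "A": (1, 0), "T": (-1, 0)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def dna_walk_2d(seq):
    """Return the list of (x, y) lattice points visited by the walk."""
    x, y = 0, 0
    path = [(x, y)]  # assumption: the walk starts at the origin
    for base in seq.upper():
        dx, dy = STEPS.get(base, (0, 0))  # assumption: N etc. make no step
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def reverse_complement(seq):
    """Return the opposite-strand reading of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def genes_only_sequence(plus_strand, genes):
    """Concatenate gene segments cut from the "+" strand in chromosomal order.

    genes: iterable of (start, end, strand) tuples; the strand is ignored
    here because every segment is read from the "+" strand.
    """
    return "".join(plus_strand[s:e] for s, e, _strand in sorted(genes))


def gss_sequence(plus_strand, genes):
    """Hedged sketch of GSS: "+"-strand genes, then "-"-strand genes.

    Assumption: the "+" block comes first and both blocks keep their
    chromosomal order; segments are still cut from the "+" strand.
    """
    ordered = sorted(genes)
    plus_part = "".join(plus_strand[s:e] for s, e, st in ordered if st == "+")
    minus_part = "".join(plus_strand[s:e] for s, e, st in ordered if st == "-")
    return plus_part + minus_part


# Toy chromosome: gene 1 on "+", two intergenic bases, gene 2 on "-".
gene1 = "ATGAAATTTTGA"
gene2 = "ATGCCCGGGTGA"
chromosome = gene1 + "GG" + reverse_complement(gene2)
genes = [(0, 12, "+"), (14, 26, "-")]

genes_only = genes_only_sequence(chromosome, genes)
walk = dna_walk_2d(genes_only)
```

Here `genes_only` starts with gene 1 and ends with the reverse complement of gene 2, mirroring the Figure 9 example, and `walk` holds the trajectory points that a plotting routine could draw on the A−T and G−C axes.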

