Sequence characteristics of T4-like bacteriophage IME08 genome termini revealed by high throughput sequencing.

Jiang X, Jiang H, Li C, Wang S, Mi Z, An X, Chen J, Tong Y - Virol. J. (2011)

Bottom Line: The literature indicates that T4-like phage genomes have permuted terminal sequences and that these termini are generated by a DNA terminase in a sequence-independent manner. Genomic DNA of T4-like bacteriophage IME08 was subjected to high throughput sequencing, and the read sequences with extraordinarily high occurrences were analyzed. We demonstrate that both the 5' and 3' termini of the IME08 genome start with base G or A. The presence of a consensus sequence TTGGA/G around the breakpoint of the high frequency read sequences suggests that the terminase cuts the branched pre-genome in a sequence-preferred manner. Our analysis also shows that terminal cleavage is asymmetric, with one end cut at a consensus sequence and the other end generated randomly.


Affiliation: Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China.

ABSTRACT

Background: T4 phage is a model species that has contributed broadly to our understanding of molecular biology. T4 DNA replication and packaging share various mechanisms with human double-stranded DNA viruses such as herpes virus. The literature indicates that T4-like phage genomes have permuted terminal sequences and that these termini are generated by a DNA terminase in a sequence-independent manner.

Methods: Genomic DNA of T4-like bacteriophage IME08 was subjected to high throughput sequencing, and the read sequences with extraordinarily high occurrences were analyzed.

Results: We demonstrate that both the 5' and 3' termini of the IME08 genome start with base G or A. The presence of a consensus sequence TTGGA/G around the breakpoint of the high frequency read sequences suggests that the terminase cuts the branched pre-genome in a sequence-preferred manner. Our analysis also shows that terminal cleavage is asymmetric, with one end cut at a consensus sequence and the other end generated randomly. The sequence-preferred cleavage may produce sticky ends, with each end packaged at a different efficiency.

Conclusions: This study illustrates how high throughput sequencing can be used to probe replication and packaging mechanisms in bacteriophages and other viruses.
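The Methods step, picking out read sequences with unusually high occurrences, can be illustrated with a minimal Python sketch. The function names, the threshold, and the toy reads are all assumptions made for illustration; the abstract does not describe the authors' actual pipeline or cut-off.

```python
from collections import Counter


def count_read_occurrences(reads):
    """Count how many times each distinct read sequence appears."""
    return Counter(read.upper() for read in reads)


def high_frequency_sequences(counts, min_occurrences):
    """Return (sequence, count) pairs at or above the threshold.

    The threshold is a free parameter here (an assumption); results are
    sorted by descending count, then alphabetically by sequence.
    """
    hits = [(seq, n) for seq, n in counts.items() if n >= min_occurrences]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Toy reads invented for illustration only (not IME08 data).
reads = ["GATTC"] * 4 + ["AGGTC"] * 3 + ["TTACG", "CCGTA"]
counts = count_read_occurrences(reads)
print(high_frequency_sequences(counts, min_occurrences=3))
```

In a real data set the reads would come from the sequencer output, and the threshold would be chosen relative to the background occurrence expected from random fragmentation.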



Figure 3: Percentage of first bases in read sequences. In read sequences with 5-15 occurrences, the first base percentages are comparable with the base composition of the genome. As read sequence occurrence increases, the combined percentage of A and G rises, reaching 100% of the first bases in sequences that occur more than 100 times. G is the only first base in sequences that occur more than 160 times.

Mentions: When the first bases of the read sequences of all occurrences were plotted (Figure 3), a striking characteristic was revealed. The first bases of high frequency sequences (HFSs) were dominated by A and G, and as sequence occurrence increased, G exceeded A as the major start base. G was the sole first base for all sequences that occurred more than 160 times (Figure 3). The first base plot also revealed that for sequences that occurred more than 30 times, A and G comprised more than 80% of the first bases, with T or C making up less than 10% (the percentage of C is lower than that of T because the GC content of the IME08 phage genome is only 39%). For sequences that occurred more than 100 times, T and C never appear as the first base. This distinct difference in first bases between high frequency and normal frequency sequences again suggests that HFSs differ from the normal frequency sequences, which are presumed to be generated randomly during library construction, and that these HFSs may represent the termini of the original genomic DNA. According to this hypothesis, G should account for the majority of the genome terminal bases, since it is the only 5' terminal base in the HFSs with very high frequencies.
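The first-base comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the occurrence ranges passed to it, and the toy counts are hypothetical, and each distinct sequence is counted once per range, which the text does not specify.

```python
from collections import Counter


def first_base_percentages(counts, min_occurrences, max_occurrences=None):
    """Percentage of each first base among distinct sequences whose
    occurrence count lies in [min_occurrences, max_occurrences].

    Assumption: every distinct sequence contributes once, regardless of
    how often it occurs. Sequences not starting with A, C, G or T are skipped.
    """
    tally = Counter()
    for seq, n in counts.items():
        if not seq or seq[0] not in "ACGT":
            continue
        if n < min_occurrences:
            continue
        if max_occurrences is not None and n > max_occurrences:
            continue
        tally[seq[0]] += 1
    total = sum(tally.values())
    if total == 0:
        return {}
    return {base: 100.0 * tally[base] / total for base in "ACGT"}


# Toy occurrence counts invented for illustration only (not IME08 data).
toy_counts = {
    "GATTC": 170, "AGGTC": 120, "GCCTA": 110,
    "TTACG": 8, "CCGTA": 6, "ATTGC": 7, "GTTAA": 5,
}
print(first_base_percentages(toy_counts, 5, 15))   # low-occurrence range
print(first_base_percentages(toy_counts, 101))     # more than 100 occurrences
```

Plotting these percentages across successive occurrence ranges would give a chart of the same kind as Figure 3; weighting each sequence by its occurrence count instead is an equally plausible reading and would only change the tally line.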

