Comparative genomics of Bacillus thuringiensis phage 0305phi8-36: defining patterns of descent in a novel ancient phage lineage.

Hardies SC, Thomas JA, Serwer P - Virol. J. (2007)

Bottom Line: Other segments were best described as multigene units engaged in modular horizontal exchange. Genomic organization at a level higher than individual gene sequence comparison can be analyzed to aid in understanding large phage genomes. Methods of analysis include 1) applying a time scale, 2) augmenting blast scores with positional information, 3) categorizing genomic rearrangements into one of several processes with characteristic rates and outcomes, and 4) correlating apparent transcript sizes with genomic position, gene content, and promoter motifs.


Affiliation: Department of Biochemistry, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, Texas 78229-3900, USA. hardies@uthscsa.edu

ABSTRACT

Background: The recently sequenced 218 kb genome of morphologically atypical Bacillus thuringiensis phage 0305phi8-36 exhibited only limited detectable homology to known bacteriophages. The only known relative of this phage is a string of phage-like genes called BtI1 in the chromosome of B. thuringiensis israelensis. The high degree of divergence and novelty of phage genomes poses challenges in how to describe the phage from its genomic sequences.

Results: Phage 0305phi8-36 and BtI1 are estimated to have diverged 2.0 - 2.5 billion years ago. Positionally biased Blast searches aligned 30 homologous structure or morphogenesis genes between 0305phi8-36 and BtI1 that have maintained the same gene order. Functional clustering of the genes helped identify additional gene functions. A conserved long tape measure gene indicates that a long tail is an evolutionarily stable property of this phage lineage. An unusual form of the tail chaperonin system, split into two genes, was characterized, as was a hyperplastic homologue of the T4 gp27 hub gene. Within this region some segments were best described as encoding a conservative array of structure domains fused with a variable component of exchangeable domains. Other segments were best described as multigene units engaged in modular horizontal exchange. The non-structure genes of 0305phi8-36 appear to include the remnants of two replicative systems, leading to the hypothesis that the genome plan was created by fusion of two ancestral viruses. The case for a member of the RNAi RNA-directed RNA polymerase family residing in 0305phi8-36 was strengthened by extending the hidden Markov model of this family. Finally, it was noted that prospective transcriptional promoters were distributed in a gradient of small to large transcripts starting from a fixed end of the genome.

Conclusion: Genomic organization at a level higher than individual gene sequence comparison can be analyzed to aid in understanding large phage genomes. Methods of analysis include 1) applying a time scale, 2) augmenting blast scores with positional information, 3) categorizing genomic rearrangements into one of several processes with characteristic rates and outcomes, and 4) correlating apparent transcript sizes with genomic position, gene content, and promoter motifs.
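
As an illustration of method 2, one way to combine positional information with Blast scores is to keep only the hits that preserve gene order in both genomes. The following is a minimal sketch and not the authors' procedure: it assumes each hit is a hypothetical tuple of (query gene index, subject gene index, bit score), and it finds the highest-scoring collinear chain by dynamic programming. The function name, the data layout, and the example values are all illustrative assumptions.

```python
from typing import List, Tuple

# Hypothetical layout: (query gene index, subject gene index, bit score)
Hit = Tuple[int, int, float]


def best_collinear_chain(hits: List[Hit]) -> List[Hit]:
    """Return the highest-scoring subset of hits whose gene order is
    strictly increasing in both genomes (weighted increasing subsequence)."""
    if not hits:
        return []
    ordered = sorted(hits)            # sort by query index, then subject index
    best = [h[2] for h in ordered]    # best chain score ending at each hit
    prev = [-1] * len(ordered)        # back-pointers for chain reconstruction
    for i, (qi, si, score) in enumerate(ordered):
        for j in range(i):
            qj, sj, _ = ordered[j]
            # Hit j may precede hit i only if it is upstream in both genomes.
            if qj < qi and sj < si and best[j] + score > best[i]:
                best[i] = best[j] + score
                prev[i] = j
    end = max(range(len(ordered)), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(ordered[end])
        end = prev[end]
    return chain[::-1]


# Made-up example: four modest hits in conserved order plus one
# stronger hit that is out of order and therefore excluded.
example = [(1, 1, 30.0), (2, 2, 25.0), (3, 9, 40.0), (4, 3, 28.0), (5, 4, 27.0)]
chain = best_collinear_chain(example)
print(chain)
```

In this toy input the individually strongest hit is dropped because it breaks gene order, while four weaker hits support each other through their shared positions. That is the general idea behind letting position reinforce marginal scores.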

Figure 4: Homology among the T4 gp27 hub family, the P2 gpD family, and the 0305φ8-36 gp147 family. Domains 1 and 3 refer to folding domains described for the T4 gp27 hub [26]. Sequences within each family were aligned by SAM, and converted to logos as indicated in Methods. The logo segments shown are aligned with each other as found by HHSearch [25] without assistance from secondary structure. Secondary structure was annotated subsequent to the alignment to act as a second opinion on its quality. Red and blue bars below the T4 logos represent α helices and β strands from the crystal structure. Red and blue bars below the other logos represent secondary structure predictions.

Mentions: Gp147 from 0305φ8-36 was functionally assigned as a homologue of T4 hub protein gp27 through the use of hidden Markov models (HMMs) of myoviral protein families starting with the virion proteins of bacteriophage P2 [1]. The HMM developed from P2 gpD was able to identify over 1200 homologues in phage and bacterial genomes, including one gene in nearly all known myoviral genomes and including T4 gp27 and its known homologues from T4-like phages. The HMM comparison program, HHSearch [25], found the T4 gp27 3D structure [26] within the HHpred pdb HMM library [27] using the P2 gpD HMM as the search key with E = 1 × 10⁻¹⁴, allowing a functional assignment to all members of the family. Gp147 from 0305φ8-36 was among the most divergent family members, matching in only folding domains 1 and 3 of the four-domain structure (Figure 4). The match in domain 3 was strong enough to allow SAM to pick orf147 out of the 0305φ8-36 genome with E = 6.5 × 10⁻⁸. An HMM was composed from 0305φ8-36 gp147 and its BtI1 homologue and embedded in the HHpred HHM library. HHSearch picked out the gp147 model on the strength of the domain 3 match at E = 0.11. The domain 1 match was subsequently found by an HHM versus single HHM HHSearch comparison at E = 0.015. Gp147 has sufficient sequence length to form domains 2 and 4, but the sequence is more divergent in these regions in all comparisons, and these domains are not recognizable between 0305φ8-36 gp147 and its BtI1 homologue. Structurally, the two recognizable domains form a ring proximal to the end of the tail tube, whereas the two unrecognizable domains project towards the lysozyme chamber of the hub [26].
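
The E-values quoted above span many orders of magnitude. As a rough reading aid, and assuming the usual Poisson model for chance hits (an assumption about the statistics, not something stated in the text), an E-value E corresponds to a probability of 1 - exp(-E) of seeing at least one chance match that good. The sketch below applies that conversion to the four reported values. The dictionary labels are informal descriptions, and the function name is hypothetical.

```python
import math

# E-values as reported in the text; labels are informal descriptions.
reported_e_values = {
    "P2 gpD HMM vs T4 gp27 structure (HHSearch)": 1e-14,
    "domain 3 match finding orf147 (SAM)": 6.5e-8,
    "gp147 model, domain 3 (HHSearch)": 0.11,
    "gp147 model, domain 1 (HHSearch)": 0.015,
}


def chance_hit_probability(e_value: float) -> float:
    """P(at least one chance hit) = 1 - exp(-E), assuming a Poisson model."""
    # -expm1(-E) equals 1 - exp(-E) but keeps precision for very small E.
    return -math.expm1(-e_value)


for label, e in reported_e_values.items():
    print(f"{label}: E = {e:g}, P = {chance_hit_probability(e):.3g}")
```

For very small E the probability is essentially equal to E, whereas for the two HHSearch matches to the gp147 model the conversion gives roughly a 10% and a 1.5% chance of a spurious hit under this model. On that reading, those two matches are marginal taken individually.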

