Comprehensive prediction of chromosome dimer resolution sites in bacterial genomes.

Kono N, Arakawa K, Tomita M - BMC Genomics (2011)

Bottom Line: The dif sequence positions were shown to be strongly correlated with the GC skew shift-point that is induced by replicational mutation/selection pressures, but the difference in the positions of the predicted dif sites and the GC skew shift-points did not correlate with the degree of replicational mutation/selection pressures. The sequence of dif sites is widely conserved among many bacterial phyla, and they can be computationally identified using our method. The lack of correlation between dif position and the degree of GC skew suggests that replication termination does not occur strictly at dif sites.


Affiliation: Systems Biology Program, Graduate School of Media and Governance, Keio University, Endo 5322, Fujisawa, Kanagawa 252-8520, Japan.

ABSTRACT

Background: During the replication process of bacteria with circular chromosomes, an odd number of homologous recombination events results in concatenated dimer chromosomes that cannot be partitioned into daughter cells. However, many bacteria harbor a conserved dimer resolution machinery consisting of one or two tyrosine recombinases, XerC and XerD, and their 28-bp target site, dif.

Results: To study the evolution of the dif/XerCD system and its relationship with replication termination, we report the comprehensive prediction of dif sequences in silico using a phylogenetic prediction approach based on iterated hidden Markov modeling. Using this method, dif sites were identified in 641 organisms among 16 phyla, with a 97.64% identification rate for single-chromosome strains. The dif sequence positions were shown to be strongly correlated with the GC skew shift-point that is induced by replicational mutation/selection pressures, but the difference in the positions of the predicted dif sites and the GC skew shift-points did not correlate with the degree of replicational mutation/selection pressures.

Conclusions: The sequence of dif sites is widely conserved among many bacterial phyla, and they can be computationally identified using our method. The lack of correlation between dif position and the degree of GC skew suggests that replication termination does not occur strictly at dif sites.
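The GC skew shift-point referred to in the Results can be illustrated with a small calculation. The sketch below is not the authors' pipeline: it assumes the common cumulative GC skew approach, in which (G - C)/(G + C) is summed over fixed windows and the extremes of the running sum mark the strand-bias shift-points. The function names, the window size, and the toy sequence are all hypothetical.

```python
def cumulative_gc_skew(seq, window=1000):
    """Running sum of per-window GC skew, (G - C) / (G + C)."""
    seq = seq.upper()
    total = 0.0
    cumulative = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        cumulative.append(total)
    return cumulative


def skew_shift_points(seq, window=1000):
    """Return (position of cumulative minimum, position of cumulative maximum).

    Positions are the end coordinates of the windows where the running
    sum reaches its extremes. Which extreme lies near the terminus
    depends on the strand bias of the organism (assumption: not fixed here).
    """
    cum = cumulative_gc_skew(seq, window)
    i_min = min(range(len(cum)), key=cum.__getitem__)
    i_max = max(range(len(cum)), key=cum.__getitem__)
    end = lambda i: min((i + 1) * window, len(seq))
    return end(i_min), end(i_max)


def circular_distance(a, b, genome_length):
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(a - b) % genome_length
    return min(d, genome_length - d)


# Toy sequence with two artificial skew switches (not real genome data).
toy = "C" * 2000 + "G" * 4000 + "C" * 2000
shift_min, shift_max = skew_shift_points(toy, window=1000)
```

A predicted dif position could then be compared with one of the shift-points using circular_distance, which is the kind of positional difference the Results describe.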


© Copyright Policy - open-access

Figure 4: Prediction strategy. A. Example of the iterated HMM in Proteobacteria. The first seed profile hidden Markov model is created from the seed dif sequence of Escherichia coli by searching, with fuzzy matching, for dif sequences in 28 genomes belonging to the genus Escherichia. Based on this initial profile, dif sequences are predicted in the genomes of the genus closest to Escherichia according to XerCD amino acid sequences (in this case, Shigella). A new profile is then created from the previous profile and the newly predicted dif sequences, and it is used for prediction in the second closest genus (in this case, Salmonella). Profile creation and dif sequence prediction are repeated iteratively in this way, in decreasing order of XerCD similarity to the Escherichia sequence, and the iterated HMM is conducted separately for each phylum. B. Flow chart of the overall strategy.
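The iterative loop in panel A can be sketched in a few lines. This is only an illustration under stated assumptions: a log-odds position weight matrix stands in for the profile hidden Markov model, the best-scoring window is accepted without any score cutoff, and the 0.3 XerCD distance limit is applied as a simple stop condition. The function names, the 6-bp motif, and the toy genomes are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def build_profile(sites, pseudocount=1.0):
    """Log-odds position weight matrix from equal-length sites
    (a simplified stand-in for a profile HMM)."""
    profile = []
    total = len(sites) + pseudocount * len(ALPHABET)
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        profile.append({
            base: math.log(((counts[base] + pseudocount) / total) / 0.25)
            for base in ALPHABET
        })
    return profile


def best_hit(profile, genome):
    """Return (start, score, site) of the best-scoring window."""
    width = len(profile)
    best = None
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        # Unknown bases (e.g. N) contribute a neutral score of 0.
        score = sum(col.get(base, 0.0) for col, base in zip(profile, window))
        if best is None or score > best[1]:
            best = (start, score, window)
    return best


def iterate_profiles(seed_sites, genera, max_distance=0.3):
    """Visit genera in order of increasing XerCD distance from the seed,
    predicting one site per genus and folding it into the next profile.

    genera: list of (name, xercd_distance, genome_sequence) tuples.
    """
    sites = list(seed_sites)
    predictions = {}
    for name, distance, genome in sorted(genera, key=lambda g: g[1]):
        if distance > max_distance:
            break  # too divergent for the profile chain to reach
        start, _score, site = best_hit(build_profile(sites), genome)
        predictions[name] = (start, site)
        sites.append(site)  # the new prediction refines the next profile
    return predictions


# Toy data: a 6-bp motif instead of the 28-bp dif site.
toy_genera = [
    ("far", 0.20, "GGGGACGTTCGGGG"),
    ("near", 0.05, "TTTTACGTACTTTT"),
    ("too_far", 0.35, "ACGTAC"),
]
predictions = iterate_profiles(["ACGTAC"], toy_genera)
```

In the toy run, the genus at distance 0.35 is never searched, which mirrors how a lineage whose XerCD lies beyond the threshold falls outside the reach of the profile chain.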

Mentions: Predictions failed in all species belonging to the phylum Cyanobacteria. Although XerCD is present in these species, the sequence distance between XerCD of Cyanobacteria and XerCD of other phyla was large (average of 0.358 ± 0.0159, N = 540), with a minimum distance of 0.322 to Actinosynnema mirum (Actinobacteria), which exceeds the 0.3 threshold shown in Figure 4. This divergence of cyanobacterial XerCD from that of other phyla implies low applicability of the iterated HMM approach, which relies on the phylogenetic conservation pattern of XerCD. One possible explanation for the prediction failure in this phylum is that the dif sequences and XerCD are highly divergent in Cyanobacteria, preventing their identification with sequence profiles. The replication origin in Cyanobacteria is yet to be identified, and GC skew is weak in these species, implying a low degree of replicational mutation/selection pressures, which could also be a reason for the failure of prediction in these species.
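The applicability argument above reduces to a single comparison, shown here as a minimal sketch. The function name is hypothetical, and treating the 0.3 threshold as inclusive is an assumption; the two numbers are the ones quoted in the text.

```python
def profile_chain_can_reach(min_xercd_distance, threshold=0.3):
    """True if the closest XerCD from an already-covered phylum lies
    within the distance threshold (inclusive bound is an assumption)."""
    return min_xercd_distance <= threshold


# Minimum XerCD distance from Cyanobacteria to another phylum
# (to Actinosynnema mirum), as reported in the text.
cyanobacteria_min_distance = 0.322
reachable = profile_chain_can_reach(cyanobacteria_min_distance)  # False
```

Because even the closest XerCD is beyond the threshold, no profile built in another phylum can be carried over, which is consistent with the reported failure for every cyanobacterial species.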

