OASIS: an automated program for global investigation of bacterial and archaeal insertion sequences.

Robinson DG, Lee MC, Marx CJ - Nucleic Acids Res. (2012)

Bottom Line: At a broad scale, we found that most IS families are quite widespread; however, they are not present randomly across taxa. The number of ISs increases with genome length, but there is both tremendous variation and no increase in IS density for genomes >2 Mb. Surprisingly, even after controlling for 16S rRNA sequence divergence, the same ISs were more likely to be shared between genomes labeled as the same species rather than as different species.


Affiliation: Department of Organismic and Evolutionary Biology and Faculty of Arts and Sciences Center for Systems Biology, Harvard University, Cambridge, MA 02138, USA.

ABSTRACT
Insertion sequences (ISs) are simple transposable elements present in most bacterial and archaeal genomes and play an important role in genomic evolution. The recent expansion of sequenced genomes offers the opportunity to study ISs comprehensively, but this requires efficient and accurate tools for IS annotation. We have developed an open-source program called OASIS, or Optimized Annotation System for Insertion Sequences, which automatically annotates ISs within sequenced genomes. OASIS annotations of 1737 bacterial and archaeal genomes offered an unprecedented opportunity to examine IS evolution. At a broad scale, we found that most IS families are quite widespread; however, they are not present randomly across taxa. This may indicate differential loss, barriers to exchange and/or insufficient time to equilibrate across clades. The number of ISs increases with genome length, but there is both tremendous variation and no increase in IS density for genomes >2 Mb. At the finer scale of recently diverged genomes, the proportion of shared IS content falls sharply, suggesting loss and/or emergence of barriers to successful cross-infection occurs rapidly. Surprisingly, even after controlling for 16S rRNA sequence divergence, the same ISs were more likely to be shared between genomes labeled as the same species rather than as different species.


Figure 5: A plot of the probability of two genomes sharing an IS copy compared to their 16S distance, for (A) intraspecies and (B) interspecies pairs. An average using a Nadaraya–Watson kernel smoother with a bandwidth of 1% is shown in red. Note that at all 16S sequence distances, the intraspecies value remains at least 2-fold higher than the interspecies value.
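
For readers unfamiliar with the smoother named in the caption, the following is a minimal sketch of a Nadaraya–Watson estimate: a kernel-weighted local average of the observed sharing values. The kernel is not stated in the text, so a Gaussian kernel is assumed here; the function name, the toy data and the choice to express the bandwidth as 1.0 on a percent scale are all illustrative assumptions, not the authors' implementation.

```python
import math


def nadaraya_watson(x_train, y_train, x_query, bandwidth=1.0):
    """Kernel-weighted average of y_train at each query point.

    Assumes a Gaussian kernel; bandwidth is in the same units as x
    (here, percent 16S distance, so 1.0 stands for a 1% bandwidth).
    """
    smoothed = []
    for xq in x_query:
        # Weight each observation by its kernel distance to the query point.
        weights = [math.exp(-0.5 * ((xq - xi) / bandwidth) ** 2) for xi in x_train]
        total = sum(weights)
        if total == 0.0:
            # Query point is too far from every observation to estimate.
            smoothed.append(float("nan"))
        else:
            smoothed.append(sum(w * y for w, y in zip(weights, y_train)) / total)
    return smoothed


# Hypothetical genome pairs: 16S distance (%) and fraction of ISs shared.
distances = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
shared = [0.40, 0.30, 0.20, 0.10, 0.05, 0.01]

curve = nadaraya_watson(distances, shared, [0.0, 1.0, 5.0])
```

Because each smoothed value is a weighted mean of the observations, it always lies between the smallest and largest observed value, which makes the estimator suitable for summarizing a probability against a continuous distance.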

Mentions: The large number of taxa in our OASIS+ data set that represent closely related genomes (strains within the same species or genus) allowed us to address the extent to which IS turnover is seen at a finer scale. Previous studies have determined that IS counts can vary widely even between closely related genomes (26,33,34). Most of these studies concluded that ISs are short lived in natural lineages and that rapid horizontal gene transfer (HGT) is required for them to persist. To investigate the short-term proliferation and survival of ISs, we compared the fraction of ISs shared in each pair of closely related genomes with their 16S divergence. The results show that the probability of an IS being shared between genomes decreases dramatically with increasing 16S distance: an IS has a 35.1% chance of being shared between two genomes that have identical 16S sequences, while it has only a 0.13% chance of being shared between two genomes with 9–10% 16S sequence divergence. Also notably, the probability of an IS being shared between two genomes in the same species is 24.7%, while an IS has only a 2.2% chance of being shared between two genomes of different species within 10% 16S divergence. Figure 5 compares the divergence between each genomic pair to the percentage of ISs shared between them, for both intraspecies and interspecies pairs, and shows that the probability of sharing decreases very quickly with increasing divergence. This rapid decrease in shared IS content during the early divergence of genomes could be due to a decline in the probability of vertical inheritance or in the chance of being acquired horizontally. Figure 5 also shows that, for a given degree of 16S sequence divergence, pairs of taxa that are defined as the same species have more similar IS content than pairs defined as different species.
As an example, the probability of an IS being shared between two Escherichia genomes is 17.4% and the probability of being shared between two Shigella genomes is 41.1%, while the probability of being shared between an Escherichia genome and a Shigella genome is only 4.7%. Shigella is a polyphyletic genus completely within the Escherichia genus (35) (the average 16S divergence between the two genera is 1.6%), so the difference in IS content might reflect biological differences rather than just the time since divergence.
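
The within-group versus between-group comparison described above can be illustrated with a small sketch. The text does not give the exact definition of "shared" used in the study, so this example assumes a simple one: for a pair of genomes, the fraction of distinct ISs found in either genome that occur in both (a Jaccard index). The function names, genome identifiers and IS labels are hypothetical, and the sketch does not control for 16S divergence.

```python
from itertools import combinations


def shared_fraction(is_a, is_b):
    """Fraction of distinct ISs in the pair that occur in both genomes.

    This Jaccard-style definition is an assumption for illustration only.
    """
    union = is_a | is_b
    return len(is_a & is_b) / len(union) if union else 0.0


def mean_sharing(genomes, labels):
    """Average shared fraction for same-label and different-label pairs."""
    sums = {"same": [0.0, 0], "different": [0.0, 0]}
    for a, b in combinations(genomes, 2):
        key = "same" if labels[a] == labels[b] else "different"
        sums[key][0] += shared_fraction(genomes[a], genomes[b])
        sums[key][1] += 1
    # Report NaN for a category that has no pairs.
    return {k: (s / n if n else float("nan")) for k, (s, n) in sums.items()}


# Hypothetical toy data: IS content per genome and a taxon label per genome.
genomes = {
    "g1": {"IS1", "IS2", "IS3"},
    "g2": {"IS1", "IS2"},
    "g3": {"IS3", "IS4"},
    "g4": {"IS4"},
}
labels = {"g1": "A", "g2": "A", "g3": "B", "g4": "B"}

result = mean_sharing(genomes, labels)
```

In this toy data set the same-label pairs share a much larger fraction of their ISs than the different-label pairs, which mirrors the direction of the Escherichia and Shigella comparison without reproducing its values.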

