Early divergent strains of Yersinia pestis in Eurasia 5,000 years ago.

Rasmussen S, Allentoft ME, Nielsen K, Orlando L, Sikora M, Sjögren KG, Pedersen AG, Schubert M, Van Dam A, Kapel CM, Nielsen HB, Brunak S, Avetisyan P, Epimakhov A, Khalyapin MV, Gnuni A, Kriiska A, Lasak I, Metspalu M, Moiseyev V, Gromov A, Pokutta D, Saag L, Varul L, Yepiskoposyan L, Sicheritz-Pontén T, Foley RA, Lahr MM, Nielsen R, Kristiansen K, Willerslev E - Cell (2015)

Bottom Line: How and when it originated remains contentious. We also identify a temporal sequence of genetic changes that lead to increased virulence and the emergence of the bubonic plague. Our results show that plague infection was endemic in the human populations of Eurasia at least 3,000 years before any historical recordings of pandemics.


Affiliation: Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kemitorvet, Building 208, 2800 Kongens Lyngby, Denmark.


Figure S2: Mapping Affinity, Related to Figure 3

(A) Distribution of edit distance of high quality reads of known origin and the eight Yersinia associated samples. The investigated, known reads are from Y. pestis 620024 (0.PE7), Y. pestis D1982001 (1.IN2), Y. pseudotuberculosis IP32464 (from the clade closest to Y. pestis), and Y. similis (which is an outgroup to both Y. pestis and Y. pseudotuberculosis). For RISE00, RISE139, RISE386, RISE397, RISE505, RISE509 and RISE511 the reads are closer to Y. pestis than to Y. pseudotuberculosis, and there are far more hits at low edit distances (RISE505 and RISE509 are shown in Figure 3). This is consistent with these reads originating from Y. pestis. Reads from the RISE392 sample instead have more hits at higher edit distances and have similar distances to both the Y. pestis and Y. pseudotuberculosis reference genomes. This suggests that RISE392 is neither Y. pestis nor Y. pseudotuberculosis, but a more distantly related species.

(B) Distribution of the number of reads mapping to the Y. pestis reference genome at different edit distances. For each of the three investigated species (Y. pestis n = 10, Y. pseudotuberculosis n = 10, and Y. similis n = 5) several different sets of reads were mapped against the reference, and the number of reads matching at each edit distance was counted. For each edit distance the distribution of read counts for each species is shown as a boxplot. The lower and upper hinges of the boxes correspond to the 25th and 75th percentiles, the whiskers extend up to 1.5 times the inter-quartile range (IQR) from the hinges, and the dots represent outliers beyond the whiskers.

(C) Ratio between the number of reads mapping to Y. pestis and the number of reads mapping to Y. pseudotuberculosis at different edit distances, for the three investigated species. Input data as in B. For each sample the ratio between the number of reads matching Y. pestis and the number of reads matching Y. pseudotuberculosis was calculated, and the distribution of these ratios is shown as a boxplot for each edit distance. These features were used to predict the taxonomy of unknown samples. Boxplot hinges, whiskers and outliers are defined as in B.

(D) Depth of coverage plots for the seven ancient Y. pestis samples mapped to the CO92 chromosome, pCD1, pMT1 and pPCP1. The RISE samples are ordered according to age, with the oldest sample as the outermost histogram. Rings, starting from the outermost: mappability (gray), genes (RNA: black, transposon: purple, positive strand: blue, negative strand: red), then RISE509, RISE511, RISE00, RISE386, RISE139, RISE505 and RISE397 (blue). Depth histograms show sequence depth in 1 kb windows for the chromosome and 100 bp windows for the plasmids, with a maximum of 5X depth for each ring. The plots were generated using Circos.
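The quantities plotted in panels B and C can be illustrated with a short sketch. It is not the authors' code: the function names, the input layout (a mapping from edit distance to read count) and the quartile convention are assumptions made for illustration, and the numbers in the usage lines are invented.

```python
from statistics import quantiles


def edit_distance_ratios(pestis_counts, pseudo_counts):
    """Ratio of reads mapped to Y. pestis vs Y. pseudotuberculosis per edit distance.

    Both arguments are hypothetical dicts {edit_distance: read_count}.
    Edit distances with no Y. pseudotuberculosis reads are skipped
    to avoid dividing by zero.
    """
    ratios = {}
    for dist, n_pestis in pestis_counts.items():
        n_pseudo = pseudo_counts.get(dist, 0)
        if n_pseudo > 0:
            ratios[dist] = n_pestis / n_pseudo
    return ratios


def boxplot_summary(values):
    """Hinges, whiskers and outliers as described in the caption.

    Hinges are the 25th and 75th percentiles; whiskers reach the most
    extreme values within 1.5 * IQR of the hinges; anything beyond is
    reported as an outlier. The "inclusive" quartile method is an
    assumption, since plotting tools differ slightly here.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Invented read counts for one sample, keyed by edit distance.
ratios = edit_distance_ratios({0: 900, 1: 300, 2: 50}, {0: 300, 1: 300, 2: 100})
# Invented ratios across several samples at a single edit distance.
summary = boxplot_summary([1, 2, 3, 4, 100])
```

A ratio above 1 at edit distance 0 means more perfect matches to Y. pestis than to Y. pseudotuberculosis, which is the pattern the caption reports for the seven ancient samples.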
© Copyright Policy - CC BY


Mentions: Besides applying standard precautions for working with ancient DNA (Willerslev and Cooper, 2005), the authenticity of our findings is supported by the following observations:

(1) The Y. pestis sequences were identified in significant amounts in shotgun data from eight of 101 samples, showing that this finding is not due to a ubiquitous contaminant in our lab or in the reagents. Indeed, further analysis showed that one of these eight was most likely not Y. pestis. We also sequenced all negative DNA extraction controls and found no signs of Y. pestis DNA in these (Table S3).

(2) Consistent with an ancient origin, the Y. pestis reads were highly fragmented, with average read lengths of 43–65 bp (Table S3), and displayed clear signs of C-to-T deamination damage at the 5′ termini typical of ancient DNA (Figure 3, Figure S1). Because the plasmids are central for discriminating between Y. pestis and Y. pseudotuberculosis, we tested separately for DNA damage patterns for the chromosome and for each of the plasmids. For the seven samples, we observe similar patterns of DNA damage for chromosome and plasmid sequences (Figure 3, Figure S1).

(3) We observe correlated DNA degradation patterns when comparing the Y. pestis sequences with the human sequences from the host individual. Given that DNA decay can be described as a rate process (Allentoft et al., 2012), this suggests that the DNA molecules of the pathogen and the human host have a similar age (Figure 3, Figure S1, Table S3 and Supplemental Experimental Procedures).

(4) Because of the high sequence similarity between Y. pestis and Y. pseudotuberculosis, we mapped all reads both to the Y. pestis CO92 and to the Y. pseudotuberculosis IP32953 reference genomes (Chain et al., 2004). Consistent with being Y. pestis, the seven investigated samples displayed more reads matching perfectly (edit distance = 0) to Y. pestis (Figure 3, Figure S2). One sample (RISE392) was most likely not Y. pestis based on this criterion.

(5) A naive Bayesian classifier trained on known genomes predicts the seven samples to be Y. pestis with 100% posterior probability, while RISE392 is predicted to have 0% probability of being Y. pestis (Figure S2, Table S3).

(6) If the DNA were from organisms other than Y. pestis, we would expect the reads to be more frequently associated with either highly conserved or low-complexity regions. However, we find the reads to be distributed across the entire genome (Figure S2), and the actual coverage agrees with the coverage expected from the read length distributions and the mappability of the reference sequences for the seven samples (Figure 3).

(7) In a maximum likelihood phylogeny, the recovered Y. pestis genomic sequences of RISE505 and RISE509 are clearly within the Y. pestis clade and basal to all contemporary Y. pestis strains (Figure 4) (see below).
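To make observation (5) concrete, the sketch below shows how a naive Bayesian classifier can turn mapping features into a posterior probability per species. It is a generic Gaussian naive Bayes written for illustration, not the classifier used in the study: the two features (fraction of reads at edit distance 0, and the Y. pestis to Y. pseudotuberculosis read ratio at edit distance 0), the training values and all names are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(samples_by_class):
    """Estimate a class prior and per-feature (mean, variance) for each label.

    samples_by_class is a hypothetical dict {label: [feature_vector, ...]}.
    """
    total = sum(len(rows) for rows in samples_by_class.values())
    model = {}
    for label, rows in samples_by_class.items():
        columns = list(zip(*rows))
        # A small variance floor avoids division by zero for constant features.
        stats = [(mean(col), pvariance(col) + 1e-6) for col in columns]
        model[label] = (len(rows) / total, stats)
    return model


def posterior(model, features):
    """Posterior probability of each label, treating features as independent."""
    log_scores = {}
    for label, (prior, stats) in model.items():
        log_p = math.log(prior)
        for value, (mu, var) in zip(features, stats):
            log_p += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        log_scores[label] = log_p
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    weights = {label: math.exp(score - top) for label, score in log_scores.items()}
    norm = sum(weights.values())
    return {label: weight / norm for label, weight in weights.items()}


# Invented training data: (fraction of reads at edit distance 0,
# pestis/pseudotuberculosis read ratio at edit distance 0).
toy_training = {
    "Y. pestis": [(0.80, 3.0), (0.78, 2.8), (0.82, 3.2)],
    "Y. pseudotuberculosis": [(0.40, 0.5), (0.42, 0.6), (0.38, 0.4)],
    "Y. similis": [(0.10, 1.0), (0.12, 1.1), (0.08, 0.9)],
}
model = fit_gaussian_nb(toy_training)
probabilities = posterior(model, (0.79, 2.9))
predicted = max(probabilities, key=probabilities.get)
```

With well separated training classes like these, the posterior for an unknown sample concentrates almost entirely on one label, which is how a classifier of this kind can report probabilities close to 100% or 0%.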

