The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin.

Thyagarajan B, Bloom JD - Elife (2014)

Bottom Line: We used deep mutational scanning to examine the extent to which a high inherent mutational tolerance contributes to this antigenic evolvability. These data enable us to infer the preference for each amino acid at each site in hemagglutinin. These inferences are consistent with existing knowledge about the protein's structure and function, and can be used to create a model that describes hemagglutinin's evolution far better than existing phylogenetic models.

Affiliation: Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, United States.

ABSTRACT
Influenza is notable for its evolutionary capacity to escape immunity targeting the viral hemagglutinin. We used deep mutational scanning to examine the extent to which a high inherent mutational tolerance contributes to this antigenic evolvability. We created mutant viruses that incorporate most of the ≈10^4 amino-acid mutations to hemagglutinin from A/WSN/1933 (H1N1) influenza. After passaging these viruses in tissue culture to select for functional variants, we used deep sequencing to quantify mutation frequencies before and after selection. These data enable us to infer the preference for each amino acid at each site in hemagglutinin. These inferences are consistent with existing knowledge about the protein's structure and function, and can be used to create a model that describes hemagglutinin's evolution far better than existing phylogenetic models. We show that hemagglutinin has a high inherent tolerance for mutations at antigenic sites, suggesting that this is one factor contributing to influenza's antigenic evolution.

Figure 6. Correlations among the amino-acid preferences inferred using data from the individual biological replicates. (A) The preferences from two technical repeats of the sample preparation and deep sequencing of biological replicate #1 are highly correlated. (B)–(D) The preferences from the three biological replicates are substantially but imperfectly correlated. Overall, these results indicate that technical variation in sample preparation and sequencing is minimal, but that there is substantial variation between biological replicates due to stochastic differences in which mutant viruses predominate during the initial reverse-genetics step. The Pearson correlation coefficient (R) and associated p-value are shown in the upper-left corner of each plot. The data and code used to create this figure are available via http://jbloom.github.io/mapmuts/example_WSN_HA_2014Analysis.html; these plots are the files correlations/replicate_1_vs_replicate_1_repeat.pdf, correlations/replicate_1_vs_replicate_2.pdf, correlations/replicate_1_vs_replicate_3.pdf, and correlations/replicate_2_vs_replicate_3.pdf described therein. DOI: http://dx.doi.org/10.7554/eLife.03300.014
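The Pearson correlation coefficient reported in each panel can be illustrated with a minimal sketch. This is not the mapmuts code used for the figure; the function name pearson_r and the preference values below are hypothetical, chosen only to show how two replicates' preferences for the same site/amino-acid pairs would be compared.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical preferences for the same five (site, amino acid) pairs
# in two replicates; these are NOT values from the paper.
replicate_1 = [0.05, 0.30, 0.02, 0.10, 0.01]
replicate_2 = [0.06, 0.25, 0.05, 0.12, 0.02]

r = pearson_r(replicate_1, replicate_2)
print(f"R = {r:.3f}")
```

The sketch computes only R; the associated p-value shown in the figure would need a statistics library and is left out here.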


Mentions: Despite the general concordance between the inferred amino-acid preferences and existing knowledge, it is important to quantify the experimental error associated with the deep mutational scanning. We sought to quantify two factors: technical variation due to inaccuracies and statistical limitations during Illumina sample preparation and deep sequencing, and biological variation due to stochasticity in the viral mutants that were generated and enriched during each replicate of the experiment. Figure 6A shows the correlation between biological replicate #1 and a technical repeat of the Illumina sample preparation and deep sequencing for this biological replicate. There is a very high correlation between the preferences inferred from these two repeats, indicating that technical variation has only a very minor influence on the final inferred amino-acid preferences. Figures 6B–D show the correlation among the three different biological replicates. Although the biological replicates are substantially correlated, there is also clear variation. Most of this variation is attributable to amino acids that in one replicate are inferred to have preferences near the a priori expectation of 0.05 (there are 20 amino acids, which in the absence of data are all initially assumed to have an equal preference of 1/20 = 0.05), but in another replicate are inferred to have a much higher or lower preference. Such variation arises because the mutant viruses for each biological replicate sample only about 50% of the possible codon mutations (see previous section), meaning that there is little data for some mutations in any given replicate. Fortunately, combining the three biological replicates greatly increases the coverage of possible mutations (see previous section). Therefore, inferences made from the combined data (as in Figure 5) should be substantially more accurate than inferences from any of the individual replicates. This idea is supported by the results below, which quantify the extent to which the inferred preferences accurately describe natural HA evolution.
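The pull toward the a priori expectation of 0.05 can be illustrated with a toy model. The sketch below is not the inference procedure used in the paper; it assumes a symmetric Dirichlet prior over the 20 amino acids and hypothetical post-selection counts at a single site, and it assumes that replicates are combined by simple averaging. The function names and the pseudocount parameter are illustrative only.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def posterior_mean_preferences(counts, pseudocount=1.0):
    """Posterior mean under a symmetric Dirichlet prior (toy model).

    counts maps amino acid -> observed count at one site; amino acids
    missing from the mapping are treated as having a count of zero.
    """
    total = sum(counts.get(aa, 0) for aa in AMINO_ACIDS)
    total += pseudocount * len(AMINO_ACIDS)
    return {aa: (counts.get(aa, 0) + pseudocount) / total for aa in AMINO_ACIDS}


def average_preferences(replicates):
    """Combine per-replicate preferences by simple averaging (an assumption)."""
    return {
        aa: sum(rep[aa] for rep in replicates) / len(replicates)
        for aa in AMINO_ACIDS
    }


# With no data, every amino acid stays at the prior value 1/20 = 0.05.
no_data = posterior_mean_preferences({})

# Hypothetical counts for a well-sampled replicate at the same site.
well_sampled = posterior_mean_preferences({"A": 900, "G": 100})

combined = average_preferences([no_data, well_sampled])
print(no_data["A"], well_sampled["A"], combined["A"])
```

In this toy setting, a replicate that happens to lack data for a mutation reports a preference near 0.05 while a replicate that sampled it reports something quite different, which is the pattern of disagreement described above.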

