Development and preliminary evaluation of a 90 K Axiom® SNP array for the allo-octoploid cultivated strawberry Fragaria × ananassa.

Bassil NV, Davis TM, Zhang H, Ficklin S, Mittmann M, Webster T, Mahoney L, Wood D, Alperin ES, Rosyara UR, Koehorst-Vanc Putten H, Monfort A, Sargent DJ, Amaya I, Denoyes B, Bianco L, van Dijk T, Pirani A, Iezzoni A, Main D, Peace C, Yang Y, Whitaker V, Verma S, Bellon L, Brew F, Herrera R, van de Weg E - BMC Genomics (2015)

Bottom Line: Strategies and filtering pipelines were developed to identify and incorporate markers of several types: di-allelic SNPs (66.6%), multi-allelic SNPs (1.8%), indels (10.1%), and ploidy-reducing "haploSNPs" (11.7%). The array's high success rate is likely driven by the presence of naturally occurring variation in ploidy level within the nominally octoploid genome, and by effectiveness of the employed array design and ploidy-reducing strategies. This array enables genetic analyses including generation of high-density linkage maps, identification of quantitative trait loci for economically important traits, and genome-wide association studies, thus providing a basis for marker-assisted breeding in this high-value crop.


Affiliation: USDA-ARS, NCGR, Corvallis, OR, USA. nahla.bassil@ars.usda.gov.

ABSTRACT

Background: A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria × ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca 'Hawaii 4' reference genome to identify single nucleotide polymorphisms (SNPs) and indels for incorporation into a 90 K Affymetrix® Axiom® array. We report the development and preliminary evaluation of this array.

Results: About 36 million sequence variants were identified in a 19-member octoploid germplasm panel. Strategies and filtering pipelines were developed to identify and incorporate markers of several types: di-allelic SNPs (66.6%), multi-allelic SNPs (1.8%), indels (10.1%), and ploidy-reducing "haploSNPs" (11.7%). The remaining SNPs included those discovered in the diploid progenitor F. iinumae (3.9%), and speculative "codon-based" SNPs (5.9%). In genotyping 306 octoploid accessions, SNPs were assigned to six classes with Affymetrix's "SNPolisher" R package. The highest quality classes, PolyHigh Resolution (PHR), No Minor Homozygote (NMH), and Off-Target Variant (OTV), comprised 25%, 38%, and 1% of array markers, respectively. These markers were suitable for genetic studies, as demonstrated in the full-sib family 'Holiday' × 'Korona' with the generation of a genetic linkage map consisting of 6,594 PHR SNPs evenly distributed across 28 chromosomes with an average density of approximately one marker per 0.5 cM, thus exceeding our goal of one marker per cM.

Conclusions: The Affymetrix IStraw90 Axiom array is the first high-throughput genotyping platform for cultivated strawberry and is commercially available to the worldwide scientific community. The array's high success rate is likely driven by the presence of naturally occurring variation in ploidy level within the nominally octoploid genome, and by effectiveness of the employed array design and ploidy-reducing strategies. This array enables genetic analyses including generation of high-density linkage maps, identification of quantitative trait loci for economically important traits, and genome-wide association studies, thus providing a basis for marker-assisted breeding in this high-value crop.


Fig4: Apparent polyploid levels based on comparing simulated to observed cluster locations. Points in the upper-row plots are the simulated cluster center locations for the given genotype, where A = (#A_alleles * Intensity_per_A_allele) + Background_Intensity and B = (#B_alleles * Intensity_per_B_allele) + Background_Intensity, with Intensity_per_A_allele = Intensity_per_B_allele = Background_Intensity = 100. #A_alleles and #B_alleles are the counts of each allele in the given genotype. Points in the lower-row plots are the observed contrast vs. size values for each sample. Each column (4A, 4B, and 4C) is a SNP locus at a different polyploid level. An “X” is drawn over subgenome genotypes to indicate effective absence. A vertical bar is drawn at contrast = zero.
A: The 2x/diploid-like cluster pattern. Alleles segregate in one subgenome and are effectively absent in the other three subgenomes. The AB cluster is centered near contrast = zero, and the homozygous BB and AA genotype clusters have negative and positive contrast values, respectively, in both the simulated and observed cluster patterns.
B: The 4x/allo-tetraploid-like cluster pattern. Alleles segregate in one subgenome, are fixed in another subgenome, and are effectively absent in the other two subgenomes. One of the genotype clusters, AABB, is centered near contrast = zero, and the other two genotype clusters are offset to negative contrast values in both the simulated and observed cluster patterns, corresponding to one subgenome being fixed for the B-allele.
C: The 8x/allo-octoploid-like pattern. Alleles segregate in one subgenome and are fixed in at least two other subgenomes. Simulation is shown for allo-octoploid genotypes, where alleles segregate in one subgenome and three other subgenomes are present and fixed for the same allele. The pattern is the same as that for the 4x locus, except that all genotype clusters are offset to positive (subgenomes fixed for the A-allele) or negative (subgenomes fixed for the B-allele, shown) contrast values.
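
The simulated cluster centers described in the caption can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' simulation code. The intensity formula and the value 100 come from the caption; the definitions contrast = log2(A/B) and size = (log2(A) + log2(B))/2 are assumptions (the caption does not define them, and production genotyping software may apply a different transform), and the function names are hypothetical.

```python
import math

# From the caption: per-allele intensity and background are all set to 100.
INTENSITY_PER_ALLELE = 100.0
BACKGROUND_INTENSITY = 100.0


def channel_intensities(n_a, n_b):
    """Return simulated (A, B) channel intensities for given allele counts."""
    a = n_a * INTENSITY_PER_ALLELE + BACKGROUND_INTENSITY
    b = n_b * INTENSITY_PER_ALLELE + BACKGROUND_INTENSITY
    return a, b


def contrast_and_size(n_a, n_b):
    """Assumed definitions: contrast = log2(A/B), size = mean of log2 A, log2 B."""
    a, b = channel_intensities(n_a, n_b)
    contrast = math.log2(a / b)
    size = (math.log2(a) + math.log2(b)) / 2.0
    return contrast, size


def simulate_locus(fixed_b_subgenomes):
    """Cluster centers for one segregating subgenome plus N subgenomes fixed for B.

    fixed_b_subgenomes = 0 gives the 2x-like pattern, 1 the 4x-like pattern,
    and 3 the 8x-like pattern shown in the figure.
    """
    centers = {}
    for seg_a in (2, 1, 0):  # AA, AB, BB in the segregating subgenome
        n_a = seg_a
        n_b = (2 - seg_a) + 2 * fixed_b_subgenomes
        genotype = "A" * n_a + "B" * n_b
        centers[genotype] = contrast_and_size(n_a, n_b)
    return centers


if __name__ == "__main__":
    for label, fixed in (("2x", 0), ("4x", 1), ("8x", 3)):
        print(label)
        for genotype, (contrast, size) in simulate_locus(fixed).items():
            print(f"  {genotype:<8} contrast={contrast:+.2f} size={size:.2f}")
```

Under these assumptions the 2x locus places AB at zero contrast with AA and BB on either side, the 4x locus places AABB at zero with the other two clusters at negative contrast, and the 8x locus places all three clusters at negative contrast, which matches the qualitative patterns described for panels A to C.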

Mentions: The Homozygous Ratio Offset (HomRO) is a measure that allows automated discrimination between SNPs that display a diploid-like cluster (Figure 4A) and those that display a polyploid-like cluster (Figure 4B-C). It measures the displacement from 0 contrast of the homozygous cluster closest to that value. From simulation results, SNPs in a diploid organism are expected to have a positive HomRO value and SNPs in a polyploid organism to have a negative (or near 0) HomRO value. Based on simulation, we used a HomRO value ≥0.3 to classify SNPs as clustering like a diploid, and a HomRO value <0.3 to classify SNPs as clustering like a polyploid.
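
The thresholding rule above can be written as a short sketch. Only the 0.3 cutoff and the idea of "displacement from 0 contrast of the homozygous cluster closest to that value" come from the text. The sign convention used here (positive when each homozygous cluster sits on its expected side of zero, negative when the nearest one has crossed to the wrong side) is an assumption chosen to agree with the statement that polyploid-like SNPs give negative or near-zero values; it is not taken from SNPolisher, and the function names are hypothetical.

```python
import math

# Cutoff stated in the text: HomRO >= 0.3 is diploid-like, otherwise polyploid-like.
HOMRO_THRESHOLD = 0.3


def hom_ro(aa_contrast, bb_contrast):
    """Signed offset of the homozygous cluster nearest zero contrast.

    Assumed convention: the AA-called cluster is expected at positive
    contrast and the BB-called cluster at negative contrast, so the
    offsets are aa_contrast and -bb_contrast; the smaller one is returned.
    """
    return min(aa_contrast, -bb_contrast)


def classify_cluster_pattern(homro):
    """Apply the threshold from the text to a HomRO value."""
    return "diploid-like" if homro >= HOMRO_THRESHOLD else "polyploid-like"


if __name__ == "__main__":
    # Contrast values of the highest and lowest clusters, taken from the
    # caption's simulation formula with contrast assumed to be log2(A/B).
    examples = {
        "2x": (math.log2(300 / 100), math.log2(100 / 300)),
        "4x": (math.log2(300 / 300), math.log2(100 / 500)),
        "8x": (math.log2(300 / 700), math.log2(100 / 900)),
    }
    for label, (aa, bb) in examples.items():
        value = hom_ro(aa, bb)
        print(f"{label}: HomRO={value:+.2f} -> {classify_cluster_pattern(value)}")
```

With these simulated inputs the 2x locus yields a clearly positive value, the 4x locus yields zero, and the 8x locus yields a negative value, so only the first passes the 0.3 cutoff. Observed data would of course be noisier than these idealized cluster centers.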

