Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing.

Rocher S, Jean M, Castonguay Y, Belzile F - PLoS ONE (2015)

Bottom Line: A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. About 60% had a significant match on the Medicago truncatula syntenic genome. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids.


Affiliation: Centre de Recherche et de Développement sur les Sols et les Grandes Cultures, Agriculture et agroalimentaire Canada, Quebec City (QC), Canada.

ABSTRACT
Genotyping-by-sequencing (GBS) is a relatively low-cost, high-throughput genotyping technology based on next-generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids.


Fig 4 (pone.0131918.g004): Example of comparison of GBS and 454 sequencing of TP61949 in eight plant samples. A) GBS and 454 read counts of each allele (A1/A2); B) predicted tetraploid allelic ratios (convergent ratios in green and discordant ratios in red); C) bi-allelic predicted genotype (A1, A2 and H) before genotype-level filtration and D) after genotype-level filtration of GBS data for minimum read counts (11 reads for homozygous genotypes, 2 reads of each allele for heterozygous genotypes, 0.1 as minimum minor allele frequency). Genotype calls are colored by agreement between the two sequencing methods: concordant (green), discordant (red for GBS homozygotes and orange for GBS heterozygotes) or missing (white), before and after genotype-level filtration for minimum read counts. A complete representation of validation results for 14 SNP loci in eight plant samples is provided in S3 Fig.
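The genotype-level read-count filter described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `call_diploid` and the constant names are hypothetical, and the 0.1 minimum minor allele frequency is left out because the caption does not say how it is applied.

```python
from typing import Optional

# Thresholds taken from the figure caption.
MIN_HOM_READS = 11  # minimum total reads to accept a homozygous call
MIN_HET_READS = 2   # minimum reads of EACH allele to accept a heterozygous call


def call_diploid(a1: int, a2: int) -> Optional[str]:
    """Bi-allelic call for one sample at one SNP locus.

    Returns "A1" or "A2" (homozygous), "H" (heterozygous),
    or None when the call is filtered out as missing.
    """
    if a1 > 0 and a2 > 0:
        # Both alleles observed: keep the call only if each allele
        # is supported by enough reads.
        return "H" if min(a1, a2) >= MIN_HET_READS else None
    if a1 + a2 >= MIN_HOM_READS:
        # Only one allele observed, with enough depth to trust it.
        return "A1" if a1 > 0 else "A2"
    return None


if __name__ == "__main__":
    # Hypothetical read counts, not values from the paper.
    for counts in [(12, 0), (5, 0), (3, 2), (10, 1)]:
        print(counts, call_diploid(*counts))
```

Under this sketch, a sample with a single read of the second allele is set to missing rather than called homozygous, which is one plausible reading of how low-depth calls end up white in panel D.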

Mentions: Genotypes were called based on either tetraploid or diploid allelic ratios of read counts of SNP loci determined with the two sequencing methods, as illustrated for TP61949 in Fig 4. Genotype calls obtained with the two sequencing methods were either concordant or discordant. Half of the genotypes called based on a tetraploid allelic ratio obtained with GBS and 454 sequencing concurred (Table 6). Of these concordant calls, 80% were homozygous (4/0 and 0/4) and 20% were heterozygous (3/1, 2/2 or 1/3). Discordant tetraploid allelic ratios were on average supported by lower GBS read counts (RC; mean = 58) than concordant allelic ratios (mean = 72) (S3 Fig).
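To make the tetraploid comparison concrete, the sketch below converts allele read counts into one of the five dosage classes (4/0, 3/1, 2/2, 1/3, 0/4) and computes the fraction of concordant calls between two methods. The excerpt does not give the rule the authors used to map ratios to dosage classes, so rounding to the nearest class is an assumption here, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Dosage = Tuple[int, int]  # (copies of A1, copies of A2), summing to 4


def tetraploid_dosage(a1: int, a2: int) -> Optional[Dosage]:
    """Map read counts to the nearest tetraploid dosage class.

    Assumption: nearest-class rounding of the A1 read fraction,
    with ties rounded up. Returns None when there are no reads.
    """
    total = a1 + a2
    if total == 0:
        return None
    copies_a1 = int(4 * a1 / total + 0.5)  # round half up
    return (copies_a1, 4 - copies_a1)


def concordance_rate(gbs: List[Optional[Dosage]],
                     ref: List[Optional[Dosage]]) -> Optional[float]:
    """Fraction of samples whose calls agree, ignoring missing calls."""
    pairs = [(g, r) for g, r in zip(gbs, ref)
             if g is not None and r is not None]
    if not pairs:
        return None
    return sum(g == r for g, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical read counts, not values from the paper.
    gbs_calls = [tetraploid_dosage(30, 10), tetraploid_dosage(20, 0)]
    ref_calls = [tetraploid_dosage(50, 50), tetraploid_dosage(80, 0)]
    print(gbs_calls, ref_calls, concordance_rate(gbs_calls, ref_calls))
```

Because adjacent dosage classes differ by only a quarter of the read fraction, modest sampling noise at low depth can shift a call by one class, which is consistent with the lower mean read count reported for discordant ratios.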

