High-throughput detection of induced mutations and natural variation using KeyPoint technology.

Rigola D, van Oeveren J, Janssen A, Bonné A, Schneiders H, van der Poel HJ, van Orsouw NJ, Hogers RC, de Both MT, van Eijk MJ - PLoS ONE (2009)

Bottom Line: We present KeyPoint technology, a high-throughput mutation/polymorphism discovery technique based on massive parallel sequencing of target genes amplified from mutant or natural populations. We show the power of KeyPoint by identifying two mutants in the tomato eIF4E gene based on screening more than 3000 M2 families in a single GS FLX sequencing run, and discovery of six haplotypes of tomato eIF4E gene by re-sequencing three amplicons in a subset of 92 tomato lines from the EU-SOL core collection. We propose KeyPoint technology as a broadly applicable amplicon sequencing approach to screen mutant populations or germplasm collections for identification of (novel) allelic variation in a high-throughput fashion.


Affiliation: Keygene NV, Wageningen, The Netherlands. diana.rigola@keygene.com

ABSTRACT
Reverse genetics approaches rely on the detection of sequence alterations in target genes to identify allelic variants among mutant or natural populations. Current (pre-) screening methods such as TILLING and EcoTILLING are based on the detection of single base mismatches in heteroduplexes using endonucleases such as CEL 1. However, there are drawbacks in the use of endonucleases due to their relatively poor cleavage efficiency and exonuclease activity. Moreover, pre-screening methods do not reveal information about the nature of sequence changes and their possible impact on gene function. We present KeyPoint technology, a high-throughput mutation/polymorphism discovery technique based on massive parallel sequencing of target genes amplified from mutant or natural populations. KeyPoint combines multi-dimensional pooling of large numbers of individual DNA samples and the use of sample identification tags ("sample barcoding") with next-generation sequencing technology. We show the power of KeyPoint by identifying two mutants in the tomato eIF4E gene based on screening more than 3000 M2 families in a single GS FLX sequencing run, and discovery of six haplotypes of tomato eIF4E gene by re-sequencing three amplicons in a subset of 92 tomato lines from the EU-SOL core collection. We propose KeyPoint technology as a broadly applicable amplicon sequencing approach to screen mutant populations or germplasm collections for identification of (novel) allelic variation in a high-throughput fashion.


Figure 3 (pone-0004761-g003): Results of the KeyPoint analysis of the mutant population. The top panel shows the numbers of G to A (position 221) and C to T (position 170) sequence deviations from the wild type sequence observed in each of the 3D pools for a subset of nucleotide positions of SleIF4E amplicon 1. The total number of observed sequence deviations and the calculated average error rates are shown on the right-hand side. The bottom panel shows the corresponding P values of false positives for each X, Y and Z pool. The total numbers of pools surpassing the significance thresholds P<0.001, P<0.01 and P<0.05 are shown at the bottom right. The complete analysis of all nucleotide positions is shown in figure S4.
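
The caption does not state how the P values were computed. As a minimal sketch, assuming that the deviation count of a pool at one position follows a binomial distribution with the average error rate as its success probability, the chance of seeing at least the observed count through error alone could be computed as below. The function name `binom_tail`, the error rate, the read depths and the counts are all illustrative assumptions, not values taken from the figure.

```python
import math

def binom_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)

# Hypothetical inputs: (deviating reads, total reads) per pool at one position.
ERROR_RATE = 0.0002   # assumed average error rate, not the published one
THRESHOLD = 0.01      # significance threshold used in the text
observed = {"X11": (3, 20000), "X12": (25, 20000), "Y3": (5, 20000)}

# Pools whose deviation count is unlikely to arise from error alone.
flagged = [pool for pool, (count, depth) in observed.items()
           if binom_tail(count, depth, ERROR_RATE) < THRESHOLD]
print(flagged)
```

With these made-up numbers only the pool with 25 deviating reads falls below the threshold; pools with counts close to the expected error level are not flagged.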

Mentions: A total of 15,000 M2 plants representing 3008 M2 families were screened for EMS-induced mutations in exon 1 of the SleIF4E gene based on amplification from 28 3D pools (12X, 8Y and 8Z). A total of 667,864 high-quality sequence reads with an average read length of 254 bases were obtained from a single GS FLX run. In the pre-processing step (Fig. 2), sample identification tags could be assigned with confidence to a total of 580,471 reads (87%, Table 1). The remaining 13% of reads contained one or more deviations in the sample identification tag sequences, contained concatamers or were shorter than 100 bases, and were excluded from further analysis. Successfully trimmed and tagged reads were taken into the mutation/polymorphism mining step, which started with mapping them onto the reference sequence, followed by generating pair-wise sequence alignments (Fig. 2). Next, the numbers of C→T and G→A changes from the wild type sequence were counted for each position per pool (Fig. 3), and the probabilities that they represent true EMS mutations were calculated, taking into account their distribution across the 3D sample pools. At significance threshold P<0.01, two mutations were identified: a C→T mutation at position 170 and a G→A mutation at position 221, which cause a proline to leucine (both hydrophobic amino acids) and an arginine (positively charged and hydrophilic) to glutamine (hydrophilic) amino acid change, respectively (figure S3). These mutations were based on significantly elevated numbers of non-wild type nucleotides at positions 170 and 221 in four (X12, Y7, Y8 and Z5) and three pools (X12, Y3 and Z6), respectively (Fig. 3). A complete overview of the statistical analysis is provided in figure S4.
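
The pre-processing rule described above (discard reads whose tag deviates from every known tag, and reads shorter than 100 bases) can be sketched as follows. This is only an illustration. The tag sequences, the tag length, the pool names in `TAG_TO_POOL` and the choice to apply the length cut-off to the untrimmed read are assumptions, and concatamer detection is not modelled.

```python
MIN_READ_LENGTH = 100  # reads shorter than this were excluded, per the text

# Hypothetical tag table; the real tag sequences are not given in this section.
TAG_TO_POOL = {"ACGTA": "X1", "TGCAT": "X2", "GATCG": "Y1"}
TAG_LENGTH = 5

def assign_pool(read):
    """Return the pool for a read, or None if the read must be discarded."""
    if len(read) < MIN_READ_LENGTH:
        return None
    # Exact match only: any deviation in the tag rejects the read.
    return TAG_TO_POOL.get(read[:TAG_LENGTH])

def split_reads(reads):
    """Group tag-trimmed reads by pool and count the rejected reads."""
    by_pool = {}
    rejected = 0
    for read in reads:
        pool = assign_pool(read)
        if pool is None:
            rejected += 1
        else:
            by_pool.setdefault(pool, []).append(read[TAG_LENGTH:])
    return by_pool, rejected

reads = [
    "ACGTA" + "A" * 120,  # valid tag, long enough: kept
    "ACGTT" + "A" * 120,  # one deviation in the tag: rejected
    "TGCAT" + "A" * 50,   # valid tag but too short: rejected
]
by_pool, rejected = split_reads(reads)

# Retained fraction reported in the text: 580,471 of 667,864 reads.
retained_percent = round(100 * 580_471 / 667_864)
print(by_pool.keys(), rejected, retained_percent)
```

Requiring an exact tag match trades some read loss for confidence in pool assignment, which is consistent with the reported retention of about 87% of reads.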
Sanger sequencing confirmed the C170T mutation in one of the four M2 families located at the plate position specified by the X12, Y7, Z5 3D pool coordinates, and the G221A mutation in one of the four M2 families at the plate position defined by the X12, Y3 and Z6 coordinates (Fig. 4).
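
The step from significant pools to plate positions can be sketched as follows, assuming that each sample position belongs to exactly one X, one Y and one Z pool, so that every combination of flagged pools across the three axes is a candidate coordinate. The function name `candidate_coordinates` is hypothetical; the pool labels are the ones reported above.

```python
from itertools import product

def candidate_coordinates(significant_pools):
    """Return every (X, Y, Z) combination consistent with the flagged pools."""
    by_axis = {"X": [], "Y": [], "Z": []}
    for pool in significant_pools:
        by_axis[pool[0]].append(pool)  # first character names the axis
    return list(product(by_axis["X"], by_axis["Y"], by_axis["Z"]))

# Pools with significantly elevated counts, as reported in the text.
c170t = candidate_coordinates(["X12", "Y7", "Y8", "Z5"])
g221a = candidate_coordinates(["X12", "Y3", "Z6"])

# 12 X pools, 8 Y pools and 8 Z pools span this many coordinates.
n_coordinates = 12 * 8 * 8

print(c170t)
print(g221a)
print(n_coordinates)
```

Under this assumption the C→T change at position 170 has two candidate coordinates, of which the text reports X12, Y7, Z5 as the one confirmed by Sanger sequencing, while the G→A change at position 221 has a single candidate. With four M2 families per coordinate, as the text indicates, the 768 coordinates offer room for 3072 families, enough for the 3008 screened.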

