Targeted Sequencing Reveals Large-Scale Sequence Polymorphism in Maize Candidate Genes for Biomass Production and Composition.

Muraya MM, Schmutzer T, Ulpinnis C, Scholz U, Altmann T - PLoS ONE (2015)

Bottom Line: We found that substantial sequence variation exists among genomic regions targeted in this study, which was particularly evident within coding regions. Further, annotated SNPs identified here will serve as useful genetic tools and as candidates in searches for phenotype-altering DNA variation. In summary, we demonstrated that sequencing of captured DNA is a powerful approach for variant discovery in maize genes.

Affiliation: Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Stadt Seeland, Germany; Department of Plant Science, Chuka University, P.O. Box 109-60400, Chuka, Kenya.

ABSTRACT
A major goal of maize genomic research is to identify sequence polymorphisms responsible for phenotypic variation in traits of economic importance. Large-scale detection of sequence variation is critical for linking genes, or genomic regions, to phenotypes. However, owing to the size and complexity of the maize genome, it remains expensive to generate whole-genome sequences of sufficient coverage for divergent maize lines, even with access to next-generation sequencing (NGS) technology. Because methods involving reduction of genome complexity, such as genotyping-by-sequencing (GBS), assess only a limited fraction of sequence variation, targeted sequencing of selected genomic loci offers an attractive alternative. We therefore designed a sequence capture assay to target 29 Mb of genomic regions and surveyed a total of 4,648 genes possibly affecting biomass production in 21 diverse inbred maize lines (7 flints, 14 dents). Captured and enriched genomic DNA was sequenced using the 454 NGS platform to 19.6-fold average coverage depth, and a broad evaluation of read alignment and variant calling methods was performed to select optimal procedures for variant discovery. Sequence alignment with the B73 reference and de novo assembly identified 383,145 putative single nucleotide polymorphisms (SNPs), of which 42,685 were non-synonymous alterations and 7,139 caused frameshifts. Presence/absence variation (PAV) of genes was also detected. We found that substantial sequence variation exists among genomic regions targeted in this study, which was particularly evident within coding regions. This diversification has the potential to broaden functional diversity and generate phenotypic variation that may lead to new adaptations and the modification of important agronomic traits. Further, annotated SNPs identified here will serve as useful genetic tools and as candidates in searches for phenotype-altering DNA variation. In summary, we demonstrated that sequencing of captured DNA is a powerful approach for variant discovery in maize genes.

Fig 1 (pone.0132120.g001): Evaluation of read alignment and variant calling methods. (A) Comprehensive illustration of 441 evaluated read alignment results. Each method is shown with its standard parameter setting and two additional parameter settings. The plots show the number of aligned reads, where the range for each bar illustrates the observed variability when different lines were used. (B) Heat map depicting the true positive sites in the 50k array. A total of 504 combinations of read alignment and variant calling methods were evaluated to identify recommended or less optimal applications (genotype NC358). (C) Variant caller performance compared to the 50k array. The total number of identified SNPs, as well as the number of unique SNPs, is depicted for each of the eight evaluated methods (genotype NC358).
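
The comparison summarized in panels (B) and (C) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `compare_callers`, the caller labels, and the toy SNP tuples are hypothetical, and "unique" is interpreted here as "called by only that method", which is an assumption about the caption's meaning.

```python
def compare_callers(calls_by_caller, array_snps):
    """Summarize each variant caller against a reference SNP array.

    calls_by_caller: dict mapping a caller label to a set of
        (chromosome, position, alternate_allele) tuples (hypothetical layout).
    array_snps: set of tuples in the same layout, taken as the truth set.
    """
    results = {}
    for caller, calls in calls_by_caller.items():
        # Pool the calls made by every other caller.
        others = set().union(
            *(c for name, c in calls_by_caller.items() if name != caller)
        )
        results[caller] = {
            "total": len(calls),                        # all SNPs called
            "true_positives": len(calls & array_snps),  # also on the array
            "unique": len(calls - others),              # assumed: no other caller
        }
    return results


# Toy data only; these sites are invented for illustration.
toy_calls = {
    "caller_A": {("1", 100, "A"), ("1", 200, "T"), ("2", 50, "G")},
    "caller_B": {("1", 100, "A"), ("2", 75, "C")},
}
toy_array = {("1", 100, "A"), ("2", 75, "C"), ("3", 10, "T")}
summary = compare_callers(toy_calls, toy_array)
```

Sets keep the intersection and difference operations explicit; a real analysis would also need to reconcile genotype encoding and strand between the array and the sequencing calls.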

Mentions: The combined set of reads was aligned to the B73 reference genome (AGV3) using seven different read alignment tools (Table 2, see Materials and Methods), and the total fraction of mapped quality-trimmed reads ranged from 95.66% (Bowtie2) to 99.81% (Stampy). On average, 98.7% of the reads were aligned to the B73 reference. On average, 41.32% of the quality-trimmed reads were mapped to the target sequence, ranging from 35.57% (Bowtie2) to 41.81% (BWA-MEM). However, variation in mapped sequence depth was minimal across all aligners (Table 2). In addition, an extended series of 441 independent read alignments was performed for all 21 inbred lines, using three different parameter settings, and was evaluated to identify the most reliable setup (Fig 1A, see Materials and Methods). The agreement between the different read alignment methods was analyzed by counting the reads that were mapped by the majority of tools, as well as the number of reads mapped by at least one other alignment method (S1 Fig). All seven evaluated tools produced high-quality read alignments.
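
The agreement analysis described above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' method: it assumes each aligner's output has already been reduced to a set of mapped read IDs, and the names `agreement_summary`, `count_alignment_runs`, the tool labels, and the read IDs are all hypothetical. The second function only checks that 441 runs is consistent with 7 tools × 3 parameter settings × 21 lines, a product the text implies but does not state explicitly.

```python
from collections import Counter
from itertools import product


def agreement_summary(mapped_by_tool):
    """Return (majority, shared) sets of read IDs.

    mapped_by_tool: dict mapping a tool label to the set of read IDs it mapped.
    majority: reads mapped by more than half of the tools.
    shared:   reads mapped by at least two tools, i.e. by a tool and
              at least one other alignment method.
    """
    n_tools = len(mapped_by_tool)
    counts = Counter()
    for reads in mapped_by_tool.values():
        counts.update(reads)  # each tool contributes one vote per read
    majority = {r for r, c in counts.items() if c > n_tools / 2}
    shared = {r for r, c in counts.items() if c >= 2}
    return majority, shared


def count_alignment_runs(n_tools, n_settings, n_lines):
    """Count every (tool, parameter setting, line) combination."""
    return sum(1 for _ in product(range(n_tools), range(n_settings), range(n_lines)))


# Toy data only; tool labels and read IDs are invented for illustration.
toy = {
    "tool_A": {"r1", "r2"},
    "tool_B": {"r1", "r2", "r3"},
    "tool_C": {"r1", "r3", "r4"},
    "tool_D": {"r1", "r5"},
}
majority, shared = agreement_summary(toy)
n_runs = count_alignment_runs(7, 3, 21)
```

For a single tool, intersecting its read set with `shared` gives the reads it mapped that at least one other method also mapped.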