Maize (Zea mays L.) genome diversity as revealed by RNA-sequencing.

Hansey CN, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Buell CR - PLoS ONE (2012)

Bottom Line: However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high confidence novel transcripts, of which, 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts.


Affiliation: Department of Plant Biology, Michigan State University, East Lansing, Michigan, United States of America.

ABSTRACT
Maize is rich in genetic and phenotypic diversity. Understanding the sequence, structural, and expression variation that contributes to phenotypic diversity would facilitate more efficient varietal improvement. RNA-based sequencing (RNA-seq) is a powerful approach for transcriptional analysis, assessing sequence variation, and identifying novel transcript sequences, particularly in large, complex, repetitive genomes such as maize. In this study, we sequenced RNA from whole seedlings of 21 maize inbred lines representing diverse North American and exotic germplasm. Single nucleotide polymorphism (SNP) detection identified 351,710 polymorphic loci distributed throughout the genome covering 22,830 annotated genes. Tight clustering of two distinct heterotic groups and exotic lines was evident using these SNPs as genetic markers. Transcript abundance analysis revealed minimal variation in the total number of genes expressed across these 21 lines (57.1% to 66.0%). However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high confidence novel transcripts, of which, 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts. Intriguingly, 145 of the novel de novo assembled loci were present in lines from only one of the two heterotic groups, consistent with the hypothesis that, in addition to sequence polymorphisms and transcript abundance, transcript presence/absence variation is present and, thereby, may be a mechanism contributing to the genetic basis of heterosis.
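The three-way split of the gene set reported above (expressed in all lines, in one to 20 lines, or in none) can be illustrated with a minimal sketch. It assumes that a per-line set of expressed gene identifiers is already available; the function name, the input layout, and the toy line and gene names are hypothetical and are not taken from the study's pipeline, and the expression threshold used to call a gene "expressed" is not modeled here.

```python
from collections import Counter


def classify_expression(expressed_by_line, all_genes):
    """Tally genes expressed in all lines, in a subset of lines, or in none.

    expressed_by_line: dict mapping line name -> set of expressed gene IDs
                       (hypothetical input layout, assumed for illustration).
    all_genes:         iterable of every annotated gene ID.
    """
    n_lines = len(expressed_by_line)

    # Count, for each gene, the number of lines in which it is expressed.
    lines_per_gene = Counter()
    for genes in expressed_by_line.values():
        lines_per_gene.update(set(genes))

    categories = {"all": 0, "subset": 0, "none": 0}
    for gene in all_genes:
        k = lines_per_gene.get(gene, 0)
        if k == n_lines:
            categories["all"] += 1
        elif k == 0:
            categories["none"] += 1
        else:
            categories["subset"] += 1
    return categories


# Toy data: three lines and four annotated genes (names are illustrative only).
toy_expression = {
    "line1": {"g1", "g2"},
    "line2": {"g1"},
    "line3": {"g1", "g3"},
}
toy_categories = classify_expression(toy_expression, ["g1", "g2", "g3", "g4"])
```

Dividing each tally by the number of annotated genes gives percentages of the kind quoted in the abstract, and the three categories always sum to the full gene set (48.7% + 27.9% + 23.4% = 100%).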


Figure 1: Distribution of the number of single nucleotide polymorphisms (SNPs) and SNP density per gene. Reads were mapped against the 5b pseudomolecules (http://ftp.maizesequence.org/) with Bowtie version 0.12.7 [50] and TopHat version 1.2.0 [51], requiring a unique hit for the SNP mapping. Gene assignment was determined based on the 5b annotation (http://ftp.maizesequence.org/), and not all SNPs identified were assigned to a gene model. (A) Distribution of the number of SNPs per gene. (B) Distribution of the average number of SNPs per 100 bp window per gene.

Mentions: The 351,710 SNPs identified in this study were distributed throughout the genome, with the number of SNPs per 1 Mb window tracking gene density and the number of expressed genes per window. There were, however, windows with relatively high or low SNP density compared to the number of genes and expressed genes, such as on the long arm of chromosome 2 (Figure S2). On a single-gene basis, RNA-seq-derived SNPs ranged from zero to a maximum of 170 SNPs per gene, with 22,830 genes having at least one SNP (Figure 1A).
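The two summaries described here, SNP counts per 1 Mb window and SNPs per gene with a per-100 bp density, can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: coordinates are taken to be 1-based on a single chromosome, gene intervals are inclusive, and the per-100 bp density is approximated as 100 × count / gene length because the exact windowing used for Figure 1B is not given in this text. All function names and toy coordinates are hypothetical.

```python
from collections import Counter


def snps_per_window(positions, window_size=1_000_000):
    """Count SNPs in consecutive fixed-size windows along one chromosome.

    positions: sequence of 1-based SNP coordinates (assumed layout).
    Returns a Counter mapping 0-based window index -> SNP count.
    """
    return Counter((pos - 1) // window_size for pos in positions)


def snps_per_gene(positions, genes):
    """Count SNPs per gene and approximate the SNP density per 100 bp.

    genes: dict mapping gene ID -> (start, end), 1-based inclusive
           (hypothetical layout; one chromosome only).
    Returns gene ID -> (snp_count, snps_per_100bp).
    """
    result = {}
    for gene_id, (start, end) in genes.items():
        count = sum(1 for pos in positions if start <= pos <= end)
        length = end - start + 1
        # Assumed approximation of "average SNPs per 100 bp window per gene".
        result[gene_id] = (count, 100.0 * count / length)
    return result


# Toy coordinates, for illustration only.
toy_positions = [150, 250, 1_000_001, 1_500_000]
toy_genes = {"geneA": (101, 300), "geneB": (2_000_001, 2_000_500)}

toy_windows = snps_per_window(toy_positions)
toy_per_gene = snps_per_gene(toy_positions, toy_genes)
```

The linear scan over positions for each gene is adequate for a toy example; for hundreds of thousands of SNPs, sorted positions with binary search or an interval index would be the more practical choice.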

