SNP-Seek database of SNPs derived from 3000 rice genomes.

Alexandrov N, Tai S, Wang W, Mansueto L, Palis K, Fuentes RR, Ulat VJ, Chebotarov D, Zhang G, Li Z, Mauleon R, Hamilton RS, McNally KL - Nucleic Acids Res. (2014)

Bottom Line: We have identified about 20 million rice SNPs by aligning reads from the 3000 rice genomes project with the Nipponbare genome. SNPs can be visualized together with the gene structures in the JBrowse genome browser. Evolutionary relationships between rice varieties can be explored using phylogenetic trees or multidimensional scaling plots.


Affiliation: T.T. Chang Genetic Resources Center, IRRI, Los Baños, Laguna 4031, Philippines (n.alexandrov@irri.org).


© Copyright Policy - creative-commons

Figure 2: Distribution of SNP coverage

Mentions: For the SNP-Seek database we considered only SNPs, ignoring indels. The union of all SNPs extracted from the 3000 VCF files consists of 23 M SNPs. To eliminate potentially false SNPs, we retained only SNPs whose minor allele occurs in at least two different varieties; 20 M SNPs satisfy this criterion. All genotype calls at these positions were combined into one file of ∼20 M × 3 K SNP calls, and the data were loaded into an Oracle schema using three main tables: STOCK, SNP and SNP_GENOTYPE (Figure 1). Some varieties lack reads mapping to a given SNP position, and for them no SNP call was recorded. The distribution of SNP coverage is shown in Figure 2. About 90% of all SNP calls are supported by four or more reads. Of these, 98% have a major allele frequency >90% and are considered homozygous, 1.1% have two alleles with frequencies between 40 and 60% and are considered heterozygous, and the remaining 0.9% represent other cases in which the SNP could be classified as neither heterozygous nor homozygous. More than 98% of SNPs have exactly two allelic variants across the 3000 varieties, 1.7% have three variants and 0.02% have all four nucleotides represented in different genomes at that SNP position. There are 2.3× more transitions than transversions in our database (Table 1).
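
The filtering and classification rules described above can be illustrated with a short Python sketch. It is not the pipeline used to build SNP-Seek. The function names, the input layout (per-allele read counts, one allele call per variety) and the handling of boundary values are assumptions made for illustration; only the thresholds (at least four supporting reads, major allele frequency >90%, two alleles between 40 and 60%, minor allele in at least two varieties) come from the text. In particular, the sketch takes the "minor allele" to be the second most common allele at a position, which the text does not spell out.

```python
from collections import Counter

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def passes_minor_allele_filter(calls, min_varieties=2):
    """Keep a SNP only if its minor allele is seen in >= min_varieties varieties.

    calls: dict mapping variety name -> called allele, or None when the
    variety has no reads at this position (hypothetical input layout).
    Assumption: the minor allele is the second most common allele.
    """
    counts = Counter(a for a in calls.values() if a is not None)
    if len(counts) < 2:
        return False  # monomorphic among called varieties, not a SNP
    minor_count = counts.most_common()[1][1]
    return minor_count >= min_varieties


def classify_call(allele_reads, min_reads=4, hom_cutoff=0.90,
                  het_low=0.40, het_high=0.60):
    """Classify one genotype call from per-allele read counts.

    allele_reads: dict mapping allele -> number of supporting reads.
    Treating the 40% and 60% bounds as inclusive is an assumption.
    """
    total = sum(allele_reads.values())
    if total < min_reads:
        return "low_coverage"
    freqs = sorted((n / total for n in allele_reads.values()), reverse=True)
    if freqs[0] > hom_cutoff:
        return "homozygous"
    if len(freqs) >= 2 and all(het_low <= f <= het_high for f in freqs[:2]):
        return "heterozygous"
    return "other"


def is_transition(ref, alt):
    """True for A<->G and C<->T substitutions, False for transversions."""
    return frozenset((ref.upper(), alt.upper())) in TRANSITIONS


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    ts = sum(1 for ref, alt in substitutions if is_transition(ref, alt))
    tv = len(substitutions) - ts
    return ts / tv


if __name__ == "__main__":
    print(classify_call({"A": 19, "G": 1}))   # major allele at 95%
    print(classify_call({"A": 5, "G": 5}))    # two alleles at 50% each
    print(passes_minor_allele_filter({"v1": "A", "v2": "A", "v3": "G"}))
    print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
```

A call such as 7 reads for A and 3 for G falls into the "other" class under these rules, which corresponds to the 0.9% of calls that are neither clearly homozygous nor clearly heterozygous.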

