Transcriptome sequencing for SNP discovery across Cucumis melo.

Blanca J, Esteras C, Ziarsolo P, Pérez D, Fernández-Pedrosa V, Collado C, Rodríguez de Pablos R, Ballester A, Roig C, Cañizares J, Picó B - BMC Genomics (2012)

Bottom Line: The number and variability of in silico SNVs differed considerably between pools. This study provides the first comprehensive resequencing data for wild, exotic, and cultivated (landraces and commercial) melon transcriptomes, yielding the largest melon SNP collection available to date and representing a notable sample of the species diversity. These data provide a valuable resource for creating a catalog of allelic variants of melon genes and will aid future in-depth studies of population genetics, marker-assisted breeding, and gene identification aimed at developing improved varieties.


Affiliation: Institute for the Conservation and Breeding of Agricultural Biodiversity (COMAV-UPV), Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain.

ABSTRACT

Background: Melon (Cucumis melo L.) is a highly diverse species that is cultivated worldwide. Recent advances in massively parallel sequencing have begun to allow the study of nucleotide diversity in this species. In a previous study, the Sanger method combined with medium-throughput 454 technology was used to analyze the genetic diversity of germplasm representing 3 botanical varieties, yielding a collection of about 40,000 SNPs distributed in 14,000 unigenes. However, the usefulness of this resource is limited because the sequenced genotypes do not represent the whole diversity of the species, which is divided into two subspecies with many botanical varieties that vary in plant, flowering, and fruit traits, as well as in stress response. As a first step toward extensively documenting levels and patterns of nucleotide variability across the species, we used the high-throughput SOLiD™ system to resequence the transcriptomes of a set of 67 genotypes that had previously been selected from a core collection representing the extant variation of the entire species.

Results: The deep transcriptome resequencing of all of the genotypes, grouped into 8 pools (wild African agrestis, Asian agrestis and acidulus, exotic Far Eastern conomon, Indian momordica and Asian dudaim and flexuosus, commercial cantalupensis, subsp. melo Asian and European landraces, Spanish inodorus landraces, and Piel de Sapo breeding lines), yielded about 300 M reads. Short reads were mapped to the recently generated draft genome assembly of the DHL line Piel de Sapo (inodorus) x Songwhan Charmi (conomon) and to a new version of the melon transcriptome. Regions with at least 6X coverage were used in SNV calling, generating a melon collection with 303,883 variants. These SNVs were dispersed across the entire C. melo genome and distributed in 15,064 annotated genes. The number and variability of in silico SNVs differed considerably between pools. Our finding of higher genomic diversity in wild and exotic agrestis melons from India and Africa, as compared to commercial cultivars, cultigens, and landraces from Eastern Europe, Western Asia, and the Mediterranean basin, is consistent with the evolutionary history proposed for the species. Group-specific SNVs that will be useful in introgression programs were also detected. In a sample of 143 selected putative SNPs, we verified 93% of the polymorphisms in a panel of 78 genotypes.

Conclusions: This study provides the first comprehensive resequencing data for wild, exotic, and cultivated (landraces and commercial) melon transcriptomes, yielding the largest melon SNP collection available to date and representing a notable sample of the species diversity. These data provide a valuable resource for creating a catalog of allelic variants of melon genes and will aid future in-depth studies of population genetics, marker-assisted breeding, and gene identification aimed at developing improved varieties.
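
The Results state that only regions with at least 6X coverage were used in SNV calling. A minimal sketch of such a depth filter is given below. It is an illustration under stated assumptions, not the authors' pipeline: the function name `callable_positions`, the dictionary layout, and the depth values are all hypothetical; only the 6X threshold comes from the abstract.

```python
MIN_COVERAGE = 6  # threshold stated in the abstract ("at least 6X coverage")


def callable_positions(depth_by_position, min_coverage=MIN_COVERAGE):
    """Return, in sorted order, the positions whose read depth meets the threshold.

    depth_by_position maps a position (int) to the number of reads covering it.
    This layout is an assumption made for illustration only.
    """
    return sorted(
        pos for pos, depth in depth_by_position.items() if depth >= min_coverage
    )


# Hypothetical per-position read depths (made-up values, not data from the study).
toy_depths = {101: 3, 102: 6, 103: 12, 104: 5}
print(callable_positions(toy_depths))
```

Positions below the threshold are simply excluded from variant calling in this sketch; how the original analysis handled them beyond the stated threshold is not described in the abstract.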

Figure 3: SNPs detected in the coding region of Cm-ACO-1 (A) and Cm-eiF(iso)4E (B). Short reads generated by SOLiD in the different pools are shown mapped to the genomic sequence of both genes (whole genome draft version 3.5, available in MELONOMICS). Coverage in exonic and UTR regions is shown for each nucleotide. SNPs detected by SOLiD and EcoTILLING are represented by colored bars in the different exons (red, green, and yellow for mutations detected only by SOLiD, only by EcoTILLING, and by both methods, respectively). The structure of each gene as annotated in the genome is shown below. Data are visualized with IGV (Integrative Genomics Viewer) [65].

Mentions: Cm-ACO-1 (unigene MELO3C014437 at [36]) is located at positions 3015704–3017224 of scaffold CM3.5_scaffold00022 in the melon genome (v3.5) (Figure 3A). Resequencing allowed us to find 6 SNPs in the coding region of this gene (Table 4). Five nucleotide variants had also been detected previously by EcoTILLING [59]. The allele distribution found with SOLiD agrees with the EcoTILLING haplotypes: two mutations were exclusive to the agrestis pools (1, 2, and 3) (CM3.5_scaffold00022: 3015744 and 3016016), one was exclusive to the conomon pool (3) (CM3.5_scaffold00022: 3016091), and one was fixed in agrestis and appeared at low frequency in the momordica and melo pools (4, 5, 6, 7, and 8) (CM3.5_scaffold00022: 3015944). According to EcoTILLING, the mutation CM3.5_scaffold00022: 3016304, the only one predicted by SIFT not to be tolerated, was present in only one genotype, the snake melon from Arabia (included in pool 4, Table 1). Accordingly, the variant was sequenced only in pool 4, confirming the utility of pooling samples to increase the number of genotypes represented in resequencing assays without missing rare alleles.
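
The pool-exclusivity reasoning in the paragraph above can be sketched as a simple set comparison. This is an illustrative sketch, not the authors' method: the function `classify_variant` and its labels are hypothetical. The pool numbers for the agrestis group (1, 2, 3) and the pool 4 observation for position 3016304 are taken from the text; the other carrier set is a made-up example.

```python
def classify_variant(pools_with_variant, group_pools):
    """Label a variant by how the pools carrying it relate to one pool group.

    pools_with_variant: pool numbers in which the variant allele was observed.
    group_pools: pool numbers that make up the group of interest.
    Both arguments and the returned labels are assumptions for illustration.
    """
    carriers = set(pools_with_variant)
    group = set(group_pools)
    if not carriers:
        return "not observed"
    if carriers <= group:
        return "exclusive to group"
    if carriers & group:
        return "shared with other pools"
    return "absent from group"


AGRESTIS_POOLS = {1, 2, 3}  # pool numbers as given in the text

# Position 3016304 was sequenced only in pool 4, according to the text.
print(classify_variant({4}, {4}))
print(classify_variant({4}, AGRESTIS_POOLS))

# Hypothetical carrier set: a variant seen in agrestis pools and also in pool 5.
print(classify_variant({1, 2, 5}, AGRESTIS_POOLS))
```

A variant whose carrier set is a subset of the group is reported as exclusive even if it is missing from some pools of that group; a stricter definition (present in every pool of the group) would require comparing for equality instead.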

