De novo assembly, gene annotation and marker development using Illumina paired-end transcriptome sequences in celery (Apium graveolens L.).

Fu N, Wang Q, Shen HL - PLoS ONE (2013)

Bottom Line: Large numbers of simple sequence repeats (SSRs) were identified, and the rates of successful amplification and polymorphism were then investigated among 31 celery accessions. Our results provide a valuable resource for celery research. The developed molecular markers provide a foundation for further genetic linkage analysis and gene localization, and they will be essential for accelerating the breeding process.


Affiliation: College of Agriculture and Biotechnology, China Agricultural University, Beijing, China.

ABSTRACT

Background: Celery is an increasingly popular vegetable species, but limited transcriptome and genomic data hinder research on it. In addition, a lack of celery molecular markers limits progress in molecular genetic breeding. High-throughput transcriptome sequencing is an efficient method for generating a large transcriptome sequence dataset for gene discovery, molecular marker development and marker-assisted selection breeding.

Principal findings: Celery transcriptomes from four tissues were sequenced using Illumina paired-end sequencing technology. De novo assembly generated a collection of 42,280 unigenes (average length of 502.6 bp) that represents the first transcriptome of the species. Of these unigenes, 78.43% and 48.93% had significant similarity with proteins in the National Center for Biotechnology Information (NCBI) non-redundant protein database (Nr) and the Swiss-Prot database, respectively, and 10,473 (24.77%) unigenes were assigned to Clusters of Orthologous Groups (COG). A total of 21,126 (49.97%) unigenes harboring InterPro domains were annotated, of which 15,409 (36.45% of all unigenes) were assigned to Gene Ontology (GO) categories. Additionally, 7,478 unigenes were mapped onto 228 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database. Large numbers of simple sequence repeats (SSRs) were identified, and the rates of successful amplification and polymorphism were then investigated among 31 celery accessions.
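The annotation percentages reported above can be checked against the stated total of 42,280 unigenes. The following is a minimal sketch of that arithmetic, not the authors' pipeline; the names `TOTAL_UNIGENES`, `annotated_counts` and `percent_of_total` are illustrative assumptions, and only the figures for which the abstract gives an absolute count are included.

```python
# Counts taken from the abstract; variable names are hypothetical.
TOTAL_UNIGENES = 42_280

annotated_counts = {
    "COG": 10_473,       # unigenes assigned to Clusters of Orthologous Groups
    "InterPro": 21_126,  # unigenes harboring InterPro domains
    "GO": 15_409,        # unigenes assigned to Gene Ontology categories
}


def percent_of_total(count, total=TOTAL_UNIGENES):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


coverage = {db: percent_of_total(n) for db, n in annotated_counts.items()}

for db, pct in coverage.items():
    print(f"{db}: {pct:.2f}% of {TOTAL_UNIGENES} unigenes")
```

Each percentage is computed against all assembled unigenes, which is why the GO figure (36.45%) is a share of the full set rather than of the InterPro-annotated subset.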

Conclusions: This study demonstrates the feasibility of generating large-scale sequence information by Illumina paired-end sequencing and efficient assembly. Our results provide a valuable resource for celery research. The developed molecular markers provide a foundation for further genetic linkage analysis and gene localization, and they will be essential for accelerating the breeding process.

Figure 4: Gene Ontology classifications of assembled unigenes. The unigenes are summarized into three main categories: biological process, cellular component, and molecular function. In total, 15,409 unigenes with BLASTx matches were assigned to gene ontologies.

Mentions: Gene Ontology (GO) provides ontologies of defined terms that describe gene products in terms of their associated biological processes, cellular components and molecular functions. In this study, 15,409 unigenes could be assigned to one or more ontologies, and we assigned each unigene to a set of GO slims. A summary of the unigenes classified under each GO slim term is shown in Figure 4. In total, 5,535 unigenes were grouped under cellular components, 13,934 unigenes under molecular functions and 11,210 unigenes under biological processes. Metabolic process (7,908 unigenes, 70.54%) and cellular process (7,024 unigenes, 62.66%) were the most highly represented groups in the biological process category. Genes involved in other important biological processes, such as biological regulation, stimulus response, reproduction and developmental process, were also identified. Furthermore, a relatively large number of unigenes (11.57%) were involved in pigmentation metabolism, which may play a role in petiole color formation. In the cellular components category, cell and cell part were the most highly represented groups. In the molecular functions category, binding and catalytic activity accounted for the majority of unigenes, with a large number of ligases, transferases, hydrolases and oxidoreductases annotated.
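The GO figures above involve two pieces of arithmetic that are easy to misread: the group percentages are shares of the 11,210 unigenes in the biological process category, and the three category counts sum to more than 15,409 because a unigene can be assigned to more than one ontology. The sketch below simply reproduces those numbers from the text; all variable names are illustrative assumptions, not part of the original analysis.

```python
# Counts taken from the text; variable names are hypothetical.
GO_ANNOTATED_UNIGENES = 15_409

category_counts = {
    "cellular component": 5_535,
    "molecular function": 13_934,
    "biological process": 11_210,
}

biological_process_groups = {
    "metabolic process": 7_908,
    "cellular process": 7_024,
}

# Percentages are relative to the biological process category, not to all unigenes.
bp_total = category_counts["biological process"]
group_shares = {
    name: round(100 * count / bp_total, 2)
    for name, count in biological_process_groups.items()
}

# The category counts overlap, so their sum exceeds the number of annotated unigenes.
total_assignments = sum(category_counts.values())

for name, share in group_shares.items():
    print(f"{name}: {share:.2f}% of {bp_total} biological-process unigenes")
print(f"category assignments: {total_assignments} across {GO_ANNOTATED_UNIGENES} unigenes")
```

The two group shares also sum to more than 100%, which is consistent with a single unigene carrying both a metabolic process and a cellular process annotation.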

