A deep transcriptomic analysis of pod development in the vanilla orchid (Vanilla planifolia).

Rao X, Krom N, Tang Y, Widiez T, Havkin-Frenkel D, Belanger FC, Dixon RA, Chen F - BMC Genomics (2014)

Bottom Line: The combined 454/Illumina RNA-seq platforms provide both deep sequence coverage and high quality de novo transcriptome assembly for this non-model crop species. The annotated sequence data provide a foundation for understanding multiple aspects of the biochemistry and development of the vanilla bean, as exemplified by the identification of candidate genes involved in lignin biosynthesis. This database provides a general resource for further studies on this important flavor species.


Affiliation: Department of Biological Sciences, University of North Texas, 1155 Union Circle #305220, Denton, TX 76203, USA. Xiaolan.Rao@unt.edu.

ABSTRACT

Background: Pods of the vanilla orchid (Vanilla planifolia) accumulate large amounts of the flavor compound vanillin (3-methoxy, 4-hydroxy-benzaldehyde) as a glucoside during the later stages of their development. At earlier stages, the developing seeds within the pod synthesize a novel lignin polymer, catechyl (C) lignin, in their coats. Genomic resources for determining the biosynthetic routes to these compounds and other flavor components in V. planifolia are currently limited.

Results: Using next-generation sequencing technologies, we have generated very large gene sequence datasets from vanilla pods at different times of development, and representing different tissue types, including the seeds, hairs, placental and mesocarp tissues. This developmental series was chosen as being the most informative for interrogation of pathways of vanillin and C-lignin biosynthesis in the pod and seed, respectively. The combined 454/Illumina RNA-seq platforms provide both deep sequence coverage and high quality de novo transcriptome assembly for this non-model crop species.

Conclusions: The annotated sequence data provide a foundation for understanding multiple aspects of the biochemistry and development of the vanilla bean, as exemplified by the identification of candidate genes involved in lignin biosynthesis. Our transcriptome data indicate that C-lignin formation in the seed coat involves coordinate expression of monolignol biosynthetic genes with the exception of those encoding the caffeoyl coenzyme A 3-O-methyltransferase for conversion of caffeoyl to feruloyl moieties. This database provides a general resource for further studies on this important flavor species.

Figure 2: Length distribution of the hybrid-assembled transcripts. (A) Length frequency distribution of the 301,459 transcripts. (B) Size distribution plot of the 301,459 transcripts.

Mentions: A commonly used combination of current sequencing technologies mixes de novo 454 assembly with Illumina mapping assemblies; the 454 approach allows long contigs to be built, and the Illumina approach reduces problematic 454-generated homopolymer sequences [16]. Therefore, to produce high quality vanilla transcripts, we employed an optimized two-step strategy. First, a combined assembly of all 26 Illumina samples (3,439,193,362 reads in total) was produced using Velvet 1.2.03 with a k-mer length of 27, yielding 6,371,646 contig sequences. Then, the Illumina contigs and qualified 454 reads were assembled together with MIRA 3.2.1 [16, 17]. This resulted in a total of 301,459 contigs, which were considered vanilla-expressed transcripts for further annotation and analysis. The size distribution of the 301,459 contigs is shown in Figure 2. The contig N50 is the length of the shortest contig in the smallest set of largest contigs whose combined length represents at least 50% of the total assembly length; this parameter is generally used as a standard metric for assembly size [18]. Here, the contig N50 was 1960 bp and the average contig size was 1256 bp. All short reads obtained in this study have been submitted to the NCBI Sequence Read Archive (SRA) [BioProject: PRJNA253813]. Accession numbers for each library are listed in the “Availability of supporting data” section. All assembled data can be searched and retrieved at http://www.sc.noble.org/vanilla/blast/blast.php.
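
The N50 definition given above can be made concrete with a short calculation. The following is a minimal sketch in Python, not the authors' code: the function names `contig_n50` and `mean_contig_length` and the toy contig lengths are hypothetical, chosen only to illustrate the metric. It sorts contig lengths from longest to shortest and returns the length at which the running total first reaches half of the total assembly length.

```python
from statistics import mean


def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative length first reaches at least
    50% of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


def mean_contig_length(lengths):
    """Return the arithmetic mean contig length (in bp)."""
    return mean(lengths)


# Hypothetical toy assembly, NOT data from the paper.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(contig_n50(toy_lengths))          # N50 of the toy set
print(mean_contig_length(toy_lengths))  # mean length of the toy set
```

For the toy set, the total length is 54, and the three longest contigs (10, 9 and 8) together reach 27, so the N50 is 8 while the mean is 6. As with the 1960 bp N50 versus the 1256 bp average reported above, the N50 tends to sit above the mean because it weights long contigs more heavily.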

