De novo sequencing and analysis of the cranberry fruit transcriptome to identify putative genes involved in flavonoid biosynthesis, transport and regulation.

Sun H, Liu Y, Gai Y, Geng J, Chen L, Liu H, Kang L, Tian Y, Li Y - BMC Genomics (2015)

Bottom Line: In addition, 14,473 simple sequence repeats (SSRs) were detected. Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.


Affiliation: College of Horticulture, Jilin Agricultural University, Changchun, Jilin, China. haiyue-sun@hotmail.com.

ABSTRACT

Background: Cranberries (Vaccinium macrocarpon Ait.), renowned for their excellent health benefits, are an important berry crop. Here, we performed transcriptome sequencing of one cranberry cultivar, from fruits at two different developmental stages, on the Illumina HiSeq 2000 platform. Our main goals were to identify putative genes for major metabolic pathways of bioactive compounds and compare the expression patterns between white fruit (W) and red fruit (R) in cranberry.

Results: In this study, two cDNA libraries of W and R were constructed. Approximately 119 million raw sequencing reads were generated and assembled de novo, yielding 57,331 high-quality unigenes with an average length of 739 bp. Using BLASTx, 38,460 unigenes were identified as putative homologs of annotated sequences in public databases, including NCBI NR, NT, Swiss-Prot, KEGG, COG and GO. Of these, 21,898 unigenes mapped to 128 KEGG pathways, with the metabolic pathways, secondary metabolites, glycerophospholipid metabolism, ether lipid metabolism, starch and sucrose metabolism, purine metabolism, and pyrimidine metabolism being well represented. Among them, many candidate genes were involved in flavonoid biosynthesis, transport and regulation. Furthermore, digital gene expression (DGE) analysis identified 3,257 unigenes that were differentially expressed between the two fruit developmental stages. In addition, 14,473 simple sequence repeats (SSRs) were detected.

Conclusions: Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.




Fig1: Length distribution of All-unigenes in cranberry fruit transcriptome

Mentions: To obtain a complete profile of the cranberry transcriptome during fruit development, two cDNA libraries were built for two fruit developmental stages: white fruit (W) and red fruit (R). The W and R libraries yielded 59,986,374 and 59,690,570 raw reads, respectively. A summary of these sequencing results is presented in Table 1. After removal of low-quality short sequences, 52,413,112 (W) and 53,352,808 (R) clean reads remained and were used for assembly. The Q20 percentage (sequencing error rate <1 %) and GC percentage were 97.86 % and 46.93 % for the W library, and 97.93 % and 47.24 % for the R library. These results suggest that the sequencing data have sufficient quantity and quality to ensure accurate sequence assembly and adequate transcriptome coverage. Cleaned reads from each library were assembled independently using the Trinity tool. Inchworm assembly of reads, the first step of Trinity, resulted in 105,377 and 114,265 contigs with mean sizes of 359 bp and 367 bp, and N50s of 739 bp and 783 bp, for W and R, respectively. After clustering with the TGICL software [33], the contigs were assembled into 69,540 unigenes for W with a mean length of 510 bp and an N50 of 763 bp, and 66,917 unigenes for R with a mean length of 597 bp and an N50 of 1,020 bp. Finally, these two sets of unigenes were merged with TGICL, resulting in a final assembly of 57,331 All-unigenes (with a total length of ~42 Mb), a mean size of 739 bp and an N50 of 1,209 bp (Additional file 1). The N50 value, one of the most widely used metrics of assembly quality, reflects the continuity and completeness of an assembly [34]. At 1,209 bp, the N50 of the cranberry fruit transcriptome is longer than that reported in previous studies on blueberry fruit transcriptomes (1,100 bp) [35].
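The N50 values quoted above can be illustrated with a minimal sketch. It assumes the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases); the function name `n50` and the toy lengths are illustrative only and are not taken from the paper or from Trinity or TGICL.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at
    least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not data from the study.
toy_lengths = [300, 400, 500, 800, 1200, 2000]
toy_mean = sum(toy_lengths) / len(toy_lengths)
toy_n50 = n50(toy_lengths)
```

Because long sequences contribute most of the bases, N50 is typically larger than the mean length, which is consistent with the All-unigene assembly having a mean of 739 bp but an N50 of 1,209 bp.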
The unigene size distribution was as follows: 76.1 % (43,643) of the unigenes were between 300 and 1,000 bp in length; 22.2 % (12,720) were between 1,000 and 3,000 bp; and 1.7 % (969) were more than 3,000 bp long (Fig. 1). The assembled unigenes were also mapped to the cranberry genome to assess the accuracy of the transcriptome assembly. In total, 47,316 (82.5 %) unigenes mapped to the cranberry genome sequence using the NCBI MegaBLAST program [36]. The remaining unigenes may have failed to map because of differences between cultivars, gaps in the genome sequence, or exons that were too short.
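The size classes and percentages above can be reproduced with a short sketch. The bin boundaries are an assumption: the text does not say whether sequences of exactly 1,000 or 3,000 bp fall in the lower or upper class, so half-open intervals are used here. The names `length_bins` and `percent` are illustrative only.

```python
def length_bins(lengths):
    """Count sequences per size class (boundary handling is assumed)."""
    bins = {"300-999 bp": 0, "1,000-2,999 bp": 0, ">=3,000 bp": 0}
    for n in lengths:
        if n >= 3000:
            bins[">=3,000 bp"] += 1
        elif n >= 1000:
            bins["1,000-2,999 bp"] += 1
        elif n >= 300:
            bins["300-999 bp"] += 1
        # Sequences shorter than 300 bp are ignored in this sketch.
    return bins


def percent(part, total):
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100 * part / total, 1)


# Counts reported in the text, checked against the 57,331 All-unigenes.
total_unigenes = 57331
reported = {
    "300 to 1,000 bp": percent(43643, total_unigenes),
    "1,000 to 3,000 bp": percent(12720, total_unigenes),
    "more than 3,000 bp": percent(969, total_unigenes),
    "mapped to genome": percent(47316, total_unigenes),
}
```

Running `percent` on the reported counts gives 76.1, 22.2, 1.7 and 82.5, matching the percentages stated in the text.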

