Characterization of a second secologanin synthase isoform producing both secologanin and secoxyloganin allows enhanced de novo assembly of a Catharanthus roseus transcriptome.

Dugé de Bernonville T, Foureau E, Parage C, Lanoue A, Clastre M, Londono MA, Oudin A, Houillé B, Papon N, Besseau S, Glévarec G, Atehortùa L, Giglioli-Guivarc'h N, St-Pierre B, De Luca V, O'Connor SE, Courdavault V - BMC Genomics (2015)

Bottom Line: The new consensus transcriptome allowed a precise estimation of the abundance of SLS and T16H isoforms, similar to qPCR measurements. The C. roseus consensus transcriptome can now be used for characterization of new genes of the MIA pathway. Furthermore, additional isoforms of distinct MIA biosynthetic enzymes could be predicted, suggesting a higher level of complexity in the synthesis of MIA and raising the question of the evolutionary events behind what seems like redundancy.


Affiliation: Université François-Rabelais de Tours, EA2106 "Biomolécules et Biotechnologies Végétales", UFR Sciences et Techniques, 37200, Tours, France. thomas.duge@univ-tours.fr.

ABSTRACT

Background: Transcriptome sequencing offers a great resource for the study of non-model plants such as Catharanthus roseus, which produces valuable monoterpenoid indole alkaloids (MIAs) via a complex biosynthetic pathway whose characterization is still under way. Transcriptome databases dedicated to this plant were recently developed by several consortia to uncover new biosynthetic genes. However, the identification of missing steps in MIA biosynthesis based on these large datasets may be limited by the erroneous assembly of closely related transcripts and isoforms, even with the multiple available transcriptomes.

Results: Secologanin synthases (SLS) are P450 enzymes that catalyze an unusual ring-opening reaction of loganin in the biosynthesis of the MIA precursor secologanin. We report here the identification and characterization in C. roseus of a new isoform of SLS, SLS2, sharing 97% nucleotide sequence identity with the previously characterized SLS1. We also discovered that both isoforms further oxidize secologanin into secoxyloganin. SLS2, however, had a different expression profile, being the major isoform in aerial organs, which constitute the main site of MIA accumulation. Unfortunately, we were unable to find a current C. roseus transcriptome database containing both well-reconstructed sequences of the SLS isoforms and accurate expression levels. After the pair of closely related mRNAs encoding tabersonine 16-hydroxylase (T16H1 and T16H2), this is the second example of improperly assembled transcripts from the MIA pathway in the public transcriptome databases. To construct a more complete transcriptome resource for C. roseus, we re-processed previously published transcriptome data by combining new single assemblies. Particular care was taken during the clustering and filtering steps to remove redundant contigs while retaining transcripts encoding potential isoforms, by monitoring the reconstruction quality of MIA genes and of the specific SLS and T16H isoforms. The new consensus transcriptome allowed a precise estimation of the abundance of SLS and T16H isoforms, similar to qPCR measurements.

Conclusions: The C. roseus consensus transcriptome can now be used for characterization of new genes of the MIA pathway. Furthermore, additional isoforms of distinct MIA biosynthetic enzymes could be predicted, suggesting a higher level of complexity in the synthesis of MIA and raising the question of the evolutionary events behind what seems like redundancy.




Fig. 6: Reconstruction quality of MIA genes in different assemblies. Current resources (A = ccOrcae (Smartcell), B = mpgrCra (Medicinal Plant Genomics Resource), C = NIPGR, D = PMS454 (PhytoMetaSyn), E = PMSIllu (PhytoMetaSyn)) and new assemblies (19 SRR with paired-end (PE) sequencing design) were used as databases to identify homologs of MIA genes, and the resulting bitscore obtained by BLAST was compared to that of an ideal sequence (bitscore of the reference sequence against itself, i.e. bitscore ratio = 1).

Mentions: We performed a detailed inspection of the current transcriptomic resources, available from Medicinal Plant Genomic Resources [41], PhytoMetaSyn [43] (with Illumina reads or with 454 reads), Cathacyc/Orcae [42] and a newly prepared transcriptome by a NIPGR research team [44]. The corresponding datasets are hereafter named mpgrCra, PMSIllu (Illumina reads), PMS454 (454 reads), ccOrcae and NIPGR, respectively (Additional file 3: Table S1). Our analysis revealed that correct reconstruction of MIA genes was not systematic. Reference sequences of MIA genes available on NCBI were blasted against those assemblies and the bitscore of the best hit was compared to that of an ideal reconstruction, i.e. the bitscore of the reference sequence against itself. On the whole, the quality of reconstruction was quite unequal between assemblies (Fig. 6). The PMS454 and ccOrcae assemblies displayed the best sequences while PMSIllu was of weaker quality (see for example 10HGO, 16OMT, CMK, HDS, IDI1 and IO). The NIPGR and mpgrCra assemblies were quite similar in content, probably due to the construction design of the NIPGR assembly (independent libraries assembled and mixed with mpgrCra before filtering). Classically, discrepancies between assemblies might be due to natural polymorphisms, sequencing errors and/or reconstruction errors. When looking at very well reconstructed genes such as 7DLH and LAMT, it appeared that small differences were related to single-base variations. For 7DLH, such a variation was observed at position 564 of the reference sequence (KF415115) in the two assemblies mpgrCra and NIPGR, where a C was changed to an A. This variation could be a true SNP (Single Nucleotide Polymorphism), as the reference sequence was obtained with another cultivar (Little Delicata). Concerning the isoforms of T16H (T16H1 and T16H2) and SLS (SLS1 and SLS2), it appeared that none of the current assemblies presented high-quality sequences of all 4 transcripts simultaneously.
For example, while both SLS isoforms were well reconstructed (bitscore/ideal bitscore >0.99) in PMS454, this was not the case for T16H1 (0.78). The best reconstructions of T16H1 and T16H2 were found in mpgrCra (0.92 for T16H1) and NIPGR (0.92 for T16H2), respectively. This result prompted us to try new assembly strategies in order to produce a more complete transcriptome.
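
The bitscore ratio used above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes BLAST results exported in the standard 12-column tabular format (query ID in column 1, bitscore in column 12), and the function names, the gene and contig identifiers and the toy bitscores are hypothetical. The 0.99 cutoff is borrowed from the example in the text and is only one possible choice.

```python
def best_bitscores(tabular_lines):
    """Highest bitscore per query from BLAST 12-column tabular lines."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, bitscore = fields[0], float(fields[11])
        if bitscore > best.get(query, 0.0):
            best[query] = bitscore
    return best


def bitscore_ratios(assembly_hits, self_hits):
    """Best-hit bitscore in an assembly divided by the self-hit bitscore.

    A ratio of 1 corresponds to an ideal reconstruction; a reference
    with no hit in the assembly gets a ratio of 0.
    """
    ideal = best_bitscores(self_hits)
    found = best_bitscores(assembly_hits)
    return {gene: found.get(gene, 0.0) / ideal[gene] for gene in ideal}


def all_well_reconstructed(ratios, genes, threshold=0.99):
    """True only if every listed gene exceeds the ratio threshold."""
    return all(ratios.get(gene, 0.0) > threshold for gene in genes)


# Toy data (hypothetical identifiers and scores, for illustration only).
self_hits = [
    "geneA\tgeneA\t100.0\t1500\t0\t0\t1\t1500\t1\t1500\t0.0\t2000",
    "geneB\tgeneB\t100.0\t1000\t0\t0\t1\t1000\t1\t1000\t0.0\t1000",
]
assembly_hits = [
    "geneA\tcontig1\t99.8\t1500\t3\t0\t1\t1500\t1\t1500\t0.0\t1990",
    "geneA\tcontig2\t95.0\t700\t35\t0\t1\t700\t1\t700\t0.0\t900",
    "geneB\tcontig7\t98.0\t800\t16\t0\t1\t800\t1\t800\t0.0\t780",
]
ratios = bitscore_ratios(assembly_hits, self_hits)
```

With these toy values, `ratios` holds 0.995 for `geneA` and 0.78 for `geneB`, so `all_well_reconstructed(ratios, ["geneA", "geneB"])` is false. This mirrors the situation described above, where no single assembly reconstructs all four isoform transcripts well at the same time.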

