Genome-wide identification and functional prediction of cold and/or drought-responsive lncRNAs in cassava

ABSTRACT

Cold and drought stresses seriously affect cassava (Manihot esculenta) plant growth and yield. Recently, long noncoding RNAs (lncRNAs) have emerged as key regulators of diverse cellular processes in mammals and plants. To date, no systematic screening of lncRNAs under abiotic stress, or of their regulatory roles in cassava, has been reported. In this study, we present the first reference catalog of 682 high-confidence lncRNAs based on analysis of strand-specific RNA-seq data from cassava shoot apices and young leaves under cold stress, drought stress and control conditions. Among them, 16 lncRNAs were identified as putative target mimics of known cassava miRNAs. Additionally, by comparison with small RNA-seq data, we found 42 lncNAT and sense gene pairs that can generate nat-siRNAs. We identified 318 lncRNAs responsive to cold and/or drought stress, which were typically co-expressed concordantly or discordantly with their neighboring genes. Trans-regulatory network analysis suggested that many lncRNAs were associated with hormone signal transduction, secondary metabolite biosynthesis, and sucrose metabolism pathways. The study provides an opportunity for future computational and experimental studies to uncover the functions of lncRNAs in cassava.


Figure 1: An integrative computational pipeline for the systematic identification of lncRNAs in cassava. CPAT, Coding-Potential Assessment Tool; CNCI, Coding-Non-Coding Index; CPC, Coding Potential Calculator; lincRNA, long intergenic non-coding RNA; lncNAT, long non-coding natural antisense transcript.

In total, 140 gigabases (Gb) of raw 125-bp reads were generated from 9 samples by paired-end sequencing on an Illumina HiSeq 2500 machine. After trimming adapters and filtering out low-quality reads, approximately 116 million clean reads were obtained, 62–64% of which mapped to the Manihot esculenta genome; these were used for further analysis. To characterize cassava lncRNAs, we developed a computational identification pipeline based on whole-transcriptome ssRNA-seq data (Fig. 1). The cassava transcriptome was reconstructed from all of the RNA-seq datasets using Cufflinks 2.0 [37]. A total of ~7.2 million transcripts were obtained across the 9 samples. Four filtering steps were applied to distinguish lncRNAs from protein-coding transcript units. First, we removed transcripts that overlapped known protein-coding genes in the sense orientation. In total, 76,069 transcripts were discovered, most of which (63.08%) were mRNAs. Second, we removed transcripts shorter than 200 nt. Third, we evaluated the coding potential of the remaining transcripts to obtain novel expressed lncRNAs. We used the Coding Potential Calculator (CPC) [38], the Coding-Potential Assessment Tool (CPAT) [39] and the Coding-Non-Coding Index (CNCI) [40] to predict the coding potential of each transcript. All transcripts with scores >0 were discarded. To ensure thorough elimination of protein-coding transcripts, we also used HMMER to scan each transcript and excluded those encoding any of the known protein domains cataloged in the Pfam protein family database [41, 42]. Finally, we filtered out transcripts with FPKM (fragments per kilobase of transcript per million mapped reads) [43] scores <0.5, which indicate infrequently expressed transcripts, as well as transcripts detected in only one sample. This yielded 682 reliably expressed novel lncRNAs, comprising 453 lincRNAs and 229 lncNATs (Supplemental Data S1).
In addition, we aligned the 682 lncRNAs against the GreeNC database and found that none of them had previously been annotated in cassava.
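
The filtering logic described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Transcript` record, its field names, and the helper functions are hypothetical, and it assumes the outputs of the external tools (Cufflinks, CPC, CPAT, CNCI, HMMER/Pfam) have already been parsed into plain values. Two interpretations are also assumptions: the ">0" coding-score cutoff is applied uniformly to every tool, and the expression filter is read as "FPKM of at least 0.5 in at least two samples".

```python
from dataclasses import dataclass
from typing import Dict, List

# Thresholds taken from the text; how they combine is an assumption.
MIN_LENGTH_NT = 200      # transcripts shorter than this are removed
MAX_CODING_SCORE = 0.0   # scores > 0 are treated as protein-coding
MIN_FPKM = 0.5           # FPKM below this counts as not expressed
MIN_SAMPLES = 2          # transcripts seen in only one sample are removed


@dataclass
class Transcript:
    """Hypothetical record holding pre-parsed tool outputs for one transcript."""
    tid: str
    length_nt: int
    overlaps_coding_sense: bool       # overlaps a known coding gene in sense
    coding_scores: Dict[str, float]   # e.g. {"CPC": -1.2, "CNCI": -0.3}
    has_pfam_domain: bool             # any Pfam hit reported by HMMER
    fpkm_by_sample: Dict[str, float]  # sample name -> FPKM


def fpkm(fragments: int, length_nt: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_nt * total_mapped_fragments)


def is_candidate_lncrna(t: Transcript) -> bool:
    """Apply the filters in the order given in the text."""
    if t.overlaps_coding_sense:
        return False
    if t.length_nt < MIN_LENGTH_NT:
        return False
    if any(score > MAX_CODING_SCORE for score in t.coding_scores.values()):
        return False
    if t.has_pfam_domain:
        return False
    expressed_in = [s for s, v in t.fpkm_by_sample.items() if v >= MIN_FPKM]
    return len(expressed_in) >= MIN_SAMPLES


def filter_lncrnas(transcripts: List[Transcript]) -> List[Transcript]:
    """Return only the transcripts that pass every filter."""
    return [t for t in transcripts if is_candidate_lncrna(t)]


if __name__ == "__main__":
    demo = [
        Transcript("t1", 850, False, {"CPC": -1.0, "CNCI": -0.2}, False,
                   {"cold": 1.4, "drought": 0.9, "control": 0.1}),
        Transcript("t2", 150, False, {"CPC": -1.0}, False,
                   {"cold": 3.0, "drought": 2.0}),
    ]
    print([t.tid for t in filter_lncrnas(demo)])
```

In this sketch a transcript is rejected as soon as any single tool scores it as coding, which is the strictest reading of the text; a real pipeline would need to respect each tool's own score scale and recommended cutoff.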

