R-SAP: a multi-threading computational pipeline for the characterization of high-throughput RNA-sequencing data.

Mittal VK, McDonald JF - Nucleic Acids Res. (2012)

Bottom Line: We present here a user-friendly and fully automated RNA-Seq analysis pipeline (R-SAP) with built-in multi-threading capability to analyze and quantitate high-throughput RNA-Seq datasets. R-SAP follows a hierarchical decision making procedure to accurately characterize various classes of transcripts and achieves a near linear decrease in data processing time as a result of increased multi-threading. In addition, RNA expression level estimates obtained using R-SAP display high concordance with levels measured by microarrays.


Affiliation: School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.

ABSTRACT
The rapid expansion in the quantity and quality of RNA-Seq data requires the development of sophisticated high-performance bioinformatics tools capable of rapidly transforming this data into meaningful information that is easily interpretable by biologists. Currently available analysis tools are often not easily installed by the general biologist and most of them lack inherent parallel processing capabilities widely recognized as an essential feature of next-generation bioinformatics tools. We present here a user-friendly and fully automated RNA-Seq analysis pipeline (R-SAP) with built-in multi-threading capability to analyze and quantitate high-throughput RNA-Seq datasets. R-SAP follows a hierarchical decision making procedure to accurately characterize various classes of transcripts and achieves a near linear decrease in data processing time as a result of increased multi-threading. In addition, RNA expression level estimates obtained using R-SAP display high concordance with levels measured by microarrays.


Figure 1: Architecture of R-SAP and data flow in the pipeline. The wrapper script begins the execution of the pipeline and divides the data into smaller subsets. Multiple threads are created and each core module in each thread is run under the ‘Control-module’. Output files are merged by the wrapper script and corresponding output files are written to the disk.

Mentions: R-SAP compares reference genome mappings of RNA-Seq reads with the genomic coordinates of known and well-annotated transcripts (reference transcripts or known transcript models) in order to detect known and new RNA isoforms and chimeric transcripts. There are four core modules in R-SAP's workflow (Figure 1): (i) initial alignment screening, (ii) characterization with reference transcripts, (iii) chimeric transcript detection and (iv) RNA expression quantification. A main wrapper script controls the flow of data to these core modules (Figure 1).
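
The split, process in parallel, and merge flow described for the wrapper script can be illustrated with a minimal Python sketch. This is not R-SAP's code: the function names (`split_into_chunks`, `control_module`, `run_pipeline`), the record layout, and the 0.9 identity cutoff are hypothetical placeholders chosen only to make the data flow concrete.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Divide records into contiguous, near-equal subsets (hypothetical helper)."""
    n_chunks = max(1, min(n_chunks, len(records) or 1))
    size, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks receive one additional record.
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def control_module(chunk):
    """Stand-in for running the core modules on one subset of alignments."""
    # Placeholder screening step with an arbitrary, assumed cutoff of 0.9.
    return [rec for rec in chunk if rec["identity"] >= 0.9]


def run_pipeline(records, n_threads=4):
    """Wrapper: split the input, process each subset in a thread, merge results."""
    chunks = split_into_chunks(records, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() returns results in the same order as the input chunks.
        partial_results = list(pool.map(control_module, chunks))
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged


if __name__ == "__main__":
    reads = [{"id": i, "identity": 0.8 if i % 3 == 0 else 0.95} for i in range(10)]
    print([r["id"] for r in run_pipeline(reads, n_threads=3)])
```

Because the subsets are contiguous and the results are merged in submission order, the merged output keeps the original record order. In CPython, CPU-bound work would usually be spread over processes rather than threads; threads are used here only to mirror the description above.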

