Quality Control for RNA-Seq (QuaCRS): An Integrated Quality Control Pipeline.

Kroll KW, Mokaram NE, Pelletier AR, Frankhouser DE, Westphal MS, Stump PA, Stump CL, Bundschuh R, Blachly JS, Yan P - Cancer Inform (2014)

Bottom Line: Combining these three tools into one wrapper provides increased ease of use and provides a much more complete view of sample data quality than any individual tool. Second is the QC database, which displays the resulting metrics in a user-friendly web interface. The structure of the QuaCRS database is designed to enable expansion with additional tools and metrics in the future.


Affiliation: Department of Internal Medicine, Division of Hematology, Ohio State University Comprehensive Cancer Center, Columbus, OH, USA.

ABSTRACT
QuaCRS (Quality Control for RNA-Seq) is an integrated, simplified quality control (QC) system for RNA-seq data that allows easy execution of several open-source QC tools, aggregation of their output, and the ability to quickly identify quality issues by performing meta-analyses on QC metrics across large numbers of samples in different studies. It comprises two main sections. First is the QC Pack wrapper, which executes three QC tools: FastQC, RNA-SeQC, and selected functions from RSeQC. Combining these three tools into one wrapper provides increased ease of use and provides a much more complete view of sample data quality than any individual tool. Second is the QC database, which displays the resulting metrics in a user-friendly web interface. It was designed to allow users with less computational experience to easily generate and view QC information for their data, to investigate individual samples and aggregate reports of sample groups, and to sort and search samples based on quality. The structure of the QuaCRS database is designed to enable expansion with additional tools and metrics in the future. The source code for not-for-profit use and a fully functional sample user interface with mock data are available at http://bioserv.mps.ohio-state.edu/QuaCRS/.



Figure 1. Workflow executed by QuaCRS. Raw data, aligned data, and additional metadata are provided as input to the QC Pack program. Executing QC Pack runs FastQC, RNA-SeQC, and RSeQC and creates a composite table containing the resulting metrics and image file paths. This table is then uploaded to the database using the database reader, after which it can be viewed in the web interface and used to generate reports on individual samples or aggregates.

Mentions: The entry point to QuaCRS is a wrapper called QC Pack (see Fig. 1). It runs the three selected RNA-Seq QC tools currently used in our core: FastQC, RNA-SeQC, and RSeQC. This is also the entry point for both the obligatory metadata and other optional metadata deemed necessary for downstream data analyses. To launch the QuaCRS pipeline, a sample configuration file needs to be populated with the obligatory metadata, namely information needed to generate a searchable unique identifier for each sample in the QuaCRS database. This identifier is composed of the sample name, the study name, the start date of the sequencing run, and the lane number in which the sample was sequenced. Although no two samples would have the same combination of these four parameters, often the same sample from the same library preparation will be sequenced more than once (ie, different dates and lanes) for the purpose of achieving a predetermined number of sequenced reads. QuaCRS handles this scenario through another required field labeled “Run Description.” This field marks data entries as combined if they represent data compiled from two or more sequencing runs that constitute the final data input for a sample, prior to downstream analyses. As such, these “combined” samples are not associated with date and lane information. The actual sequence read files for these “combined” samples are generated prior to their entry to the QC Pack, as QuaCRS does not perform the function of merging multiple sample data files. In addition to the obligatory metadata described above, other information important to downstream analyses but not to sample identification can also be input into QuaCRS. Examples of these optional metadata include total RNA extraction protocol, RNA-seq library generation protocol, quality metric of input total RNA for library generation (RNA Integrity Number or RIN),24 and sample barcode information. 
Once the QC Pack configuration file is populated, the QC Pack script is ready for the next step.
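
The identifier scheme described above can be illustrated with a short Python sketch. It is not the QuaCRS implementation: the class name, the field names, the "single"/"combined" values for Run Description, and the underscore-joined identifier format are all assumptions made for illustration. The sketch only shows how the four obligatory fields could yield a unique identifier, and how a "combined" entry would carry no date or lane.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SampleMetadata:
    """Obligatory metadata for one entry (all field names are hypothetical)."""
    sample_name: str
    study_name: str
    run_description: str                    # assumed values: "single" or "combined"
    run_start_date: Optional[date] = None   # start date of the sequencing run
    lane: Optional[int] = None              # lane in which the sample was sequenced

    def __post_init__(self):
        combined = self.run_description == "combined"
        has_run_info = self.run_start_date is not None and self.lane is not None
        # A single-run entry needs both date and lane to be identifiable.
        if not combined and not has_run_info:
            raise ValueError("non-combined entries need a run start date and a lane")
        # A combined entry merges several runs, so it has no single date or lane.
        if combined and (self.run_start_date is not None or self.lane is not None):
            raise ValueError("combined entries carry no date or lane")

    def unique_id(self) -> str:
        """Build a searchable identifier (the format is an assumption)."""
        parts = [self.study_name, self.sample_name]
        if self.run_description == "combined":
            parts.append("combined")
        else:
            parts += [self.run_start_date.isoformat(), f"L{self.lane}"]
        return "_".join(parts)


# The same library sequenced twice, plus the merged data set built beforehand.
run1 = SampleMetadata("S01", "AML", "single", date(2014, 1, 15), 3)
run2 = SampleMetadata("S01", "AML", "single", date(2014, 2, 2), 5)
merged = SampleMetadata("S01", "AML", "combined")

for entry in (run1, run2, merged):
    print(entry.unique_id())
```

In this sketch the two individual runs and the merged entry each receive a distinct identifier, mirroring the role the text assigns to the Run Description field. Merging the read files themselves would still have to happen before this step, since QuaCRS does not do it.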

