gespeR: a statistical model for deconvoluting off-target-confounded RNA interference screens.

Schmich F, Szczurek E, Kreibich S, Dilling S, Andritschke D, Casanova A, Low SH, Eicher S, Muntwiler S, Emmenlauer M, Rämö P, Conde-Alvarez R, von Mering C, Hardt WD, Dehio C, Beerenwinkel N - Genome Biol. (2015)

Bottom Line: Small interfering RNAs (siRNAs) exhibit strong off-target effects, which confound the gene-level interpretation of RNA interference screens and thus limit their utility for functional genomics studies. Here, we present gespeR, a statistical model for reconstructing individual, gene-specific phenotypes. Genes selected and prioritized by gespeR are validated and shown to constitute biologically relevant components of pathogen entry mechanisms and TGF-β signaling. gespeR is available as a Bioconductor R-package.


Affiliation: Department of Biosystems Science and Engineering, ETH, Zurich, Switzerland. fabian.schmich@bsse.ethz.ch.

ABSTRACT
Small interfering RNAs (siRNAs) exhibit strong off-target effects, which confound the gene-level interpretation of RNA interference screens and thus limit their utility for functional genomics studies. Here, we present gespeR, a statistical model for reconstructing individual, gene-specific phenotypes. Using 115,878 siRNAs, single and pooled, from three companies in three pathogen infection screens, we demonstrate that deconvolution of image-based phenotypes substantially improves the reproducibility between independent siRNA sets targeting the same genes. Genes selected and prioritized by gespeR are validated and shown to constitute biologically relevant components of pathogen entry mechanisms and TGF-β signaling. gespeR is available as a Bioconductor R-package.


Fig. 2: gespeR predicts siRNA phenotypes with significantly higher accuracy than in silico pooling (ISP) and haystack across all pathogens. Mutual concordance is evaluated between predicted and measured reagent-specific phenotypes (RSPs) for the same siRNAs. *Significantly better than second best method (Wilcoxon rank sum test, p < 0.05). (a) Phenotypes for 1871 validation screen siRNAs from Ambion were predicted in a blind test prior to experiments and evaluated against eventually measured RSPs. (b) Subsetting seven data points for the kinome-wide data set, RSPs were repeatedly predicted for a training set and evaluated against a disjoint test set.

Mentions: To assess the predictive power of our model in a blind test, we predicted combinatorial RSPs for 1871 previously unseen Ambion validation screen siRNAs prior to the validation experiment, and for seven kinome-wide data sets. Using gespeR, we first estimated gene-specific phenotypes for all pathogens from the joint set of 90,264 Qiagen siRNAs, denoted GSPQ, and 18,041 Dharmacon siRNA pools, denoted GSPD. GSPQs and GSPDs were then independently used to compute the expected phenotypes of the unseen screens (Additional file 3). In this analysis, we compared the predictive performance of gespeR only with ISP of RSPs and haystack, because RSA cannot predict RSPs (see “Limited applicability of RSA and haystack” section in Additional file 4). Haystack operates on seed-averaged phenotypes and not on full siRNA phenotypes. Therefore, as an approximation to the prediction of the RSP for a specific siRNA using the haystack model, we predicted matching seed-phenotypes based on haystack’s phenotype estimates from the same Qiagen data set. Performance was evaluated by measuring concordance of predicted against measured RSPs. For the prediction of the new, unseen siRNA phenotypes, gespeR showed higher concordance than predictions from both haystack and ISP, as evaluated by correlation between phenotypes and rank-biased overlap of ranked gene lists across all pathogens (Fig. 2a). The predictive performance was stronger for GSPQs for all pathogens, except for S. typhimurium, where GSPDs performed slightly better, likely due to strain-specific effects from the Qiagen genome-wide screen for S. typhimurium.
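The prediction and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the gespeR implementation (gespeR is an R package). It assumes that a predicted RSP is a weighted sum of the gene-specific phenotypes (GSPs) of the genes an siRNA targets, and it scores concordance with a Pearson correlation and a truncated, non-extrapolated rank-biased overlap. The function names, the persistence parameter `p`, and all numbers are hypothetical and chosen only for illustration.

```python
def predict_rsps(target_matrix, gsps):
    """Predict one RSP per siRNA as a weighted sum of GSPs.

    target_matrix[i][j] is an assumed knockdown strength of siRNA i on gene j
    (on- and off-targets); gsps[j] is the estimated phenotype of gene j.
    """
    return [sum(w * g for w, g in zip(row, gsps)) for row in target_matrix]


def pearson(x, y):
    """Pearson correlation between two equally long lists of phenotypes."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def rank_biased_overlap(ranking_a, ranking_b, p=0.9):
    """Truncated rank-biased overlap of two ranked lists (no extrapolation).

    At each depth d the overlap of the two top-d sets is divided by d and
    weighted by p**(d - 1), so agreement near the top counts most.
    """
    depth = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        score += p ** (d - 1) * len(seen_a & seen_b) / d
    return (1 - p) * score


# Hypothetical toy data: 3 siRNAs x 3 genes, and made-up "measured" RSPs.
target_matrix = [
    [1.0, 0.2, 0.0],
    [0.0, 1.0, 0.5],
    [0.3, 0.0, 1.0],
]
gsps = [-1.0, 0.5, 2.0]
measured = [-0.7, 1.2, 1.9]

predicted = predict_rsps(target_matrix, gsps)
correlation = pearson(predicted, measured)

# Rank siRNA indices by decreasing phenotype and compare the two rankings.
rank_pred = sorted(range(len(predicted)), key=lambda i: -predicted[i])
rank_meas = sorted(range(len(measured)), key=lambda i: -measured[i])
overlap = rank_biased_overlap(rank_pred, rank_meas, p=0.9)
```

In this sketch, a higher correlation and a higher rank-biased overlap both indicate that predicted phenotypes agree better with the measured ones. The truncated overlap of two identical lists of length k is 1 - p**k rather than 1, so scores should only be compared between lists of the same length.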

