Towards accurate imputation of quantitative genetic interactions.

Ulitsky I, Krogan NJ, Shamir R - Genome Biol. (2009)

Bottom Line: Recent technological breakthroughs have enabled high-throughput quantitative measurements of hundreds of thousands of genetic interactions among hundreds of genes in Saccharomyces cerevisiae. Here we present a novel method, which combines genetic interaction data together with diverse genomic data, to quantitatively impute these missing interactions. We also present data on almost 190,000 novel interactions.


Affiliation: Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel. ulitsky@wi.mit.edu

ABSTRACT
Recent technological breakthroughs have enabled high-throughput quantitative measurements of hundreds of thousands of genetic interactions among hundreds of genes in Saccharomyces cerevisiae. However, these assays often fail to measure the genetic interactions among up to 40% of the studied gene pairs. Here we present a novel method, which combines genetic interaction data together with diverse genomic data, to quantitatively impute these missing interactions. We also present data on almost 190,000 novel interactions.


Figure 3: Accuracy of qualitative GI prediction. The histograms compare combinations of classifiers and feature sets when seeking a classification of gene pairs into positive, negative and neutral interactions. The combinations are compared in terms of the area under the ROC curve (AUC) and the area under the precision-recall curve (AUPR). (a, b) Predictions of negative interactions, measured by the AUC (a) and AUPR (b). (c, d) Predictions of positive interactions using AUC (c) and AUPR (d). The diffusion kernel method [21] uses only the topology of the GI network and does not exploit the other features.

Mentions: The results are presented in Figure 3. The best performance was achieved using all the features with the logistic regression or Naïve Bayes classifiers. Using GSG or GSG+MATRIX features, it was possible to obtain near-optimal classification accuracy, and these features significantly outperformed classifiers using only network or genomic properties, which were used in previous studies. The G- diffusion kernel was indeed very powerful in predicting negative interactions, especially given the amount of information it used (only the synthetic lethal interactions). However, the G+ kernel performed rather poorly in predicting positive interactions. In general, the prediction of negative GIs appears to be easier than the prediction of positive GIs, since most methods fared much better on the former task. The higher difficulty of predicting positive interactions was manifested for a variety of S-score thresholds used to define those interactions, as the AUPR for prediction of positive interactions did not exceed 0.25 for any threshold (Figure S4 in Additional file 1).
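The evaluation described above can be illustrated with a minimal sketch. It assumes that each gene pair has a measured S-score and a predicted confidence for the class of interest, labels pairs by thresholding the S-score, and then scores the predictions one class against the rest. The function names, the cutoffs (-2.5 and 2.0) and the toy numbers are hypothetical and chosen only for illustration; they are not the values or the implementation used in the paper. AUPR is approximated here by average precision, which is one common estimator and may differ from the one the authors used.

```python
def label_pairs(s_scores, neg_cutoff=-2.5, pos_cutoff=2.0):
    """Map S-scores to 'negative', 'positive' or 'neutral'.

    The default cutoffs are illustrative assumptions, not the paper's values.
    """
    labels = []
    for s in s_scores:
        if s <= neg_cutoff:
            labels.append("negative")
        elif s >= pos_cutoff:
            labels.append("positive")
        else:
            labels.append("neutral")
    return labels


def roc_auc(is_target, scores):
    """AUC as the fraction of (target, non-target) pairs ranked correctly.

    Ties between a target and a non-target score count as half a win.
    """
    pos = [s for t, s in zip(is_target, scores) if t]
    neg = [s for t, s in zip(is_target, scores) if not t]
    if not pos or not neg:
        raise ValueError("need at least one target and one non-target pair")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(is_target, scores):
    """Average precision: mean of precision at the rank of each target pair.

    Tied scores are kept in input order; no special tie handling is done.
    """
    n_target = sum(1 for t in is_target if t)
    if n_target == 0:
        raise ValueError("need at least one target pair")
    ranked = sorted(zip(scores, is_target), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, t) in enumerate(ranked, start=1):
        if t:
            hits += 1
            total += hits / rank
    return total / n_target


# Toy data (hypothetical): measured S-scores and a classifier's confidence
# that each pair is a negative interaction.
measured = [-6.0, -3.1, -0.4, 0.2, 1.1, 2.8, -2.9, 0.0]
neg_confidence = [0.9, 0.4, 0.5, 0.1, 0.2, 0.05, 0.8, 0.3]

labels = label_pairs(measured)
is_negative = [lab == "negative" for lab in labels]  # negative vs. rest

auc_negative = roc_auc(is_negative, neg_confidence)
aupr_negative = average_precision(is_negative, neg_confidence)
print(f"AUC = {auc_negative:.3f}, AUPR (average precision) = {aupr_negative:.3f}")
```

The same two functions apply to positive interactions by setting the target to `lab == "positive"` and supplying the corresponding confidences. Because positive and negative interactions are typically a small minority of all pairs, the AUPR tends to be the more demanding of the two measures, which is consistent with the low AUPR values reported for positive interactions.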

