Probabilistic inference of biological networks via data integration.

Rogers MF, Campbell C, Ying Y - Biomed Res Int (2015)

Bottom Line: There is significant interest in inferring the structure of subcellular networks of interaction. Although one pairwise kernel (the tensor product pairwise kernel) appears to work best, different kernels may contribute complementary information about interactions: experiments in S. cerevisiae (yeast) reveal that a weighted combination of pairwise kernels applied to different types of data yields the highest predictive accuracy. Combined with cautious classification and data cleaning, we can achieve predictive accuracies of up to 99.6%.

View Article: PubMed Central - PubMed

Affiliation: Intelligent Systems Laboratory, University of Bristol, Merchant Venturers Building, Bristol BS8 1UB, UK.

ABSTRACT
There is significant interest in inferring the structure of subcellular networks of interaction. Here we consider supervised interactive network inference in which a reference set of known network links and nonlinks is used to train a classifier for predicting new links. Many types of data are relevant to inferring functional links between genes, motivating the use of data integration. We use pairwise kernels to predict novel links, along with multiple kernel learning to integrate distinct sources of data into a decision function. We evaluate various pairwise kernels to establish which are most informative and compare individual kernel accuracies with accuracies for weighted combinations. By associating a probability measure with classifier predictions, we enable cautious classification, which can increase accuracy by restricting predictions to high-confidence instances, and data cleaning that can mitigate the influence of mislabeled training instances. Although one pairwise kernel (the tensor product pairwise kernel) appears to work best, different kernels may contribute complementary information about interactions: experiments in S. cerevisiae (yeast) reveal that a weighted combination of pairwise kernels applied to different types of data yields the highest predictive accuracy. Combined with cautious classification and data cleaning, we can achieve predictive accuracies of up to 99.6%.
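The tensor product pairwise kernel mentioned in the abstract can be sketched in a few lines. The following is a generic illustration of the symmetrised form (the value for two candidate pairs is a sum of products of base-kernel entries over the two possible matchings), not the authors' exact implementation; the toy base kernel `K` over random feature vectors is invented purely for the example:

```python
import numpy as np

def tensor_product_pairwise_kernel(K, pair1, pair2):
    """Symmetrised tensor product pairwise kernel.

    K is a base kernel matrix over individual proteins/genes;
    pair1 = (a, b) and pair2 = (c, d) index two candidate interacting pairs.
    The sum over both matchings makes the result invariant to pair ordering.
    """
    a, b = pair1
    c, d = pair2
    return K[a, c] * K[b, d] + K[a, d] * K[b, c]

# Toy base kernel: a linear kernel over random feature vectors (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
K = X @ X.T

k = tensor_product_pairwise_kernel(K, (0, 1), (2, 3))
```

Because of the symmetrisation, swapping the order of proteins within either pair leaves the kernel value unchanged, which is what makes it suitable for undirected interaction prediction.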

fig2: Typical improvement in accuracy when using a weighted sum of base kernels via MKL. We compare the average performance of the best-performing composite kernel (solid grey bars) with the corresponding base kernels (hashed bars) on data sets of three different sizes. By leveraging information from multiple kernels, the composite kernel provides an accuracy increase of 4% to 5% over the best of the base kernels. Using MKL over all 30 base kernels combined yields a further 1.2% to 1.4% increase (black bars). Differences between the composite kernel and its base kernels are significant at α < 0.001; differences between the two composite kernels are significant at α < 0.01.

Mentions: Second, we compared the relative performance of these composite MKL kernels with their corresponding base kernels, running the same experiment outlined above on the individual base kernels. In general, we see a significant difference between the MKL-weighted kernels and their individual base kernels: for example, the top-performing combined kernel yields accuracy at least 4% higher than the nearest corresponding base kernel (Figure 2). The weights assigned to the constituent kernels roughly track their relative performance: K_P and K_MS yield the highest accuracy and also receive the largest weights (see Table 1), while the two weakest base kernels, K_M and K_YH, receive zero weights and do not contribute to the final composite kernel.
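The weighted combination described above amounts to a convex sum of base kernel matrices, in which zero-weighted kernels drop out entirely. The sketch below is illustrative only: the weights and toy base kernels are made up for the example (in the paper the weights are learned by the MKL optimisation, as reported in Table 1):

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Weighted sum of base kernel matrices, as in multiple kernel learning.

    Weights are normalised to a convex combination; base kernels with zero
    weight (e.g. the weakest data sources) do not contribute at all.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

# Three toy positive semi-definite base kernels (rank-1 outer products);
# a convex combination of PSD matrices is itself PSD, so the composite
# remains a valid kernel.
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 4))
bases = [X[:, [i]] @ X[:, [i]].T for i in range(3)]

# Illustrative weights: the third (weakest) kernel is zero-weighted.
K_mkl = combine_kernels(bases, [0.6, 0.4, 0.0])
```

The zero weight on the third base kernel mirrors the behaviour reported for K_M and K_YH: the MKL solution discards uninformative data sources rather than averaging them in.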

