RecRWR: a recursive random walk method for improved identification of diseases.

Arrais JP, Oliveira JL - Biomed Res Int (2015)

Bottom Line: An open problem for these methods is the ability to combine and take advantage of the wealth of biomedical data publicly available. We present a comprehensive validation that demonstrates the advantage of the proposed approach, Recursive Random Walk with Restarts (RecRWR). The obtained results outline the superiority of the proposed approach, RecRWR, in identifying disease candidates, especially with high levels of biological noise and benefiting from all data available.


Affiliation: Department of Informatics Engineering (DEI), Centre for Informatics and Systems of the University of Coimbra (CISUC), University of Coimbra, 3030-290 Coimbra, Portugal.

ABSTRACT
High-throughput methods such as next-generation sequencing or DNA microarrays lack precision, as they return hundreds of genes for a single disease profile. Several computational methods applied to networks of physical protein-protein interactions (PPI) have been used successfully to identify the best disease candidates for each expression profile. An open problem for these methods is the ability to combine and take advantage of the wealth of publicly available biomedical data. We propose an enhanced method to improve the selection of the best disease targets in a multilayer biomedical network that integrates PPI data annotated with stable knowledge from OMIM diseases and GO biological processes. We present a comprehensive validation that demonstrates the advantage of the proposed approach, Recursive Random Walk with Restarts (RecRWR). The results show that RecRWR is superior in identifying disease candidates, especially under high levels of biological noise, and that it benefits from all the data available.



fig1: Comparison of the RWR method using PPI data and PPI enriched in biological terms.

Mentions: Previous applications of RWR in molecular biology have typically concentrated on PPI networks. One would expect that including additional data would improve the overall result. Figure 1 compares the relative frequency of the ranks for each of the analysed datasets, for two of the tested methods (RWR over PPI data only and RWR over the whole network) and for four levels of randomness. The graph shows no improvement from adding external annotations to the original PPI network. Indeed, for the original dataset, without added randomness, there are no perceptible differences between the two methods. With progressive levels of randomness, the absence of improvement turns into a clear deterioration. For instance, when 20% of the genes in the dataset are random, RWR over PPI ranks the disease in the top 3 in 55% of cases, whereas with RWR over all data this frequency drops to 48%. For 60% randomness, RWR over PPI ranks the disease in the top 5 in 35% of cases, whereas with RWR over all data the frequency drops to 23%. These results were the primary motivation for the work presented in this paper, as they show that the plain RWR method is not well suited to dealing with multiple types of biological data.
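
For readers unfamiliar with the baseline, the following is a minimal sketch of a random walk with restart in its commonly used iterative form, p(t+1) = (1 - r) W p(t) + r p(0), together with the top-k frequency used to summarise ranks. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy network, the restart probability of 0.7 and the convergence tolerance are all hypothetical choices made for this example, and the excerpt does not specify the parameters used in the paper.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds.

    adjacency: dict mapping each node to the set of its neighbours
               (undirected, unweighted; an assumption of this sketch).
    seeds:     nodes where the walk starts and restarts (e.g. profile genes).
    restart:   probability of jumping back to the seeds at each step.
    """
    nodes = sorted(adjacency)
    seeds = set(seeds)
    # Initial distribution p0: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: r * p0
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node spreads (1 - r) of its mass evenly to neighbours.
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                continue  # an isolated node has nowhere to send its mass
            share = (1.0 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def top_k_frequency(ranks, k):
    """Fraction of test cases in which the true candidate was ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical toy network: A and B play the role of seed genes.
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
seed_nodes = {"A", "B"}
scores = random_walk_with_restart(toy_network, seed_nodes)

# Rank the non-seed nodes by decreasing steady-state probability.
ranking = sorted((n for n in toy_network if n not in seed_nodes),
                 key=lambda n: -scores[n])
```

In an evaluation like the one summarised in Figure 1, the rank of the known disease would be recorded for each dataset and each noise level, and `top_k_frequency` would then give figures of the kind quoted above (for example, the share of cases ranked in the top 3).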

