Mining protein interactomes to improve their reliability and support the advancement of network medicine.

Alanis-Lobato G - Front Genet (2015)

Bottom Line: The protein networks that are currently available are incomplete and a significant percentage of their interactions are false positives. Since diseases are rarely caused by the malfunction of a single protein, having a more complete and reliable interactome is crucial in order to identify groups of inter-related proteins involved in disease etiology. In this article, an important number of network mining tools is reviewed, together with resources from which reliable protein interactomes can be constructed.


Affiliation: Faculty of Biology, Institute of Molecular Biology, Johannes Gutenberg University of Mainz, Mainz, Germany; Integrative Systems Biology Lab, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.

ABSTRACT
High-throughput detection of protein interactions has had a major impact on our understanding of the intricate molecular machinery underlying the living cell, and has permitted the construction of very large protein interactomes. However, the protein networks that are currently available are incomplete, and a significant percentage of their interactions are false positives. Fortunately, the structural properties observed in good-quality social or technological networks are also present in biological systems. This has encouraged the development of tools that improve the reliability of protein networks and predict new interactions based solely on the topological characteristics of their components. Since diseases are rarely caused by the malfunction of a single protein, a more complete and reliable interactome is crucial for identifying groups of inter-related proteins involved in disease etiology. These system components can then be targeted with minimal collateral damage. This article reviews a substantial number of network mining tools, together with resources from which reliable protein interactomes can be constructed. In addition, a few representative examples of how molecular and clinical data can be integrated to deepen our understanding of pathogenesis are discussed.



Figure 1: (A) In protein interactomes, only a few proteins, known as hubs, have a high number of interactors (node degree), while the rest interact with only a small number of proteins (left). In addition, PINs are highly clustered (middle) and every protein is easily reachable from anywhere in the network (right), compared to graphs with the same number of randomly linked nodes. (B) The number of common neighbors (normalized by the maximum) and the Gene Ontology (GO) similarity (biological process, or BP, shown) are higher for interacting protein pairs than for disconnected protein pairs in the observed network. (C) The goal of neighborhood-based link prediction and reliability assessment is to assign a likelihood score to the observed or potential interaction between two proteins x and y. The formulae for representative link predictors are listed and applied to the toy network on the left. Γ(x) is the set of neighbors of node x, a second neighborhood set is defined in the same way but also includes x itself, and the local community links (LCL) are highlighted in red. (D) There is compelling evidence that complex networks, like PINs, lie on low-dimensional manifolds embedded in high-dimensional space. When protein networks are mapped to low dimensions, good candidate interactions lie in close proximity. (E) The link prediction performance of several of the topological techniques discussed in this review, measured by the minimum Area Under the Sparsification curve (AUS) across four networks (for details of these datasets, see Cannistraci et al., 2013b). Red bars correspond to methods proposed for networks in general and green bars to methods proposed for bio-networks. (F) High-quality PI resources, like STRING or HIPPIE, assign a confidence score to each of their reported interactions, based on the different evidence sources supporting them.

Mentions: Today we have access to vast Protein Interaction Networks (PINs) from different organisms, thanks to high-throughput experimental techniques that are often improved variations of yeast two-hybrid screening or of co-immunoprecipitation followed by mass spectrometry (Vidal et al., 2011). Nevertheless, these networks are incomplete and contain a significant number of false-positive interactions (Kuchaiev et al., 2009). Fortunately, their structural properties do not differ from those observed in good-quality social or technological networks (Albert and Barabási, 2002; Liu et al., 2011; Cannistraci et al., 2013a) (Figure 1A). These topological similarities have prompted the development of tools, based on node-connectivity properties, aimed at improving the reliability and completeness of complex networks (Cannistraci et al., 2013a).
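
To make the idea of a tool "based on node-connectivity properties" concrete, the following minimal Python sketch scores protein pairs using two widely known neighborhood-based indices: the common-neighbors count |Γ(x) ∩ Γ(y)| and the Jaccard index |Γ(x) ∩ Γ(y)| / |Γ(x) ∪ Γ(y)|, where Γ(x) is the set of neighbors of x as in Figure 1C. The toy network, the protein labels and all function names are assumptions made for illustration only. They are not the network or the exact set of predictors shown in the figure.

```python
from itertools import combinations

# Hypothetical toy interactome; protein labels and edges are invented for illustration.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def build_neighbors(edge_list):
    """Map each node x to its neighbor set, i.e. Γ(x)."""
    neighbors = {}
    for u, v in edge_list:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def common_neighbors(neighbors, x, y):
    """Common-neighbors score: |Γ(x) ∩ Γ(y)|."""
    return len(neighbors[x] & neighbors[y])


def jaccard(neighbors, x, y):
    """Jaccard score: |Γ(x) ∩ Γ(y)| / |Γ(x) ∪ Γ(y)| (0.0 if both sets are empty)."""
    union = neighbors[x] | neighbors[y]
    return len(neighbors[x] & neighbors[y]) / len(union) if union else 0.0


def rank_candidates(neighbors, score):
    """Score every unconnected pair and sort from most to least likely interaction."""
    nodes = sorted(neighbors)
    candidates = [(x, y) for x, y in combinations(nodes, 2) if y not in neighbors[x]]
    scored = [(score(neighbors, x, y), x, y) for x, y in candidates]
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


neighbors = build_neighbors(edges)
for value, x, y in rank_candidates(neighbors, common_neighbors):
    print(f"{x}-{y}: common neighbors = {value}, Jaccard = {jaccard(neighbors, x, y):.2f}")
```

The same scoring functions can be applied to pairs that are already connected, in which case a low score flags an observed interaction as a possible false positive rather than proposing a new one.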

