Statistical Approaches for the Construction and Interpretation of Human Protein-Protein Interaction Network

View Article: PubMed Central - PubMed

ABSTRACT

The overall goal is to establish a reliable human protein-protein interaction (PPI) network and to develop computational tools that characterize the network and the role of individual proteins in the context of network topology and expression status. A novel feature of our approach is that we assign a confidence measure to each derived interacting pair and account for that confidence in our network analysis. We integrated experimental data to infer the human PPI network. Our model treated the true interaction status (yes versus no) of any given pair of human proteins as a latent variable whose value was not observed. The experimental data were manifestations of this interaction status and provided evidence about the likelihood of an interaction. The confidence assigned to an interaction depends on the strength and consistency of the evidence.



Figure 2: The optimization of $Q_N$ and $Q_S$ for different ε. The red and green lines correspond to $Q_N$ and $Q_S$, respectively.

Mentions: First, the human PPI network was partitioned into subnetworks, or modules, by SCAN, which groups proteins into modules according to the similarity of their common neighbors. We then used modularity and similarity-based modularity as quality metrics. Modularity is a statistical measure of the quality of a network clustering and is defined as

$$Q_N = \sum_{s=1}^{N_C} \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right], \tag{11}$$

where $N_C$ is the number of clusters, $L$ is the number of edges, $l_s$ is the number of edges within the $s$th module, and $d_s$ is the sum of the degrees of the nodes in the $s$th module. The best clustering can be obtained by optimizing $Q_N$. Similarity-based modularity supplements modularity and is defined as

$$Q_S = \sum_{i=1}^{N_C} \left[ \frac{IS_i}{TS} - \frac{DS_i^2}{TS^2} \right]. \tag{12}$$

As shown in Figure 2, the modularity $Q_N$ decreased monotonically from a point near zero and could not be maximized. The similarity-based modularity $Q_S$, in contrast, reached its maximum at the threshold ε = 0.61. With ε = 0.61, the reliable human PPI network was cut into 241 modules. At the significance level α = 0.05, the p-value of each module was calculated as

$$p\text{-value} = \sum_{i=m}^{n} \frac{\binom{M}{i} \binom{N-M}{n-i}}{\binom{N}{n}}, \tag{13}$$

where $N$ is the number of all proteins and $M$ is the number of all IDPs. Of the 241 modules, 33 were significantly associated with IDPs.
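As an illustration of equations (11) and (13), the following is a minimal Python sketch, not the authors' implementation. The function names `modularity` and `enrichment_p_value`, the edge-list input, and the `membership` dictionary are hypothetical choices made for this example. The text does not define n and m; the sketch assumes that n is the number of proteins in a module and m is the number of IDPs observed in that module, which is the usual reading of a hypergeometric tail probability.

```python
from math import comb


def modularity(edges, membership):
    """Q_N = sum_s [ l_s / L - (d_s / (2L))^2 ] for an undirected, unweighted graph.

    edges:      list of (u, v) pairs, each undirected edge listed once
    membership: dict mapping every node to its module label (assumed input format)
    """
    L = len(edges)
    internal = {}  # l_s: edges with both endpoints in module s
    degree = {}    # d_s: summed degree of the nodes in module s
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Modules without any incident edge contribute zero and are skipped.
    return sum(internal.get(s, 0) / L - (d / (2 * L)) ** 2
               for s, d in degree.items())


def enrichment_p_value(N, M, n, m):
    """Upper-tail hypergeometric probability of equation (13).

    N: all proteins, M: all IDPs,
    n: proteins in the module (assumed), m: IDPs in the module (assumed).
    """
    # math.comb returns 0 when i > M, so impossible terms vanish on their own.
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(m, n + 1))
    return tail / comb(N, n)


# Toy network: two triangles joined by a single bridging edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_membership = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
toy_q = modularity(toy_edges, toy_membership)

# Toy enrichment: 10 proteins, 4 IDPs, a module of 3 proteins containing 2 IDPs.
toy_p = enrichment_p_value(N=10, M=4, n=3, m=2)
```

Under these assumptions, a module would be called significantly associated with IDPs when `enrichment_p_value` falls below α = 0.05. The similarity-based modularity of equation (12) is left out because the text does not define its terms.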

