Network analysis of human protein location.

Kumar G, Ranganathan S - BMC Bioinformatics (2010)

Bottom Line: PLCP analysis revealed that protein interactions are largely restricted to the same SCL, though significant cross-compartment interactions are seen for nuclear proteins. The MLPI network differs significantly from the PPI network in its SCL distribution. The PPI network shows passive protein interaction across different subcellular compartments, possibly due to its high false positive rate; this seems to be absent in the MLPI network, as the MLPI network has evolved to maintain high substrate specificity for proteins.


Affiliation: ARC Centre of Excellence in Bioinformatics and Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney NSW, Australia. gaurav.kumar@mq.edu.au

ABSTRACT

Background: Understanding cellular systems requires the knowledge of a protein's subcellular localization (SCL). Although experimental and predicted data for protein SCL are archived in various databases, SCL prediction remains a non-trivial problem in genome annotation. Current SCL prediction tools use amino-acid sequence features and text mining approaches. A comprehensive analysis of protein SCL in human PPI and metabolic networks for various subcellular compartments is necessary for developing a robust SCL prediction methodology.

Results: Based on protein-protein interaction (PPI) and metabolite-linked protein interaction (MLPI) networks of proteins, we have compared, contrasted and analysed the statistical properties across different subcellular compartments. We integrated PPI and metabolic datasets with SCL information of human proteins from LOCATE and GOA (Gene Ontology Annotation) and estimated three statistical properties: Chi-square (χ2) test, Paired Localisation Correlation Profile (PLCP) and network topological measures. For the PPI network, Pearson's chi-square test shows that for the same SCL category, twice as many interacting protein pairs are observed as expected when compared to non-interacting protein pairs (χ2 = 1270.19, P-value < 2.2 × 10^-16), whereas for MLPI, metabolite-linked protein pairs having the same SCL are observed 20% more often than expected, compared to non-metabolite-linked proteins (χ2 = 110.02, P-value < 2.2 × 10^-16). To address the issue of proteins with multiple SCLs, we have specifically used the PLCP measure. PLCP analysis revealed that protein interactions are largely restricted to the same SCL, though significant cross-compartment interactions are seen for nuclear proteins. Metabolite-linked protein pairs are restricted to specific compartments such as the mitochondrion (P-value < 6.0e-07), the lysosome (P-value < 4.7e-05) and the Golgi apparatus (P-value < 1.0e-15). These findings indicate that the metabolic network adds value to the information in the PPI network for the localisation process of proteins in human subcellular compartments.

Conclusions: The MLPI network differs significantly from the PPI network in its SCL distribution. The PPI network shows passive protein interaction across different subcellular compartments, possibly due to its high false positive rate; this seems to be absent in the MLPI network, as the MLPI network has evolved to maintain high substrate specificity for proteins.
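
The same-SCL comparison reported in the Results can be illustrated with a minimal sketch of Pearson's chi-square test on a 2×2 table (interacting vs. non-interacting pairs, same vs. different SCL). The function name chi_square_2x2 and all counts below are invented for illustration and are not the paper's data. The sketch also assumes no continuity correction, which the abstract does not specify. For one degree of freedom the P-value can be computed from the complementary error function.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, NOT taken from the paper.
# Rows: interacting / non-interacting pairs; columns: same SCL / different SCL.
same_int, diff_int, same_non, diff_non = 200, 800, 1000, 9000
chi2, p_value = chi_square_2x2(same_int, diff_int, same_non, diff_non)

# Observed-to-expected ratio for interacting pairs sharing an SCL.
total = same_int + diff_int + same_non + diff_non
expected_same_int = (same_int + diff_int) * (same_int + same_non) / total
print(f"chi2 = {chi2:.2f}, P = {p_value:.3g}")
print(f"observed/expected (same SCL, interacting) = {same_int / expected_same_int:.2f}")
```

The observed-to-expected ratio in the last line corresponds to the kind of statement made in the Results ("twice as many ... as expected"). With very large χ2 values the computed P-value underflows towards zero, which is why such results are usually reported as an upper bound.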

© Copyright Policy - open-access

Figure 4: Z-score correlation profile. The Z-score correlation for LOCATE and GOA SCLs in the major subcellular compartments (see Additional file 1 for details) for the physically interacting and metabolite-linked protein pairs. A and B are LOCATE SCL correlation profiles, whereas C and D are GOA correlation profiles. Refer to Additional file 2 for Z-score values.

Mentions: We further tested whether the network of interacting protein pairs differs from a random network by calculating the Z-score between the given compartments (described in the Methods section). The random network was simulated by rewiring the real network such that the degree of each node remains the same [14]. The P-value can then be obtained by comparing the Z-score to a standard normal distribution. Comparing with a "properly" randomized network ensemble (1000 networks in our case) allows us to concentrate on those statistically significant localisation patterns of these complex interaction networks that are likely to reflect the conserved interaction pairs across different subcellular compartments. The statistical significance of the correlation profiles was calculated for the PPI and metabolic networks for each pair of compartments. The Z-score profile scales differently for the physically interacting and metabolite-linked protein pairs (Figure 4). The PPI network Z-scores (Figures 4A, C) suggest that, compared to random networks, the number of interacting protein pairs co-locating in the same compartment is significant for EC (P-value < 9.8e-10), MC (P-value < 3.7e-05), LS (P-value < 4.5e-12), ES (P-value < 1.8e-09) and CV (P-value < 1.9e-35) for the LOCATE dataset (Figure 4A and Additional file 2). We also observed a significant correlation for CV proteins to interact with EC proteins (P-value < 5.4e-06) but not the reverse, i.e. EC proteins do not interact with CV proteins at a significance level of P-value < 0.01. Similarly, TJ proteins are more likely to interact with PM proteins (P-value < 4.3e-05), whereas the likelihood of PM proteins interacting with TJ proteins is less significant (P-value ~ 0.01). GOA SCL assignment (Figure 4C) suggests that statistically significant protein pair interactions occur within TJ (P-value ~ 0) and EC (P-value < 1.36e-07). Protein pairs within the ES compartment seem to have a weak interaction (P-value ~ 0.0007). Similar weak interactions have been noticed between proteins in the ER compartment and those of the GA (P-value ~ 0.007) (Additional file 2).
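
The randomisation test described above can be sketched as follows. This is an illustrative outline under several assumptions, not the authors' implementation. It uses repeated double-edge swaps as one common way to preserve node degrees (the procedure in reference [14] may differ), counts pairs symmetrically (so it would not reproduce the directional differences reported above), and takes a one-sided upper-tail P-value from the standard normal distribution. The function names, the toy network and the "NU" label are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomisation of a simple undirected graph via double-edge swaps."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:  # pick one of the two possible re-pairings
            c, d = d, c
        if a == d or c == b:  # would create a self-loop
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:  # would create a duplicate edge
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def count_pairs(edges, loc, comp_a, comp_b):
    """Count edges joining a protein in comp_a to a protein in comp_b."""
    n = 0
    for u, v in edges:
        if (comp_a in loc[u] and comp_b in loc[v]) or (comp_b in loc[u] and comp_a in loc[v]):
            n += 1
    return n


def z_score(edges, loc, comp_a, comp_b, n_random=1000, seed=0):
    """Z-score of the observed pair count against a rewired ensemble, with a one-sided P-value."""
    rng = random.Random(seed)
    observed = count_pairs(edges, loc, comp_a, comp_b)
    null = [
        count_pairs(rewire(edges, 10 * len(edges), rng), loc, comp_a, comp_b)
        for _ in range(n_random)
    ]
    mu, sigma = mean(null), pstdev(null)
    if sigma == 0:  # ensemble shows no variation; Z is undefined
        return float("nan"), float("nan")
    z = (observed - mu) / sigma
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal
    return z, p


# Toy network (hypothetical): two compartments with mostly internal edges.
mc = [f"m{i}" for i in range(6)]
nu = [f"n{i}" for i in range(6)]
loc = {p: {"MC"} for p in mc}          # protein -> set of compartments
loc.update({p: {"NU"} for p in nu})
edges = [(mc[i], mc[(i + 1) % 6]) for i in range(6)]
edges += [(nu[i], nu[(i + 1) % 6]) for i in range(6)]
edges += [("m0", "m3"), ("n0", "n3"), ("m1", "n1")]

z, p = z_score(edges, loc, "MC", "MC", n_random=200)
print(f"Z = {z:.2f}, one-sided P = {p:.3g}")
```

Storing each protein's localisation as a set lets a protein with multiple SCLs contribute to more than one compartment pair. The ensemble size defaults to 1000 to match the text, while the toy call uses a smaller value to keep the run short.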

