Ambiguity in Social Network Data for Presence, Sensitive-Attribute, Degree and Relationship Privacy Protection.

Rajaei M, Haghjoo MS, Miyaneh EK - PLoS ONE (2015)

Bottom Line: Furthermore, most existing approaches rely on generalization and node clustering and so may entail significant information loss, as all properties of all members of each group are generalized to the same value. We also show how to measure different privacy requirements in ASN. Simulation results on real and synthetic datasets demonstrate that our framework, which protects against four types of private information disclosure, preserves data utility in the tabular, topological, and spectral aspects of networks at a satisfactory level.


Affiliation: Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran.

ABSTRACT
Maintaining privacy in network data publishing is a major challenge, because known characteristics of individuals can be used to extract new information about them. Recently, researchers have developed privacy methods based on k-anonymity and l-diversity to prevent re-identification or sensitive-label disclosure through certain structural information. However, most of these studies consider only structural information and were developed for undirected networks. Furthermore, most existing approaches rely on generalization and node clustering and so may entail significant information loss, as all properties of all members of each group are generalized to the same value. In this paper, we introduce a framework for protecting the sensitive attributes, degrees (the number of connected entities), and relationships of individuals, as well as their presence, in directed social network data whose nodes carry attributes. First, we define a privacy model that specifies privacy requirements for the above private information. Then, we introduce the technique of Ambiguity in Social Network data (ASN), based on anatomy, which specifies how to publish the social network data. To employ ASN, individuals are partitioned into groups; ASN then publishes the exact property values of the individuals in each group, tagged with a common group ID, in several tables. The lossy join of these tables on the group ID injects uncertainty into any reconstruction of the original network. We also show how to measure different privacy requirements in ASN. Simulation results on real and synthetic datasets demonstrate that our framework, which protects against four types of private information disclosure, preserves data utility in the tabular, topological, and spectral aspects of networks at a satisfactory level.
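To make the anatomy idea concrete, the following is a minimal sketch (not the paper's implementation; the table layouts, column names, and sample values are illustrative assumptions, not the published DT/SVT schemas) of how releasing exact values in separate tables keyed only by a group ID makes the join back to individuals lossy:

from itertools import permutations

# Hypothetical individuals and their sensitive values, already partitioned into groups.
groups = {
    1: [("Alice", "HIV"), ("Bob", "Flu")],
    2: [("Carol", "Cancer"), ("Dave", "Flu")],
}

# Published tables: one links individuals to group IDs, the other lists the
# exact sensitive values per group; the direct individual-to-value link is removed.
member_table = [(name, gid) for gid, members in groups.items() for name, _ in members]
value_table = [(gid, value) for gid, members in groups.items() for _, value in members]

# The lossy join on group ID: within each group, every matching of members to
# values is consistent with the published tables, which is where the
# ambiguity (and hence the privacy protection) comes from.
for gid in groups:
    names = [n for n, g in member_table if g == gid]
    values = [v for g, v in value_table if g == gid]
    print(gid, [list(zip(names, p)) for p in permutations(values)])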

No MeSH data available.


Figure pone.0130693.g005: Sample graphs of the anonymized network of Fig 2.

Mentions: Graph topological properties: The anonymized network data produced by ASN is analyzed in the same way as in network generalization methods [6,4], whereby sample graphs are generated from the published data. We therefore randomly generate 100 sample graphs from the released tables (DT and SVT). As mentioned before, graphs can be reconstructed from these tables: to reconstruct each sample graph, one member of each group's set of valid edge choices is selected at random. These graphs reflect the structural information of the network. We run topological queries on them and compute the average response of each query over all sample graphs. Fig 5 shows two sample graphs of the anonymized network from Fig 2. To evaluate the data utility preserved by the proposed framework, we compute the information loss of the following measures on the reconstructed graphs: diameter, shortest path length, clustering coefficient, closeness, betweenness, and the sizes of the maximum strongly and weakly connected components.
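A minimal sketch of this evaluation loop, assuming a simplified, hypothetical representation of each group's valid edge choices (the input format below is not the released DT/SVT layout) and using networkx for the directed-graph measures:

import random
import networkx as nx

def sample_reconstruction(groups):
    """Assemble one plausible directed graph by picking, for every group,
    one candidate edge set uniformly at random. `groups` maps a group ID
    to a list of candidate edge sets (each a list of (source, target)
    pairs); this input format is an assumption for illustration."""
    g = nx.DiGraph()
    for candidates in groups.values():
        g.add_edges_from(random.choice(candidates))
    return g

def topological_summary(g):
    """Measures listed above, computed on one sampled (non-empty) graph."""
    wcc = max(nx.weakly_connected_components(g), key=len)
    scc = max(nx.strongly_connected_components(g), key=len)
    core = g.subgraph(scc)  # diameter / shortest paths need strong connectivity
    return {
        "diameter": nx.diameter(core) if len(core) > 1 else 0,
        "avg_shortest_path": nx.average_shortest_path_length(core) if len(core) > 1 else 0.0,
        "clustering_coefficient": nx.average_clustering(g),
        "closeness": sum(nx.closeness_centrality(g).values()) / g.number_of_nodes(),
        "betweenness": sum(nx.betweenness_centrality(g).values()) / g.number_of_nodes(),
        "max_scc_size": len(scc),
        "max_wcc_size": len(wcc),
    }

def average_over_samples(groups, n_samples=100):
    """Average each measure over n_samples randomly reconstructed graphs;
    comparing these averages with the same measures on the original
    network then gives the information loss."""
    totals = {}
    for _ in range(n_samples):
        for key, value in topological_summary(sample_reconstruction(groups)).items():
            totals[key] = totals.get(key, 0.0) + value
    return {key: value / n_samples for key, value in totals.items()}

# Hypothetical usage: two groups, each with two valid edge-choice options.
# groups = {1: [[("a", "b"), ("b", "c")], [("a", "c"), ("c", "b")]],
#           2: [[("c", "a")], [("b", "a")]]}
# print(average_over_samples(groups, n_samples=100))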

