Ambiguity in Social Network Data for Presence, Sensitive-Attribute, Degree and Relationship Privacy Protection.

Rajaei M, Haghjoo MS, Miyaneh EK - PLoS ONE (2015)

Bottom Line: Furthermore, most existing approaches rely on generalization and node clustering so may entail significant information loss as all properties of all members of each group are generalized to the same value. We also show how to measure different privacy requirements in ASN. Simulation results on real and synthetic datasets demonstrate that our framework, which protects from four types of private information disclosure, preserves data utility in tabular, topological and spectrum aspects of networks at a satisfactory level.


Affiliation: Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran.

ABSTRACT
Maintaining privacy in network data publishing is a major challenge, because known characteristics of individuals can be used to extract new information about them. Recently, researchers have developed privacy methods based on k-anonymity and l-diversity to prevent re-identification or sensitive label disclosure through certain structural information. However, most of these studies have considered only structural information and have been developed for undirected networks. Furthermore, most existing approaches rely on generalization and node clustering, so they may entail significant information loss, as all properties of all members of each group are generalized to the same value. In this paper, we introduce a framework for protecting sensitive attributes, degree (the number of connected entities), and relationships, as well as the presence of individuals, in directed social network data whose nodes contain attributes. First, we define a privacy model that specifies privacy requirements for the above private information. Then, we introduce the technique of Ambiguity in Social Network data (ASN), based on anatomy, which specifies how to publish social network data. To employ ASN, individuals are partitioned into groups. ASN then publishes the exact values of the properties of the individuals in each group, tagged with a common group ID, in several tables. Joining those tables on group ID is lossy, which injects uncertainty into any reconstruction of the original network. We also show how to measure different privacy requirements in ASN. Simulation results on real and synthetic datasets demonstrate that our framework, which protects against four types of private information disclosure, preserves data utility in tabular, topological and spectrum aspects of networks at a satisfactory level.
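The idea of publishing exact values in several tables that share only a group ID can be illustrated with a toy sketch. This is not the authors' ASN algorithm: it covers a single sensitive attribute only (no degree or relationship tables), uses fixed-size groups in input order, and the records, the `publish` and `lossy_join` functions, and the `group_size` parameter are all hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (name, age, sensitive_value). Not data from the paper.
people = [
    ("a", 25, "flu"),
    ("b", 31, "asthma"),
    ("c", 47, "diabetes"),
    ("d", 52, "flu"),
]

def publish(records, group_size):
    """Split records into fixed-size groups and emit two tables sharing a group ID."""
    identity_table, sensitive_table = [], []
    for i, (name, age, sensitive) in enumerate(records):
        gid = i // group_size  # assumption: naive grouping by position
        identity_table.append((gid, name, age))
        sensitive_table.append((gid, sensitive))
    return identity_table, sensitive_table

def lossy_join(identity_table, sensitive_table):
    """Join on group ID only: each person matches every sensitive value in the group."""
    by_group = defaultdict(list)
    for gid, sensitive in sensitive_table:
        by_group[gid].append(sensitive)
    # Sorting hides the original row order inside each group.
    return {name: sorted(by_group[gid]) for gid, name, _age in identity_table}

identity, sensitive = publish(people, group_size=2)
candidates = lossy_join(identity, sensitive)
print(candidates["a"])  # both values of group 0 remain possible for "a"
```

In this sketch every published value is exact, yet an observer joining the two tables cannot tell which of the group's sensitive values belongs to which member. A real release would also need to shuffle rows within each group and choose groups so that the values in a group are sufficiently diverse.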

pone.0130693.g006: Information loss and privacy of released social network data.

Mentions: Let Q be a query, and let Q(N) and Q(N*) be the exact and approximate results of applying Q to the original network data N and to the anonymized network data N* produced by the proposed framework, respectively. The relative error is $\mathrm{Error} = |Q(N) - Q(N^*)| / Q(N)$. This relative error is computed for each aggregate tabular query. A lower relative error indicates less information loss and higher data utility. Fig 6(a) and 6(b) show the minimum, average and maximum error over all generated random equality and range queries in both network datasets. For each kind of query, the error on the Random network is lower than on the URVEmail network. In addition, for each network dataset, the relative error of equality queries is higher than that of range queries. We define the selectivity of a query as the number of individuals that satisfy all its conditions. Since the selectivity of an equality query is lower than that of a range query, the exact result of an equality query is smaller. By the Error formula, the same small change therefore causes a larger relative error on a small actual value than on a large one. Fig 6(a) and 6(b) also show the average relative error with respect to the number of properties involved in the query conditions. As shown, increasing the number of involved properties increases the relative error because query selectivity decreases.
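The effect of selectivity on relative error can be checked with a small calculation. The sketch below assumes the usual definition, Error = |Q(N) - Q(N*)| / Q(N); the function name `relative_error` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def relative_error(exact, approx):
    """Relative error of an approximate aggregate result against the exact one."""
    if exact == 0:
        # The ratio is undefined when the exact answer is zero.
        raise ValueError("exact result must be non-zero")
    return abs(exact - approx) / exact

# Same absolute deviation (2 individuals) on two hypothetical count queries.
low_selectivity = relative_error(exact=10, approx=12)       # e.g. an equality query
high_selectivity = relative_error(exact=1000, approx=1002)  # e.g. a range query

print(low_selectivity, high_selectivity)
```

With an identical absolute deviation, the query whose exact answer is small ends up with a relative error two orders of magnitude larger, which matches the reasoning given for equality versus range queries.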

