Ensuring Confidentiality of Geocoded Health Data: Assessing Geographic Masking Strategies for Individual-Level Data.

Zandbergen PA - Adv Med (2014)

Bottom Line: One commonly used technique to protect confidentiality when releasing individual-level geocoded data is geographic masking. This typically consists of applying a certain amount of random perturbation in a systematic manner to reduce the risk of reidentification. Despite recent progress, no universally accepted or endorsed geographic masking technique has emerged.


Affiliation: Department of Geography, University of New Mexico, Albuquerque, NM 87131, USA.

ABSTRACT
Public health datasets increasingly use geographic identifiers such as an individual's address. Geocoding these addresses often provides new insights since it becomes possible to examine spatial patterns and associations. Address information is typically considered confidential and is therefore not released or shared with others. Publishing maps with the locations of individuals, however, may also breach confidentiality since addresses and associated identities can be discovered through reverse geocoding. One commonly used technique to protect confidentiality when releasing individual-level geocoded data is geographic masking. This typically consists of applying a certain amount of random perturbation in a systematic manner to reduce the risk of reidentification. A number of geographic masking techniques have been developed, as well as methods to quantify the risk of reidentification associated with a particular masking method. This paper presents a review of the current state of the art in geographic masking, summarizing the various methods and their strengths and weaknesses. Despite recent progress, no universally accepted or endorsed geographic masking technique has emerged. Researchers are nevertheless publishing maps that use geographic masking of confidential locations. Any researcher publishing such maps is advised to become familiar with the different masking techniques available and their associated reidentification risks.
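
As a rough illustration of what "random perturbation" can mean in practice, the sketch below displaces each point by a random distance in a random direction. It is a generic example and not a method endorsed by the paper. The function name `mask_point`, the parameters `max_dist` and `min_dist`, and the sample coordinates are all assumptions made for illustration, and coordinates are assumed to be in a projected system with units of meters.

```python
import math
import random


def mask_point(x, y, max_dist, min_dist=0.0, rng=random):
    """Return (x, y) displaced by a random offset.

    Hypothetical helper: the direction is drawn uniformly from [0, 2*pi)
    and the distance uniformly from [min_dist, max_dist]. A min_dist > 0
    keeps the masked point from landing on or very near the original.
    """
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dist = rng.uniform(min_dist, max_dist)
    return x + dist * math.cos(angle), y + dist * math.sin(angle)


# Made-up projected coordinates (meters); not real case locations.
original_points = [(350000.0, 3880000.0), (351250.0, 3881100.0)]

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
masked_points = [
    mask_point(x, y, max_dist=500.0, min_dist=50.0, rng=rng)
    for x, y in original_points
]
```

Drawing the distance uniformly concentrates masked points closer to the original than a draw that is uniform over area would; which behavior is preferable depends on the masking method chosen and its reidentification risk.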


Figure 3: Spatial aggregation of individual cases using census enumeration units. Individual geocoded locations (left) are aggregated using census tracts (right). The count of the number of cases per census tract is used to determine relevant population-weighted indices, such as the number of cases per 10,000 residents. Determining incidence or disease rates, as opposed to raw counts, is one of the primary reasons for aggregation. As a secondary benefit, spatial aggregation greatly reduces the reidentification risk.

Mentions: Another commonly used solution is to release the data in spatially aggregated form [9]. This is analogous to reporting summary data in tabular form for selected subsets of the original data. For individual-level geocoded data, aggregation is typically accomplished by combining individual locations within a meaningful spatial unit. This could consist of local or regional jurisdictions, such as cities, counties, or census enumeration units. Figure 3 illustrates the basic process for spatial aggregation. To preserve confidentiality, only the aggregated dataset is published or shared.
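
The aggregation step described above can be sketched as follows. This is a minimal example under stated assumptions, not the paper's implementation: it assumes each case has already been assigned a census tract identifier (the point-in-polygon step is omitted), and the function name `aggregate_cases`, the tract IDs, and the population figures are hypothetical.

```python
from collections import Counter


def aggregate_cases(case_tracts, tract_population, per=10_000):
    """Count cases per tract and express them as a rate per `per` residents.

    case_tracts: iterable with one tract ID per individual case.
    tract_population: dict mapping tract ID to resident population.
    """
    counts = Counter(case_tracts)
    summary = {}
    for tract, population in tract_population.items():
        n = counts.get(tract, 0)
        # A rate is undefined for a tract with no residents.
        rate = n * per / population if population > 0 else None
        summary[tract] = {"cases": n, "rate": rate}
    return summary


# Made-up example: five cases across four tracts.
case_tracts = ["T1", "T1", "T2", "T1", "T3"]
tract_population = {"T1": 4000, "T2": 2500, "T3": 5000, "T4": 3000}

summary = aggregate_cases(case_tracts, tract_population)
for tract in sorted(summary):
    print(tract, summary[tract]["cases"], summary[tract]["rate"])
```

Only `summary` would be published or shared; the individual-level list `case_tracts` and the underlying point locations stay confidential.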

