A comparison of spatial clustering and cluster detection techniques for childhood leukemia incidence in Ohio, 1996-2003.

Wheeler DC - Int J Health Geogr (2007)

Bottom Line: Numerous studies in the literature have focused on childhood leukemia because of its relatively large incidence among children compared with other malignant diseases and substantial public concern over elevated leukemia incidence. We found some evidence, although inconclusive, of significant local clusters in childhood leukemia in Ohio, but no significant overall clustering. The findings are consistent for the different tests of global clustering, where no significant clustering is demonstrated with any of the techniques when all age cases are considered together.


Affiliation: Department of Biostatistics, Emory University, Atlanta, GA, USA. dcwheel@sph.emory.edu

ABSTRACT

Background: Spatial cluster detection is an important tool in cancer surveillance to identify areas of elevated risk and to generate hypotheses about cancer etiology. There are many cluster detection methods used in spatial epidemiology to investigate suspicious groupings of cancer occurrences in regional count data and case-control data, where controls are sampled from the at-risk population. Numerous studies in the literature have focused on childhood leukemia because of its relatively large incidence among children compared with other malignant diseases and substantial public concern over elevated leukemia incidence. The main focus of this paper is an analysis of the spatial distribution of leukemia incidence among children from 0 to 14 years of age in Ohio from 1996 to 2003 using individual case data from the Ohio Cancer Incidence Surveillance System (OCISS). Specifically, we explore whether there is statistically significant global clustering and whether there are statistically significant local clusters of individual leukemia cases in Ohio using numerous published methods of spatial cluster detection, including spatial point process summary methods, a nearest neighbor method, and a local rate scanning method. We use the K function, Cuzick and Edwards' method, and the kernel intensity function to test for significant global clustering, and the kernel intensity function and Kulldorff's spatial scan statistic in SaTScan to test for significant local clusters.

Results: We found some evidence, although inconclusive, of significant local clusters in childhood leukemia in Ohio, but no significant overall clustering. The findings from the local cluster detection analyses are not consistent across the different cluster detection techniques: the spatial scan method in SaTScan does not find statistically significant local clusters, while the kernel intensity function method suggests statistically significant clusters in areas of central, southern, and eastern Ohio. The findings are consistent across the different tests of global clustering: no significant clustering is demonstrated with any of the techniques when cases of all ages are considered together.

Conclusion: This comparative study for childhood leukemia clustering and clusters in Ohio revealed several research issues in practical spatial cluster detection. Among them, flexibility in cluster shape detection should be an issue for consideration.


Figure 2: K functions (solid) for cases and controls with confidence bands (dashed) and distance in meters.

Mentions: The K function is a method introduced by Ripley [41] for testing for general clustering in a point pattern. It measures how many events occur within a certain distance of other events. A simple formula for the K function is K(h) = (average number of events within distance h of a randomly chosen event)/(average number of events per unit area). Also see Diggle [42] and Waller and Gotway [19] for a detailed discussion of the K function. The K function is evaluated over a vector of distances h, so that it is calculated at a range of distances in the study area. One can calculate a transformation, $\hat{L}(h)$, of the estimated K function $\hat{K}(h)$ that, when plotted on the y axis as $\hat{L}(h) - h$, aids in the visual inspection of the K function over a range of distances. Besag [43] recommended the transformation $\hat{L}(h) = [\hat{K}_e(h)/\pi]^{1/2}$. Here $\hat{K}_e$ is the edge-corrected K function estimate defined by Ripley [44] as $\hat{K}_e(h) = \hat{\lambda}^{-1} N^{-1} \sum_{i} \sum_{j \neq i} w_{ij}^{-1} I(d_{ij} \le h)$, where $N$ is the number of events, $d_{ij}$ is the distance between events $i$ and $j$, $I(\cdot)$ is the indicator function, the weight $w_{ij}$ is the proportion of the circumference of the event-centered circle with radius $d_{ij}$ that is within the study area, and $\hat{\lambda}$ is the intensity estimate, equal to the number of events in the study area divided by the area of the study. The expected value of $\hat{L}(h) - h$ under complete spatial randomness (CSR) is close to zero. The plot in the top of Figure 2 is of $\hat{L}(h) - h$ for cases evaluated at a range of distances over the study area, with 999 Monte Carlo simulations of CSR used to create 95% confidence bands for assessing the significance of deviations of the transformed, estimated K function from CSR. Note that the data points in the study are projected to Universal Transverse Mercator (UTM), zone 17 coordinates with meters as the distance unit. The plot indicates that there is significant clustering at smaller distances (less than 100,000 meters) and generally no significant clustering at intermediate and large distances, where the transformed K function falls within the confidence bands.
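The calculation described above can be illustrated with a minimal Python sketch. It is not the code used in the study: it assumes a hypothetical rectangular window (the unit square) rather than the Ohio boundary, it omits Ripley's edge correction for brevity (a naive estimate, partly compensated because the CSR simulations share the same window and therefore the same edge bias), and it uses 39 simulations rather than 999 to keep the run short. All function names and the synthetic clustered pattern are illustrative assumptions.

```python
import math
import random


def k_function(points, distances, area):
    """Naive estimate of K(h) at each distance h (no edge correction)."""
    n = len(points)
    intensity = n / area  # estimated events per unit area
    counts = [0] * len(distances)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            for k, h in enumerate(distances):
                if d <= h:
                    counts[k] += 1
    return [c / (intensity * n) for c in counts]


def l_minus_h(points, distances, area):
    """Besag's transformation L(h) = sqrt(K(h) / pi), returned as L(h) - h."""
    k_hat = k_function(points, distances, area)
    return [math.sqrt(k / math.pi) - h for k, h in zip(k_hat, distances)]


def csr_envelope(n, distances, width, height, n_sim=39, seed=0):
    """Pointwise 95% bands for L(h) - h from n_sim simulations of CSR."""
    rng = random.Random(seed)
    area = width * height
    sims = []
    for _ in range(n_sim):
        pts = [(rng.uniform(0, width), rng.uniform(0, height))
               for _ in range(n)]
        sims.append(l_minus_h(pts, distances, area))
    # Ranks of the 2.5% and 97.5% order statistics (24 and 974 for 999 sims).
    lo = max((n_sim + 1) * 25 // 1000 - 1, 0)
    hi = min((n_sim + 1) * 975 // 1000 - 1, n_sim - 1)
    lower, upper = [], []
    for k in range(len(distances)):
        vals = sorted(s[k] for s in sims)
        lower.append(vals[lo])
        upper.append(vals[hi])
    return lower, upper


# Synthetic clustered pattern: 10 cluster centers with 10 points each.
rng = random.Random(1)
points = []
for _ in range(10):
    cx, cy = rng.uniform(0, 1), rng.uniform(0, 1)
    for _ in range(10):
        x = min(max(rng.gauss(cx, 0.02), 0.0), 1.0)
        y = min(max(rng.gauss(cy, 0.02), 0.0), 1.0)
        points.append((x, y))

distances = [0.05, 0.10, 0.15]
observed = l_minus_h(points, distances, area=1.0)
lower, upper = csr_envelope(len(points), distances, 1.0, 1.0)
for h, obs, lo_b, hi_b in zip(distances, observed, lower, upper):
    print(f"h={h:.2f}  L(h)-h={obs:.4f}  band=({lo_b:.4f}, {hi_b:.4f})")
```

Values of $\hat{L}(h) - h$ above the upper band suggest clustering at that distance relative to CSR; values inside the band are consistent with CSR. For real data, an edge-corrected estimator over the actual study region would be the appropriate choice.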
While the finding of clustering in the cases may seem significant, it is not the complete story of this phenomenon. To see this, we must inspect the bottom plot in Figure 2, which is a plot of $\hat{L}(h) - h$ for the controls. The figure shows a similar pattern for cases and controls, which indicates that the significant clustering at smaller distances for cases is due to clustering in the underlying population and not clustering in the cases above what is observed in the at-risk population. While a visual comparison of the K functions for cases and controls shows no clear differences between the two, a test of difference in K functions is needed to definitively answer the inquiry of potential clustering in childhood leukemia.
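One common way to set up such a test is to compute the difference D(h) = K_cases(h) - K_controls(h) and compare it with its distribution under random labeling, in which the case and control labels are permuted over the pooled locations. The Python sketch below illustrates that idea only; it is an assumption that this matches the test the paper goes on to use. The unit-square window, the naive K estimate without edge correction, the 39 permutations, and all names and synthetic data are hypothetical.

```python
import math
import random


def k_function(points, distances, area):
    """Naive estimate of K(h) at each distance h (no edge correction)."""
    n = len(points)
    intensity = n / area
    counts = [0] * len(distances)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            for k, h in enumerate(distances):
                if d <= h:
                    counts[k] += 1
    return [c / (intensity * n) for c in counts]


def k_difference(cases, controls, distances, area):
    """D(h) = K_cases(h) - K_controls(h) at each distance h."""
    k_cases = k_function(cases, distances, area)
    k_controls = k_function(controls, distances, area)
    return [a - b for a, b in zip(k_cases, k_controls)]


def random_labeling_envelope(cases, controls, distances, area,
                             n_sim=39, seed=0):
    """Pointwise 95% bands for D(h) under random relabeling of the points."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(pooled)  # permute case/control labels
        sims.append(k_difference(pooled[:n_cases], pooled[n_cases:],
                                 distances, area))
    lo = max((n_sim + 1) * 25 // 1000 - 1, 0)
    hi = min((n_sim + 1) * 975 // 1000 - 1, n_sim - 1)
    lower, upper = [], []
    for k in range(len(distances)):
        vals = sorted(s[k] for s in sims)
        lower.append(vals[lo])
        upper.append(vals[hi])
    return lower, upper


# Synthetic data: cases and controls drawn from the same clustered
# population, mimicking clustering driven by the at-risk population.
rng = random.Random(2)
population = []
for _ in range(8):
    cx, cy = rng.uniform(0, 1), rng.uniform(0, 1)
    for _ in range(15):
        x = min(max(rng.gauss(cx, 0.03), 0.0), 1.0)
        y = min(max(rng.gauss(cy, 0.03), 0.0), 1.0)
        population.append((x, y))
rng.shuffle(population)
cases, controls = population[:40], population[40:]

distances = [0.05, 0.10, 0.15]
d_obs = k_difference(cases, controls, distances, area=1.0)
lower, upper = random_labeling_envelope(cases, controls, distances, 1.0)
for h, d, lo_b, hi_b in zip(distances, d_obs, lower, upper):
    print(f"h={h:.2f}  D(h)={d:.4f}  band=({lo_b:.4f}, {hi_b:.4f})")
```

A value of D(h) above the upper band would suggest that cases are more clustered than controls at distance h, while values inside the band are consistent with cases being a random sample of the at-risk population.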

