Flexible Two-Phase studies for rare exposures: Feasibility, planning and efficiency issues of a new variant.

Wild P, Andrieu N, Goldstein AM, Schill W - Epidemiol Perspect Innov (2008)

Bottom Line: Two-phase studies have been shown to be efficient compared to standard case-control designs. The design is applied to two examples from occupational and genetic epidemiology. By ensuring minimum numbers for the rarest disease-covariate combination(s), we obtain considerable efficiency gains over standard two-phase studies with an improved practical feasibility.

Affiliation: INRS, French National Institute for Research and Safety, Department of Epidemiology, France. pascal.wild@inrs.fr

ABSTRACT
The two-phase design consists of an initial (Phase One) study with known disease status and inexpensive covariate information. Within this initial study one selects a subsample on which to collect detailed covariate data. Two-phase studies have been shown to be efficient compared to standard case-control designs. However, potential problems arise if one cannot assure minimum sample sizes in the rarest categories or if recontact of subjects is difficult. In the case of a rare exposure with an inexpensive proxy, the authors propose the flexible two-phase design for which there is a single time of contact, at which a decision about full covariate ascertainment is made based on the proxy. Subjects are screened until the desired numbers of cases and controls have been selected for full data collection. Strategies for optimizing the cost/efficiency of this design and corresponding software are presented. The design is applied to two examples from occupational and genetic epidemiology. By ensuring minimum numbers for the rarest disease-covariate combination(s), we obtain considerable efficiency gains over standard two-phase studies with an improved practical feasibility. The flexible two-phase design may be the design of choice in the case of well targeted studies of the effect of rare exposures with an inexpensive proxy.

Figure 1: STATA output for Example 1.

Mentions: Figure 1 shows the STATA output of the analysis of the expected frequencies. The STATA program for this analysis is included as an additional file (figure 1.do [see Additional file 3]); it uses the STATA data file MWF.dta [see Additional file 4], obtained by applying the computations shown in Appendix 2. In this example, d, z, X, Nij and nijk denote, respectively, the case status (1 = case, 0 = control), the stratum indicator, the metal fluid indicator (X = 1 exposed, X = 0 unexposed), the stratum-wise numbers in Phase One, and the Phase Two numbers by stratum and exposure to metal fluids. The power is computed for a two-sided Wald test at the 5% level using the formula Power = Φ(βx/se(βx) − 1.96) = 80.2%, where Φ denotes the cumulative standard normal distribution, βx the log-odds ratio and se(βx) its standard error. Here βx = ln(2) = 0.693, as the assumed OR is equal to 2, and the asymptotic standard error se(βx) is 0.247. In contrast, a standard case-control study in which 200 controls and 105 cases were randomly selected would yield se(βx) = 0.400, corresponding to 40.9% power using the same formula.
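
The power formula quoted above can be checked with a few lines of Python. This is only an illustrative sketch, not the authors' STATA program: the function names `normal_cdf` and `wald_power` are hypothetical, and the inputs are the rounded values reported in the text (βx = ln 2, se = 0.247 and 0.400), so the results agree with the reported 80.2% and 40.9% only up to rounding.

```python
import math


def normal_cdf(z):
    """Cumulative standard normal distribution, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def wald_power(beta, se, z_crit=1.96):
    """Approximate power of a two-sided 5% Wald test: Phi(beta/se - z_crit).

    Follows the formula quoted in the text, which keeps only the
    upper rejection region of the two-sided test.
    """
    return normal_cdf(beta / se - z_crit)


# Assumed odds ratio of 2, so the log-odds ratio is ln(2) ~ 0.693.
beta_x = math.log(2.0)

# Standard errors as reported (rounded) in the text.
power_flexible = wald_power(beta_x, 0.247)
power_case_control = wald_power(beta_x, 0.400)

print(f"Flexible two-phase design:   {power_flexible:.1%}")
print(f"Standard case-control study: {power_case_control:.1%}")
```

Because the standard error enters only through the ratio βx/se(βx), the drop from se = 0.400 to se = 0.247 is what moves the power from roughly 41% to roughly 80% at the same assumed odds ratio.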

