Detecting new neurodegenerative disease genes: does phenotype accuracy limit the horizon?

Samuels DC, Burn DJ, Chinnery PF - Trends Genet. (2009)

AUTOMATICALLY GENERATED EXCERPT

Diagnostic revision also occurs in ∼1/3 of cases... Phenotypic misclassification reduces the power to detect a statistical association between a phenotype and specific allele for a given sample size... In silico modelling has shown that increasing the sample size counterbalances diagnostic error, but that the relationship between statistical power and diagnostic accuracy is not linear; in addition, the sample size required to generate reasonable power increases dramatically with reduced diagnostic accuracy... For strong genetic effects, the precise diagnosis might not be a key issue... However, >2% diagnostic error has a dramatic effect on power, especially when attention is drawn to lower-penetrance alleles (i.e. GRR ≤1.1), as proposed for many complex traits... Thus, to achieve the same effect, investigators could either improve the phenotypic accuracy and remove false-positive cases from an existing cohort, or they could inflate the number of cases by up to 400-fold to compensate for the diagnostic errors of up to 20% (Figure 1d)... We argue that it is more cost effective to improve phenotypic accuracy than it is to increase the sample size... For example, even when considering alleles with a modest effect (GRR = 1.3), increasing diagnostic accuracy from 90% to 95% would reduce the number of affected individuals needed by threefold while maintaining the same power... For biologically plausible risk alleles with a minor frequency of 10% conferring a GRR of 1.3, increasing diagnostic accuracy by 10% would mean genotyping ∼8000 rather than ∼750 000 cases... 
For relatively uncommon neurodegenerative diseases, such as ALS (which has a prevalence ∼1 in 20 000) and PSP (affecting ∼5 in 100 000), it might never be possible to assemble cohorts with >100 000 cases from a genetically homogeneous population; studies of uncommon alleles with modest effects will only be possible with an exceptionally high diagnostic accuracy, placing greater emphasis on autopsy-based series... Providing the disease is rare (<10% of the population), the age-related penetrance is not a major concern... Now that GWAS has helped to identify the ‘low hanging fruit’ in complex disease (i.e. common alleles with strong genetic effects), the emphasis shifts to the detection of the ∼20–100 low penetrance disease-specific variants thought to underpin most common complex traits, some of which might contribute to interindividual phenotypic variability.
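The cohort-size argument above comes down to simple arithmetic, which can be sketched as follows. This is only an illustration: the function name `population_needed` is hypothetical, and it assumes that every prevalent case in the source population could be recruited, which is optimistic.

```python
def population_needed(target_cases: int, cases: int, per_population: int) -> int:
    """Smallest population expected to contain `target_cases` prevalent cases.

    Prevalence is given as `cases` per `per_population` people
    (e.g. 1 per 20 000). Assumes, optimistically, that every case
    in the population is identified and recruited.
    """
    # Integer ceiling division avoids floating-point rounding.
    return -(-target_cases * per_population // cases)


# Prevalence figures quoted in the excerpt (ALS ~1 in 20 000, PSP ~5 in 100 000).
als_population = population_needed(100_000, 1, 20_000)
psp_population = population_needed(100_000, 5, 100_000)
print(f"ALS: {als_population:,} people")
print(f"PSP: {psp_population:,} people")
```

At the quoted prevalences, both diseases would need a source population of about two billion people to yield 100 000 cases, which is why the excerpt doubts that such cohorts could be drawn from a genetically homogeneous population.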

Figure 1. Power to detect a genetic association in the context of diagnostic errors. In each example, the probability of affected individuals being classified as controls is 1 × 10⁻⁵. Varying this parameter has negligible impact on power and/or optimal sample size for diseases that are present in <10% of the population [10]. (a) Power to detect an association between a common allele (allele frequency = 0.5; GRR = 1.1–1.3 under a multiplicative model) and disease in 20 000 cases and 20 000 controls with varying degrees of diagnostic error at P < 5 × 10⁻⁷. Disease frequency = 0.01. (b) Power to detect an association between alleles of different frequency (0.5, 0.25, 0.1) and disease in 20 000 cases and 20 000 controls with varying degrees of diagnostic error at P < 5 × 10⁻⁷. GRR = 1.3, disease frequency = 0.01. (c) Power to detect an association between an allele (frequency = 0.125, GRR = 1.3) and diseases of different prevalence (0.01, 0.001, 0.0001) in 20 000 cases and 20 000 controls with varying degrees of diagnostic error at P < 5 × 10⁻⁷. (d) Ratio of the number of inaccurately phenotyped cases (n_error) to the number of accurately phenotyped cases (n_noerror) required to detect an association between an allele (frequency = 0.1, varying GRR from 1.1 to 1.3) and a disease (frequency = 0.01) with 95% power at varying degrees of diagnostic error at P < 5 × 10⁻⁷. All calculations used PAWE-PH Phenotype Edition [10].

Several approaches have been used to deal with diagnostic inaccuracy in neurodegenerative diseases (Box 1). In silico modelling has shown that increasing the sample size counterbalances diagnostic error [10], but that the relationship between statistical power and diagnostic accuracy is not linear; in addition, the sample size required to generate reasonable power increases dramatically with reduced diagnostic accuracy [10]. For strong genetic effects, the precise diagnosis might not be a key issue. For example, even when 15% of cases are incorrectly classified as Alzheimer disease, a study of 500 cases and 500 controls would have >70% power to detect the well-established association with the APOE ε4 allele (Online Supplementary Material Figure S1). However, the detection of hitherto unknown modest disease associations at the whole-genome level presents a greater challenge [11]. For common genetic variants exerting a modest effect [where the genotype relative risk (GRR) is 1.3], a diagnostic error rate of ∼2% has little effect on statistical power (Figure 1a). However, >2% diagnostic error has a dramatic effect on power, especially for lower-penetrance alleles (i.e. GRR ≤1.1), as proposed for many complex traits. The problem is compounded for less frequent but equally plausible genetic variants (with a minor allele frequency ≤10%), which are highly sensitive to diagnostic errors (Figure 1b). Studies of rarer disease phenotypes (affecting <1 in 1000 adults) present an even greater challenge (Figure 1c). These include well-recognized disorders [such as amyotrophic lateral sclerosis (ALS) or progressive supranuclear palsy (PSP)] and clinical subgroups of common disorders (such as cases of Parkinson's disease with dementia), where distinct genetic factors are thought to modulate the phenotype.
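To make the direction of this effect concrete, the sketch below shows how misdiagnosed cases dilute the allele-frequency difference between cases and controls and so reduce power. It is not the PAWE-PH calculation used for Figure 1. It is a simplified normal approximation to an allelic two-proportion test, and it assumes that misdiagnosed "cases" carry the allele at the frequency seen in unaffected individuals, that controls are never misclassified, and that genotypes are in Hardy-Weinberg equilibrium. The function names are hypothetical, and the absolute numbers should not be expected to reproduce the figure.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(p: float, grr: float) -> float:
    """Risk-allele frequency in truly affected individuals under a
    multiplicative model with per-allele relative risk `grr`."""
    return grr * p / (grr * p + (1.0 - p))


def association_power(n_cases: int, n_controls: int, p: float, grr: float,
                      prevalence: float, error: float,
                      alpha: float = 5e-7) -> float:
    """Approximate power of a two-sided allelic test.

    `error` is the fraction of labelled cases who do not have the disease
    (assumption: they carry the allele at the unaffected frequency).
    """
    p_aff = case_allele_freq(p, grr)
    # Allele frequency in unaffected people, from the population mixture.
    p_unaff = (p - prevalence * p_aff) / (1.0 - prevalence)
    # Observed case frequency is a mixture of true and misdiagnosed cases.
    p_obs = (1.0 - error) * p_aff + error * p_unaff
    # Each individual contributes two alleles.
    se = sqrt(p_obs * (1.0 - p_obs) / (2 * n_cases)
              + p_unaff * (1.0 - p_unaff) / (2 * n_controls))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # The far tail of the two-sided test is ignored; it is negligible here.
    return NormalDist().cdf(abs(p_obs - p_unaff) / se - z_crit)


# Settings loosely following Figure 1a: 20 000 cases and controls,
# allele frequency 0.5, disease frequency 0.01.
for grr in (1.1, 1.2, 1.3):
    for error in (0.0, 0.1, 0.2):
        pw = association_power(20_000, 20_000, 0.5, grr, 0.01, error)
        print(f"GRR={grr:.1f} error={error:.0%} power={pw:.3f}")
```

Under these assumptions, power falls as the error rate rises, and the loss is most visible for the weakest effect (GRR = 1.1). This matches the qualitative pattern described above, although the simplified model is not a substitute for the published calculations.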

