Performance of statistical methods on CHARGE targeted sequencing data.

Xing C, Dupuis J, Cupples LA - BMC Genet. (2014)

Bottom Line: Type I error is conservative when we consider variants with minor allele frequency (MAF) < 1%. Power is generally low, although it is relatively larger for Score-Seq. Greater numbers of causal variants and a greater proportion of variance improve the power, but it tends to be lower in the presence of bi-directionality of effects of causal genotypes, especially for Score-Seq.


Affiliation: Department of Biostatistics, Boston University, Boston, MA, USA. chuanhua.xing@gmail.com.

ABSTRACT

Background: The CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) Sequencing Project is a national, collaborative effort from 3 studies: Framingham Heart Study (FHS), Cardiovascular Health Study (CHS), and Atherosclerosis Risk in Communities (ARIC). It uses a case-cohort design, whereby a random sample of study participants is enriched with participants in extremes of traits. Although statistical methods are available to investigate the role of rare variants, few have evaluated their performance in a case-cohort design.

Results: We evaluate several methods, including the sequence kernel association test (SKAT), Score-Seq, and weighted (Madsen and Browning) and unweighted burden tests. Using genotypes from the CHARGE targeted-sequencing project for FHS (n = 1096), we simulate phenotypes in a large population for 11 correlated traits and then sample individuals to mimic the CHARGE Sequencing study design. We evaluate type I error and power for 77 targeted regions.

Conclusions: We provide some guidelines on the performance of these aggregate-based tests to detect associations with rare variants when applied to case-cohort study designs, using CHARGE targeted sequencing data. Type I error is conservative when we consider variants with minor allele frequency (MAF) < 1%. Power is generally low, although it is relatively larger for Score-Seq. Greater numbers of causal variants and a greater proportion of variance improve the power, but it tends to be lower in the presence of bi-directionality of effects of causal genotypes, especially for Score-Seq.
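
The weighted and unweighted burden tests named in the Results both collapse the rare variants in a region into one score per individual. The sketch below illustrates that collapsing step only. It is not the authors' code: the function names and the toy genotype matrix are hypothetical, and the allele frequency is estimated from all individuals with add-one smoothing in the style of Madsen and Browning (their original proposal estimates it in unaffected individuals).

```python
import math

def madsen_browning_weights(genotypes):
    """Return one weight per variant, 1 / sqrt(n * q * (1 - q)).

    genotypes: list of rows, one per individual, each row holding
    minor-allele counts (0, 1 or 2) for every variant in the region.
    Assumption: q is estimated from all individuals, with add-one smoothing.
    """
    n = len(genotypes)
    n_variants = len(genotypes[0])
    weights = []
    for j in range(n_variants):
        minor_alleles = sum(row[j] for row in genotypes)
        q = (minor_alleles + 1) / (2 * n + 2)  # smoothed allele frequency
        weights.append(1.0 / math.sqrt(n * q * (1 - q)))
    return weights

def burden_scores(genotypes, weights=None):
    """Collapse each individual's genotypes into a single burden score.

    With weights=None every variant counts equally (unweighted burden).
    """
    n_variants = len(genotypes[0])
    if weights is None:
        weights = [1.0] * n_variants
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]

# Toy region: 4 individuals, 3 variants (illustrative values only).
toy = [
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
w = madsen_browning_weights(toy)
unweighted = burden_scores(toy)
weighted = burden_scores(toy, w)
```

In this toy region the first variant is rarer than the third, so it receives the larger weight. The resulting scores would then be tested for association with the trait, for example by regression.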



Figure 2: Type I error for SKAT over different traits when alpha = 0.01. (a) Type I error for trait 1. (b) Type I error for trait 6. (c) Type I error for trait 10.

Mentions: Next, we use SKAT to examine how type I error varies across the correlated traits (Figure 2 for alpha = 0.01). We omit traits 2 and 9 because their results are similar to those for traits 1 and 10. Type I error is similar across traits and well controlled, except for three regions for trait 6 (regions 7, 30 and 32), where it is inflated. Further investigation indicates that region 7 has 194 variants, region 30 has 9 variants, and region 32 has 6 variants with MAF < 1%. The smaller number of variants in regions 30 and 32 may increase the risk of elevated type I error for a region, because we treated each targeted region as a unit and jointly analyzed its rare variants regardless of the region's length. However, other regions have even fewer variants with MAF < 1%, such as region 40 with 3 variants and region 9 with 4 variants, and these two regions have appropriate type I error. More summary statistics, such as the number of variants in each region, can be found in Additional file 1: Table S1 in the Supplement. We also examined type I error over traits when alpha = 0.001 and 0.05 (Additional file 1: Figure S2 in the Supplement). At both alpha = 0.001 and alpha = 0.05, the type I error for all traits tends to be conservative, except for the three regions for trait 6. For those three regions, the type I error tends to be smaller when alpha = 0.001 and larger when alpha = 0.05.
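
To make "inflated" and "conservative" concrete, the sketch below estimates an empirical type I error rate from p-values simulated under the null and compares a 95% Wilson interval for that rate with the nominal alpha. This is an illustration under stated assumptions, not the procedure used in the paper: the function names, the interval-based classification rule, and the replicate counts in the usage lines are all hypothetical.

```python
import math

def empirical_type1_error(p_values, alpha):
    """Fraction of null-simulation p-values at or below the nominal alpha."""
    rejections = sum(1 for p in p_values if p <= alpha)
    return rejections / len(p_values)

def wilson_interval(rate, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    denom = 1 + z * z / n
    center = (rate + z * z / (2 * n)) / denom
    half = z * math.sqrt(rate * (1 - rate) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

def classify(p_values, alpha):
    """Label a region by where the interval sits relative to alpha.

    Assumed rule: 'inflated' if the whole interval lies above alpha,
    'conservative' if it lies below, otherwise 'consistent'.
    """
    rate = empirical_type1_error(p_values, alpha)
    lower, upper = wilson_interval(rate, len(p_values))
    if lower > alpha:
        return "inflated"
    if upper < alpha:
        return "conservative"
    return "consistent"

# Hypothetical replicate sets of 1000 null p-values each.
well_controlled = [0.005] * 10 + [0.5] * 990   # 1.0% rejections at alpha = 0.01
too_many = [0.005] * 40 + [0.5] * 960          # 4.0% rejections
too_few = [0.5] * 1000                         # no rejections

labels = [classify(ps, 0.01) for ps in (well_controlled, too_many, too_few)]
```

With a small alpha such as 0.001, many more replicates are needed before the interval is narrow enough to distinguish a conservative test from a well-controlled one.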

