Performance of statistical methods on CHARGE targeted sequencing data.

Xing C, Dupuis J, Cupples LA - BMC Genet. (2014)

Bottom Line: Type I error is conservative when we consider variants with minor allele frequency (MAF) < 1%. Power is generally low, although it is relatively larger for Score-Seq. Greater numbers of causal variants and a greater proportion of variance improve the power, but it tends to be lower in the presence of bi-directionality of effects of causal genotypes, especially for Score-Seq.


Affiliation: Department of Biostatistics, Boston University, Boston, MA, USA. chuanhua.xing@gmail.com.

ABSTRACT

Background: The CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) Sequencing Project is a national, collaborative effort from 3 studies: Framingham Heart Study (FHS), Cardiovascular Health Study (CHS), and Atherosclerosis Risk in Communities (ARIC). It uses a case-cohort design, whereby a random sample of study participants is enriched with participants in extremes of traits. Although statistical methods are available to investigate the role of rare variants, few have evaluated their performance in a case-cohort design.

Results: We evaluate several methods, including the sequence kernel association test (SKAT), Score-Seq, and weighted (Madsen and Browning) and unweighted burden tests. Using genotypes from the CHARGE targeted-sequencing project for FHS (n = 1096), we simulate phenotypes in a large population for 11 correlated traits and then sample individuals to mimic the CHARGE Sequencing study design. We evaluate type I error and power for 77 targeted regions.

Conclusions: We provide some guidelines on the performance of these aggregate-based tests to detect associations with rare variants when applied to case-cohort study designs, using CHARGE targeted sequencing data. Type I error is conservative when we consider variants with minor allele frequency (MAF) < 1%. Power is generally low, although it is relatively larger for Score-Seq. Greater numbers of causal variants and a greater proportion of variance improve the power, but it tends to be lower in the presence of bi-directionality of effects of causal genotypes, especially for Score-Seq.
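
The case-cohort design described in the Background (a random sample of participants enriched with participants in the extremes of a trait) can be illustrated with a minimal sketch. This is not the authors' sampling code: the function name `case_cohort_sample`, the sample sizes, and the equal split of the enrichment between the lower and upper tails are all assumptions made for illustration.

```python
import random


def case_cohort_sample(trait, n_random, n_extreme, seed=0):
    """Return sorted indices of a case-cohort style sample.

    Assumption (hypothetical scheme): draw n_random individuals uniformly
    at random, then add n_extreme further individuals taken from the two
    tails of the trait distribution among those not yet selected.
    """
    rng = random.Random(seed)
    ids = list(range(len(trait)))

    # Step 1: the random cohort sample.
    cohort = set(rng.sample(ids, n_random))

    # Step 2: rank everyone outside the cohort by trait value.
    remaining = sorted((i for i in ids if i not in cohort),
                       key=lambda i: trait[i])

    # Step 3: enrich with the lowest and highest trait values
    # (split as evenly as possible; an assumption, not from the paper).
    n_low = n_extreme // 2
    n_high = n_extreme - n_low
    low = remaining[:n_low]
    high = remaining[len(remaining) - n_high:] if n_high > 0 else []

    return sorted(cohort | set(low) | set(high))
```

Because the extremes are over-represented, the sampled trait distribution no longer matches the population distribution, which is why the behavior of the association tests under this design needs to be evaluated separately.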


Figure 1: Comparison of type I error for trait 1 with α = 0.01. (a) Type I error for T1. (b) Type I error for the Madsen and Browning (2009) test. (c) Type I error for SKAT [13]. (d) Type I error for Score-Seq [14].

Mentions: Type I error for trait 1, estimated as the number of rejections divided by the number of replicates (10,000), is presented for the 77 regions in Figure 1 for α = 0.01. Assuming the number of rejections follows a binomial distribution, we calculated the 95% confidence interval (CI) for the type I error; its bounds are shown as the two ends of the interval plotted for each region in Figure 1. The horizontal solid line indicates the nominal level of α = 0.01. When the nominal level lies within the 95% CI of the type I error, we consider type I error to be properly controlled. We use numbers to indicate regions for simplicity of presentation. The mapping from region numbers to gene names, chromosomes, and positions is given in Additional file 1: Table S1 in the supplementary file.
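
The check described above can be sketched in a few lines. The text states only that a binomial distribution is assumed; the normal (Wald) approximation to the binomial interval used here is an assumption, as are the function names `type1_error_ci` and `classify` and the labels they return.

```python
import math

# Two-sided 95% normal quantile.
Z_95 = 1.959963984540054


def type1_error_ci(rejections, replicates=10_000, z=Z_95):
    """Estimate type I error and a Wald 95% CI (assumed interval type)."""
    p_hat = rejections / replicates
    se = math.sqrt(p_hat * (1.0 - p_hat) / replicates)
    lower = max(0.0, p_hat - z * se)
    upper = min(1.0, p_hat + z * se)
    return p_hat, lower, upper


def classify(rejections, replicates=10_000, alpha=0.01):
    """Label a region by where the nominal level falls relative to the CI."""
    _, lower, upper = type1_error_ci(rejections, replicates)
    if lower <= alpha <= upper:
        return "controlled"      # nominal level inside the CI
    if upper < alpha:
        return "conservative"    # whole CI below the nominal level
    return "inflated"            # whole CI above the nominal level
```

With 10,000 replicates and α = 0.01, exactly 100 rejections gives an estimate equal to the nominal level, with an interval of roughly 0.008 to 0.012 under this approximation; a region whose whole interval lies below 0.01 would be read as conservative.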

