Power and sample size estimation for epigenome-wide association scans to detect differential DNA methylation.

Tsai PC, Bell JT - Int J Epidemiol (2015)

Bottom Line: We performed simulations to estimate power under the case-control and discordant MZ twin EWAS study designs, under a range of epigenetic risk effect sizes and conditions. Our analyses highlighted several factors that significantly influenced EWAS power, including sample size, epigenetic risk effect size, the variance of DNA methylation at the locus of interest and the correlation in DNA methylation patterns within the twin sample. Our results can help guide EWAS experimental design and interpretation for future epigenetic studies.


Affiliation: Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.



Figure 1. DNA methylation patterns at the (A) cellular and individual levels, and (B) with respect to the proposed methylation distributions in the simulations. We assume that a cell can have two methylated alleles (ei = 1), one methylated allele (ei = 0.5) or two unmethylated alleles (ei = 0), and one sample from an individual contains different frequencies of these cells (A, upper panel). The methylated allele is shown as a dagger symbol, and the colour of each cell represents its methylation status: un-methylated (white), hemi-methylated (grey) and methylated (black) (A, upper panel). The methylation in each sample is represented as the summary of the methylated epi-allele, denoted here as beta (A, middle panel) which can range from 0 to 1 (A, lower panel). We assume that cases have greater mean methylation levels compared with controls, and we propose one control and eight case distributions. (B) Each line represents the density of methylation levels on each proposed distribution, where the Control distribution is un-methylated, Cases 1–3 represent predominantly un-methylated samples (left panel), Cases 4–6 are hemi-methylated (middle panel) and Cases 7–8 are predominantly methylated (right panel).

Mentions: We assume that disease risk is affected by DNA methylation at a single locus, l (Figure 1A, upper panel), where l represents a single CpG site in the genome. The methylation status at locus l in a single cell can be represented as a biallelic marker, where epi-allele 1 represents the presence of the methylated mark, and epi-allele 0 represents the absence of methylation. We assume that the disease-associated methylation mark occurs prior to onset of disease and is faithfully transmitted through mitotic cell division. We denote DNA methylation status (epi-genotype) at locus l as ej, where ej takes the value 0, 0.5 or 1, corresponding to the un-methylated, hemi-methylated and methylated states of a single cell. Each individual cell can carry the un-methylated, hemi-methylated or methylated epi-genotype with probabilities p1, p2 and p3 respectively, where p1 + p2 + p3 = 1. A sample from an individual i represents a population of cells (Figure 1A, middle panel), and we assume that the contribution of each cell to the population is constant and without bias. The sample-level DNA methylation estimate is a function of the methylation levels of the cells that make up the sample (Figure 1A, lower panel), and can be described by different functions or epigenetic models. In this study, we propose a threshold model where the sample-level DNA methylation estimate reflects the allele frequency of the methylated epi-allele 1 in the cell population. That is, the DNA methylation level for each sample is denoted as β (beta), which represents the number of its fully methylated cells plus half the number of its hemi-methylated cells, divided by the total number of cells in the sample. In addition to the proposed DNA methylation threshold model, dominant and recessive models may also be applied, as proposed for genetic disease susceptibility risk.
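The definition of β above can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function names, the cell count and the random seed are assumptions chosen for illustration, and each cell is drawn independently, which is one simple reading of "constant and without bias".

```python
import random


def beta_from_counts(n_unmeth, n_hemi, n_meth):
    """Sample-level beta: fully methylated cells plus half of the
    hemi-methylated cells, divided by the total number of cells."""
    total = n_unmeth + n_hemi + n_meth
    if total <= 0:
        raise ValueError("the sample must contain at least one cell")
    return (n_meth + 0.5 * n_hemi) / total


def sample_beta(p1, p2, p3, n_cells, rng):
    """Draw n_cells independent cells with epi-genotype 0, 0.5 or 1
    (probabilities p1, p2, p3) and return the resulting beta.

    Independent draws and the argument names are illustrative assumptions.
    """
    if abs(p1 + p2 + p3 - 1.0) > 1e-9:
        raise ValueError("p1 + p2 + p3 must equal 1")
    cells = rng.choices([0.0, 0.5, 1.0], weights=[p1, p2, p3], k=n_cells)
    # The mean epi-genotype equals (methylated + 0.5 * hemi) / total.
    return sum(cells) / n_cells


if __name__ == "__main__":
    rng = random.Random(2015)  # arbitrary seed, for reproducibility only
    print(beta_from_counts(n_unmeth=2, n_hemi=4, n_meth=4))
    print(sample_beta(0.7, 0.2, 0.1, n_cells=10_000, rng=rng))
```

Under these assumptions the expected value of β is p3 + 0.5 × p2, so a sample with many cells should give a β close to that quantity.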

