Limits...
Detecting Genetic Interactions for Quantitative Traits Using m-Spacing Entropy Measure.

Yee J, Kwon MS, Jin S, Park T, Park M - Biomed Res Int (2015)

Bottom Line: Information gain based on entropy measure has previously been successful in identifying genetic associations with binary traits.Hence, the information gain can be obtained for any phenotype distribution.Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.

View Article: PubMed Central - PubMed

Affiliation: Department of Physiology and Biophysics, Eulji University, Daejeon, Republic of Korea.

ABSTRACT
A number of statistical methods for detecting gene-gene interactions have been developed in genetic association studies with binary traits. However, many phenotype measures are intrinsically quantitative and categorizing continuous traits may not always be straightforward and meaningful. Association of gene-gene interactions with an observed distribution of such phenotypes needs to be investigated directly without categorization. Information gain based on entropy measure has previously been successful in identifying genetic associations with binary traits. We extend the usefulness of this information gain by proposing a nonparametric evaluation method of conditional entropy of a quantitative phenotype associated with a given genotype. Hence, the information gain can be obtained for any phenotype distribution. Because any functional form, such as Gaussian, is not assumed for the entire distribution of a trait or a given genotype, this method is expected to be robust enough to be applied to any phenotypic association data. Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.

No MeSH data available.


Demonstration of the simulation scheme. Phenotype distributions were plotted to associate with the genotypes by two interacting SNPs, as denoted in the parentheses on top of each plot. SNPs may take values of 0, 1, and 2 or AA, Aa, and aa. For this particular dataset, the MAF was set to 0.200. On the bottom of each plot, the penetrance value for this particular model is given, which is taken from [21]. Inside each plot, the number of samples generated to satisfy the simulation constraint is given. The vertical dotted lines are for the mean values of the high- and low-risk groups. By constraint, the line on the left is for the low-risk group.
© Copyright Policy - open-access
Related In: Results  -  Collection


getmorefigures.php?uid=PMC4538333&req=5

fig4: Demonstration of the simulation scheme. Phenotype distributions were plotted to associate with the genotypes by two interacting SNPs, as denoted in the parentheses on top of each plot. SNPs may take values of 0, 1, and 2 or AA, Aa, and aa. For this particular dataset, the MAF was set to 0.200. On the bottom of each plot, the penetrance value for this particular model is given, which is taken from [21]. Inside each plot, the number of samples generated to satisfy the simulation constraint is given. The vertical dotted lines are for the mean values of the high- and low-risk groups. By constraint, the line on the left is for the low-risk group.

Mentions: To examine the performance of the m-spacing method, an extensive set of simulation data was necessarily generated. First, three types of quantitative trait distributions were considered. Two of them were normal and gamma distributions, and another one was a mixture of those two types. With single causal pair designed, 70 different penetrance models, based on [21], were incorporated. For the case of a normal distribution, a phenotype value, y, associated with two interacting SNPs was selected from a normal distribution, as defined by the penetrance values tabled for possible combinations of genotypes associated as follows:(12)y ∣ SNP1=i,SNP2=j~Nfij,σ2.Here fij represents the penetrance values tabled for every model simulated and can be found in [21]. It is tabulated for each possible pair of genotypes, (i, j). In 70 different penetrance models, 14 different combinations of two different minor allele frequencies (MAFs) and seven different heritability values were considered. Specifically, we considered the cases when the MAFs were 0.2 and 0.4 and when the heritability was 0.01, 0.025, 0.05, 0.1, 0.2, 0.3, and 0.4. Three different values (0.8, 1.0, and 1.2) of the variance, σ, were used independently for the high- and low-risk groups, resulting in 9 combinations. The grouping constraint for the generated event was set such that the averaged y of the high-risk group should be larger than or equal to the overall average. The averaged y of the low-risk group should be less than the overall average. In Figure 4, 9 possible distributions of a generated phenotype are shown. In this example, the sample size is 400. The high- and low-risk groups have the same number of samples and both have a variance of 1.0. For gamma distributions, phenotype values follow the rule below:(13)y ∣ SNP1=i,SNP2=j~Γk,θ.The shape and scale parameters, k and θ, were determined by fij and σ, using the relationship fij = kθ and σ = kθ2. Penetrance models were classified by 7 heritability values: 0.01, 0.02, 0.05, 0.1, 0.2, 0.3, and 0.4, resulting in 10 models for each heritability. The generated data files had a sample size of 400, with 20 SNPs. In all, 3 × 70 × 9 = 1,890 different conditions were set up, with 100 simulated data files generated for each condition.


Detecting Genetic Interactions for Quantitative Traits Using m-Spacing Entropy Measure.

Yee J, Kwon MS, Jin S, Park T, Park M - Biomed Res Int (2015)

Demonstration of the simulation scheme. Phenotype distributions were plotted to associate with the genotypes by two interacting SNPs, as denoted in the parentheses on top of each plot. SNPs may take values of 0, 1, and 2 or AA, Aa, and aa. For this particular dataset, the MAF was set to 0.200. On the bottom of each plot, the penetrance value for this particular model is given, which is taken from [21]. Inside each plot, the number of samples generated to satisfy the simulation constraint is given. The vertical dotted lines are for the mean values of the high- and low-risk groups. By constraint, the line on the left is for the low-risk group.
© Copyright Policy - open-access
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC4538333&req=5

fig4: Demonstration of the simulation scheme. Phenotype distributions were plotted to associate with the genotypes by two interacting SNPs, as denoted in the parentheses on top of each plot. SNPs may take values of 0, 1, and 2 or AA, Aa, and aa. For this particular dataset, the MAF was set to 0.200. On the bottom of each plot, the penetrance value for this particular model is given, which is taken from [21]. Inside each plot, the number of samples generated to satisfy the simulation constraint is given. The vertical dotted lines are for the mean values of the high- and low-risk groups. By constraint, the line on the left is for the low-risk group.
Mentions: To examine the performance of the m-spacing method, an extensive set of simulation data was necessarily generated. First, three types of quantitative trait distributions were considered. Two of them were normal and gamma distributions, and another one was a mixture of those two types. With single causal pair designed, 70 different penetrance models, based on [21], were incorporated. For the case of a normal distribution, a phenotype value, y, associated with two interacting SNPs was selected from a normal distribution, as defined by the penetrance values tabled for possible combinations of genotypes associated as follows:(12)y ∣ SNP1=i,SNP2=j~Nfij,σ2.Here fij represents the penetrance values tabled for every model simulated and can be found in [21]. It is tabulated for each possible pair of genotypes, (i, j). In 70 different penetrance models, 14 different combinations of two different minor allele frequencies (MAFs) and seven different heritability values were considered. Specifically, we considered the cases when the MAFs were 0.2 and 0.4 and when the heritability was 0.01, 0.025, 0.05, 0.1, 0.2, 0.3, and 0.4. Three different values (0.8, 1.0, and 1.2) of the variance, σ, were used independently for the high- and low-risk groups, resulting in 9 combinations. The grouping constraint for the generated event was set such that the averaged y of the high-risk group should be larger than or equal to the overall average. The averaged y of the low-risk group should be less than the overall average. In Figure 4, 9 possible distributions of a generated phenotype are shown. In this example, the sample size is 400. The high- and low-risk groups have the same number of samples and both have a variance of 1.0. For gamma distributions, phenotype values follow the rule below:(13)y ∣ SNP1=i,SNP2=j~Γk,θ.The shape and scale parameters, k and θ, were determined by fij and σ, using the relationship fij = kθ and σ = kθ2. Penetrance models were classified by 7 heritability values: 0.01, 0.02, 0.05, 0.1, 0.2, 0.3, and 0.4, resulting in 10 models for each heritability. The generated data files had a sample size of 400, with 20 SNPs. In all, 3 × 70 × 9 = 1,890 different conditions were set up, with 100 simulated data files generated for each condition.

Bottom Line: Information gain based on entropy measure has previously been successful in identifying genetic associations with binary traits.Hence, the information gain can be obtained for any phenotype distribution.Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.

View Article: PubMed Central - PubMed

Affiliation: Department of Physiology and Biophysics, Eulji University, Daejeon, Republic of Korea.

ABSTRACT
A number of statistical methods for detecting gene-gene interactions have been developed in genetic association studies with binary traits. However, many phenotype measures are intrinsically quantitative and categorizing continuous traits may not always be straightforward and meaningful. Association of gene-gene interactions with an observed distribution of such phenotypes needs to be investigated directly without categorization. Information gain based on entropy measure has previously been successful in identifying genetic associations with binary traits. We extend the usefulness of this information gain by proposing a nonparametric evaluation method of conditional entropy of a quantitative phenotype associated with a given genotype. Hence, the information gain can be obtained for any phenotype distribution. Because any functional form, such as Gaussian, is not assumed for the entire distribution of a trait or a given genotype, this method is expected to be robust enough to be applied to any phenotypic association data. Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.

No MeSH data available.