Limits...
Detecting Genetic Interactions for Quantitative Traits Using m-Spacing Entropy Measure.

Yee J, Kwon MS, Jin S, Park T, Park M - Biomed Res Int (2015)

Bottom Line: Information gain based on entropy measure has previously been successful in identifying genetic associations with binary traits.Hence, the information gain can be obtained for any phenotype distribution.Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.

View Article: PubMed Central - PubMed

Affiliation: Department of Physiology and Biophysics, Eulji University, Daejeon, Republic of Korea.

ABSTRACT
A number of statistical methods for detecting gene-gene interactions have been developed in genetic association studies with binary traits. However, many phenotype measures are intrinsically quantitative and categorizing continuous traits may not always be straightforward and meaningful. Association of gene-gene interactions with an observed distribution of such phenotypes needs to be investigated directly without categorization. Information gain based on entropy measure has previously been successful in identifying genetic associations with binary traits. We extend the usefulness of this information gain by proposing a nonparametric evaluation method of conditional entropy of a quantitative phenotype associated with a given genotype. Hence, the information gain can be obtained for any phenotype distribution. Because any functional form, such as Gaussian, is not assumed for the entire distribution of a trait or a given genotype, this method is expected to be robust enough to be applied to any phenotypic association data. Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.

No MeSH data available.


Comparison of the hit ratios or the detection probabilities among the proposed m-spacing method, QMDR, and GMDR. Genomic datasets were generated based on 70 different penetrance functions [21], which were, in turn, classified into 7 distinct values of heritability. For each model, the phenotype values are simulated with normal (a), gamma (b), and mixed (c) distributions. High- and low-risk groups in a quantitative trait overlapped with 9 different combinations of the standard deviations. Considering all of the above, 100 data files were generated for each case, adding up to 9,000 simulated files being examined for each point in the plot.
© Copyright Policy - open-access
Related In: Results  -  Collection


getmorefigures.php?uid=PMC4538333&req=5

fig5: Comparison of the hit ratios or the detection probabilities among the proposed m-spacing method, QMDR, and GMDR. Genomic datasets were generated based on 70 different penetrance functions [21], which were, in turn, classified into 7 distinct values of heritability. For each model, the phenotype values are simulated with normal (a), gamma (b), and mixed (c) distributions. High- and low-risk groups in a quantitative trait overlapped with 9 different combinations of the standard deviations. Considering all of the above, 100 data files were generated for each case, adding up to 9,000 simulated files being examined for each point in the plot.

Mentions: The “hit ratio,” or detection power, of the IGS was evaluated and compared. Simulated data files described in the previous subsection were used. All of them had a single causal pair to identify. In addition to our proposed m-spacing method, QMDR and GMDR were used to compare the results. Figure 5 shows the comparison. Panels (a), (b), and (c) are for the quantitative trait of normal, gamma, and mixed distributions, respectively. Seventy penetrance models were grouped into 7 cases of heritability on the horizontal axis, while all 9 combinations of the variances in high- and low-risk distributions were merged into each heritability case. With a normal distribution, as shown in Figure 5(a), the m-spacing's performance was in between those of QMDR and GMDR for higher values of penetrance. However, in the range of penetrance less than 0.2, the m-spacing performs best. Note that the QMDR shows higher detection probability than the GMDR throughout the range. In the case of a gamma distribution, as shown in Figure 5(b), the QMDR's performance drops rapidly, as the heritability decreases when the hit ratios of m-spacing, as well as the GMDR, stay better than that of QMDR and are comparable to each other. Note the switch of the GMDR and QMDR's performance ranks with the change of the phenotype distribution. What QMDR does is essentially the dichotomization of the observed values of the quantitative phenotypes. Therefore, it should do better with well-defined symmetric distributions, such as a normal distribution, than with an asymmetric one (e.g., gamma distribution). The proposed m-spacing method is expected to be effective regardless of the shape of the phenotype distribution, because it makes no assumptions regarding the distribution and is therefore nonparametric, as demonstrated in Figures 5(a) and 5(b). This nonparameterization is again confirmed in Figure 5(c), showing that m-spacing outperforms the QMDR and the GMDR, throughout the whole range of heritability, in the case of the mixed form of phenotype distribution. Among the three methods examined, m-spacing was the most robust, performing consistently within the range of conditions for the simulation.


Detecting Genetic Interactions for Quantitative Traits Using m-Spacing Entropy Measure.

Yee J, Kwon MS, Jin S, Park T, Park M - Biomed Res Int (2015)

Comparison of the hit ratios or the detection probabilities among the proposed m-spacing method, QMDR, and GMDR. Genomic datasets were generated based on 70 different penetrance functions [21], which were, in turn, classified into 7 distinct values of heritability. For each model, the phenotype values are simulated with normal (a), gamma (b), and mixed (c) distributions. High- and low-risk groups in a quantitative trait overlapped with 9 different combinations of the standard deviations. Considering all of the above, 100 data files were generated for each case, adding up to 9,000 simulated files being examined for each point in the plot.
© Copyright Policy - open-access
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC4538333&req=5

fig5: Comparison of the hit ratios or the detection probabilities among the proposed m-spacing method, QMDR, and GMDR. Genomic datasets were generated based on 70 different penetrance functions [21], which were, in turn, classified into 7 distinct values of heritability. For each model, the phenotype values are simulated with normal (a), gamma (b), and mixed (c) distributions. High- and low-risk groups in a quantitative trait overlapped with 9 different combinations of the standard deviations. Considering all of the above, 100 data files were generated for each case, adding up to 9,000 simulated files being examined for each point in the plot.
Mentions: The “hit ratio,” or detection power, of the IGS was evaluated and compared. Simulated data files described in the previous subsection were used. All of them had a single causal pair to identify. In addition to our proposed m-spacing method, QMDR and GMDR were used to compare the results. Figure 5 shows the comparison. Panels (a), (b), and (c) are for the quantitative trait of normal, gamma, and mixed distributions, respectively. Seventy penetrance models were grouped into 7 cases of heritability on the horizontal axis, while all 9 combinations of the variances in high- and low-risk distributions were merged into each heritability case. With a normal distribution, as shown in Figure 5(a), the m-spacing's performance was in between those of QMDR and GMDR for higher values of penetrance. However, in the range of penetrance less than 0.2, the m-spacing performs best. Note that the QMDR shows higher detection probability than the GMDR throughout the range. In the case of a gamma distribution, as shown in Figure 5(b), the QMDR's performance drops rapidly, as the heritability decreases when the hit ratios of m-spacing, as well as the GMDR, stay better than that of QMDR and are comparable to each other. Note the switch of the GMDR and QMDR's performance ranks with the change of the phenotype distribution. What QMDR does is essentially the dichotomization of the observed values of the quantitative phenotypes. Therefore, it should do better with well-defined symmetric distributions, such as a normal distribution, than with an asymmetric one (e.g., gamma distribution). The proposed m-spacing method is expected to be effective regardless of the shape of the phenotype distribution, because it makes no assumptions regarding the distribution and is therefore nonparametric, as demonstrated in Figures 5(a) and 5(b). This nonparameterization is again confirmed in Figure 5(c), showing that m-spacing outperforms the QMDR and the GMDR, throughout the whole range of heritability, in the case of the mixed form of phenotype distribution. Among the three methods examined, m-spacing was the most robust, performing consistently within the range of conditions for the simulation.

Bottom Line: Information gain based on entropy measure has previously been successful in identifying genetic associations with binary traits.Hence, the information gain can be obtained for any phenotype distribution.Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.

View Article: PubMed Central - PubMed

Affiliation: Department of Physiology and Biophysics, Eulji University, Daejeon, Republic of Korea.

ABSTRACT
A number of statistical methods for detecting gene-gene interactions have been developed in genetic association studies with binary traits. However, many phenotype measures are intrinsically quantitative and categorizing continuous traits may not always be straightforward and meaningful. Association of gene-gene interactions with an observed distribution of such phenotypes needs to be investigated directly without categorization. Information gain based on entropy measure has previously been successful in identifying genetic associations with binary traits. We extend the usefulness of this information gain by proposing a nonparametric evaluation method of conditional entropy of a quantitative phenotype associated with a given genotype. Hence, the information gain can be obtained for any phenotype distribution. Because any functional form, such as Gaussian, is not assumed for the entire distribution of a trait or a given genotype, this method is expected to be robust enough to be applied to any phenotypic association data. Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.

No MeSH data available.