Modeling the complex gene x environment interplay in the simulated rheumatoid arthritis GAW15 data using latent variable structural equation modeling.

Nock NL, Larkin EK, Morris NJ, Li Y, Stein CM - BMC Proc (2007)

Bottom Line: Our approach holds promise in unravelling complex diseases and improves upon current "one SNP (haplotype)-at-a-time" regression approaches by decreasing the number of statistical tests while minimizing problems with multicollinearity and haplotype estimation algorithm error. Furthermore, when genes are modeled as latent constructs simultaneously with other key cofactors, the approach provides enhanced control of confounding that should lead to less biased effect estimates among genes as well as between gene(s) and the complex disease. Moreover, because some a priori biological information is needed to form an initial substantive model, our approach may be most appropriate for candidate gene SNP panel applications.


Affiliation: Division of Genetic and Molecular Epidemiology, Case Western Reserve University, Cleveland, OH 44106-7281, USA. nln@case.edu

ABSTRACT
Rheumatoid arthritis is a complex disease that appears to involve multiple genetic and environmental factors. Using the Genetic Analysis Workshop 15 simulated rheumatoid arthritis data and the structural equation modeling framework, we tested hypothesized "causal" rheumatoid arthritis model(s) by employing a novel latent gene construct approach that models individual genes as latent variables defined by multiple dense and non-dense single-nucleotide polymorphisms (SNPs). Our approach produced valid latent gene constructs, particularly with dense SNPs, which, when coupled with other factors involved in rheumatoid arthritis, generated good-fitting models by certain goodness-of-fit indices. We observed that Genes F, C, and DR, along with sex and smoking, were significant predictors of rheumatoid arthritis but Genes A and E were not, which was generally, but not entirely, consistent with how the data were simulated. Our approach holds promise in unravelling complex diseases and improves upon current "one SNP (haplotype)-at-a-time" regression approaches by decreasing the number of statistical tests while minimizing problems with multicollinearity and haplotype estimation algorithm error. Furthermore, when genes are modeled as latent constructs simultaneously with other key cofactors, the approach provides enhanced control of confounding that should lead to less biased effect estimates among genes as well as between gene(s) and the complex disease. However, further study is needed to quantify bias, evaluate fit index disparity, and resolve multiplicative latent gene interactions. Moreover, because some a priori biological information is needed to form an initial substantive model, our approach may be most appropriate for candidate gene SNP panel applications.
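The abstract refers to "valid" latent gene constructs, and the results text reports an average variance extracted (AVE) for Gene D. As a rough illustration of what that statistic measures, the sketch below computes AVE as the mean of the squared standardized loadings of a construct's indicators. The loadings are hypothetical and are not taken from the paper, and the 0.50 cutoff is a commonly cited convention that is assumed here rather than quoted from the authors.

```python
def average_variance_extracted(loadings):
    """Mean of squared standardized loadings for one latent construct."""
    if not loadings:
        raise ValueError("at least one loading is required")
    return sum(l * l for l in loadings) / len(loadings)


# Hypothetical standardized loadings of five SNP indicators on one latent gene
# (illustrative values only; not from the paper).
hypothetical_loadings = [0.85, 0.80, 0.75, 0.70, 0.72]

# Assumed conventional cutoff: the construct explains at least half of the
# variance in its indicators.
ASSUMED_AVE_CUTOFF = 0.50

ave = average_variance_extracted(hypothetical_loadings)
construct_looks_valid = ave >= ASSUMED_AVE_CUTOFF

if __name__ == "__main__":
    print(f"AVE = {ave:.3f}, meets assumed cutoff: {construct_looks_valid}")
```

Under this reading, a construct whose SNP loadings are mostly weak would fall below the cutoff, which may be one way to interpret the reported difficulty in devising a good construct for Locus A.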



Figure 1: Evaluation of the GAW15 simulated rheumatoid arthritis (RA) model. Measurement model loadings depict relationships between observed variables (rectangles) and latent variables (ovals), and structural model path coefficients depict relationships between latent variables. Corresponding standard errors in parentheses are above single-headed arrows. Correlations are above double-headed arrows. Red arrows indicate the simulated locus location. *p-value ≤ 0.05.

Mentions: We analyzed a "full" model with all genes (Gene A, C, D, E, and F), gender and smoking as covariates, and RA as a dichotomous outcome (Fig. 1). The model fit well by CFI (0.96) but not by RMSEA (0.12) or WRMR (6.53) fit index standards. To obtain convergence, we had to remove two SNPs (dSNP6_3918; dSNP6_3919) initially used in constructing latent variable Gene D. We could not determine the exact source of this problem but speculate that it may have been due to some type of linear dependency between these SNPs arising from the "weak" LD simulated between these loci. Nevertheless, removing the two SNPs did not alter the validity of Gene D (eigenvalue = 3.21; AVE = 0.59). The specific measurement model loadings and path coefficients for the "full" model are shown in Figure 1. The largest significant path coefficient between a gene construct and RA was observed with Gene C (β = -0.609 ± 0.020 standard error; p ≤ 0.05), and the inverse nature of this association may reflect the increased risk simulated with the wild-type "C" allele. Gene C was also highly correlated with DR (ρ = 0.895 ± 0.031), which was expected given that Locus C was simulated to be in complete LD with DR (D' = 1.0). We also found a strong positive path coefficient between Gene F and RA (β = 0.274 ± 0.033; p ≤ 0.05), which was expected because Locus F was simulated to confer risk from IgM on RA. However, the path coefficient between Gene D, which was simulated to be in "weak" LD with DR, and RA (β = 0.024 ± 0.027) was trivial in our model. We hypothesize that the very low simulated frequency of the "D" risk allele contributed to this discrepancy. In addition, although the effects of DR were simulated to be controlled by Locus A, the path between Gene A and DR was negligible (β = 0.001 ± 0.022). This, however, may have been driven by our inability to devise a good construct for Locus A.
When we added IgM (and/or anti-CCP), the model did not converge, likely because assigning 0 values to controls produced a skewed, edge-effect distribution.
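The significance calls for the path coefficients reported above can be roughly cross-checked from the estimates and their standard errors alone. The sketch below divides each reported β by its standard error and converts the ratio to a two-sided normal p-value. Treating the ratio as a standard normal (Wald) statistic is an assumption made for illustration; the authors' software may have computed its tests differently, so this is a sanity check and not a reproduction of their analysis.

```python
import math

# (estimate, standard error) pairs as reported in the text above.
paths = {
    "Gene C -> RA": (-0.609, 0.020),
    "Gene F -> RA": (0.274, 0.033),
    "Gene D -> RA": (0.024, 0.027),
    "Gene A -> DR": (0.001, 0.022),
}


def wald_z(estimate, standard_error):
    """Ratio of an estimate to its standard error."""
    return estimate / standard_error


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def summarize(path_table, alpha=0.05):
    """Return {path: (z, p, significant)} for each reported coefficient."""
    summary = {}
    for name, (estimate, standard_error) in path_table.items():
        z = wald_z(estimate, standard_error)
        p = two_sided_p(z)
        summary[name] = (z, p, p <= alpha)
    return summary


if __name__ == "__main__":
    for name, (z, p, significant) in summarize(paths).items():
        print(f"{name}: z = {z:.2f}, p = {p:.3g}, significant = {significant}")
```

Under this approximation, the Gene C and Gene F paths lie far from zero relative to their standard errors, while the Gene D and Gene A paths do not, which agrees with the pattern described in the text.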

