Limits...
Constructing gene association networks for rheumatoid arthritis using the backward genotype-trait association (BGTA) algorithm.

Ding Y, Cong L, Ionita-Laza I, Lo SH, Zheng T - BMC Proc (2007)

Bottom Line: For the candidate genes, we found strong signals for PTPN22 and SUMO4.Based on significant association evidence, we built an association network among the loci of PTPN22, PADI4, DLG5, SLC22A4, SUMO4, and CARD15.Using the BGTA algorithm, we identified genetic loci and candidate genes that were associated with RA susceptibility and association networks among them.

View Article: PubMed Central - HTML - PubMed

Affiliation: Department of Statistics, Columbia University, New York, New York 10027, USA. yding@stat.columbia.edu

ABSTRACT

Background: Rheumatoid arthritis (RA, MIM 180300) is a common and complex inflammatory disorder. The North American Rheumatoid Arthritis Consortium (NARAC) data, as part of the Genetic Analysis Workshop 15 data, consists of both genome scan and candidate gene studies on RA patients.

Results: We applied the backward genotype-trait association (BGTA) algorithm to capture marginal and gene x gene interaction effects of multiple susceptibility loci on RA disease status. A two-stage screening approach was used for the genome scan, whereas a comprehensive study of all possible subsets was conducted for the candidate genes. For the genome scan, we constructed an association network among 39 genetic loci that demonstrated strong signals, 19 of which have been reported in the RA literature. For the candidate genes, we found strong signals for PTPN22 and SUMO4. Based on significant association evidence, we built an association network among the loci of PTPN22, PADI4, DLG5, SLC22A4, SUMO4, and CARD15. To control for false positives, we used permutation tests to constrain the family-wise type I error rate to 1%.

Conclusion: Using the BGTA algorithm, we identified genetic loci and candidate genes that were associated with RA susceptibility and association networks among them. For the first time, we report possible interactions between single-nucleotide polymorphisms/genes, which may be useful for biological interpretation.

No MeSH data available.


Related in: MedlinePlus

Evaluation of significance of in the RA candidategene study. For subsets of a given size, we plotted the highest GTD score (red solid line) with a 95% confidence interval (with Bonferroni correction for 220 - 1 multiple comparisons). The top GTD scores for sets of two to eight marker sets were significantly higher than the expected value under the  hypothesis (black dotted line at 1). Based on 100 permuted data sets, at different subset sizes, the black solid line displays the median maximum GTD scores and the vertical bars are the 95% confidence interval of permutations. For each permutation, we also calculated Bonferroni-corrected 95% confidence intervals for the maximum GTD scores. The blue shading indicates the coverage of these 100 confidence intervals at each subset size (the darkest being 0.9 to 1 (or 90 to 100%) and the lightest being 0.0 to 0.1 (or 0 to 10%)).
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2367461&req=5

Figure 2: Evaluation of significance of in the RA candidategene study. For subsets of a given size, we plotted the highest GTD score (red solid line) with a 95% confidence interval (with Bonferroni correction for 220 - 1 multiple comparisons). The top GTD scores for sets of two to eight marker sets were significantly higher than the expected value under the hypothesis (black dotted line at 1). Based on 100 permuted data sets, at different subset sizes, the black solid line displays the median maximum GTD scores and the vertical bars are the 95% confidence interval of permutations. For each permutation, we also calculated Bonferroni-corrected 95% confidence intervals for the maximum GTD scores. The blue shading indicates the coverage of these 100 confidence intervals at each subset size (the darkest being 0.9 to 1 (or 90 to 100%) and the lightest being 0.0 to 0.1 (or 0 to 10%)).

Mentions: To estimate the distribution of GTD scores of local optimal BGTA-irreducible SNP clusters under the hypothesis that no SNP is associated with the trait, we permuted the labels of disease status to create a simulated data set. For the two-stage study, we examined the GTD score density of returned clusters in the second stage from complete two-stage analyses of 50 permuted data sets. False discovery rate (FDR) was controlled by comparing the observed real density and the density under the hypothesis (see [7-9]). For the candidate gene study, we set the selection threshold for each SNP subset size to be the maximum among 100 permuted replicates (the red dotted line in Figure 2) and identified local optimal BGTA-irreducible SNP subsets as significant at the 1% family-wise level. The relatively small number of permutations was due to the high computational burden of the analytical approach. This would result in non-trivial margin of error for the significance levels evaluated, which needs to be taken into consideration when interpreting the results.


Constructing gene association networks for rheumatoid arthritis using the backward genotype-trait association (BGTA) algorithm.

Ding Y, Cong L, Ionita-Laza I, Lo SH, Zheng T - BMC Proc (2007)

Evaluation of significance of in the RA candidategene study. For subsets of a given size, we plotted the highest GTD score (red solid line) with a 95% confidence interval (with Bonferroni correction for 220 - 1 multiple comparisons). The top GTD scores for sets of two to eight marker sets were significantly higher than the expected value under the  hypothesis (black dotted line at 1). Based on 100 permuted data sets, at different subset sizes, the black solid line displays the median maximum GTD scores and the vertical bars are the 95% confidence interval of permutations. For each permutation, we also calculated Bonferroni-corrected 95% confidence intervals for the maximum GTD scores. The blue shading indicates the coverage of these 100 confidence intervals at each subset size (the darkest being 0.9 to 1 (or 90 to 100%) and the lightest being 0.0 to 0.1 (or 0 to 10%)).
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2367461&req=5

Figure 2: Evaluation of significance of in the RA candidategene study. For subsets of a given size, we plotted the highest GTD score (red solid line) with a 95% confidence interval (with Bonferroni correction for 220 - 1 multiple comparisons). The top GTD scores for sets of two to eight marker sets were significantly higher than the expected value under the hypothesis (black dotted line at 1). Based on 100 permuted data sets, at different subset sizes, the black solid line displays the median maximum GTD scores and the vertical bars are the 95% confidence interval of permutations. For each permutation, we also calculated Bonferroni-corrected 95% confidence intervals for the maximum GTD scores. The blue shading indicates the coverage of these 100 confidence intervals at each subset size (the darkest being 0.9 to 1 (or 90 to 100%) and the lightest being 0.0 to 0.1 (or 0 to 10%)).
Mentions: To estimate the distribution of GTD scores of local optimal BGTA-irreducible SNP clusters under the hypothesis that no SNP is associated with the trait, we permuted the labels of disease status to create a simulated data set. For the two-stage study, we examined the GTD score density of returned clusters in the second stage from complete two-stage analyses of 50 permuted data sets. False discovery rate (FDR) was controlled by comparing the observed real density and the density under the hypothesis (see [7-9]). For the candidate gene study, we set the selection threshold for each SNP subset size to be the maximum among 100 permuted replicates (the red dotted line in Figure 2) and identified local optimal BGTA-irreducible SNP subsets as significant at the 1% family-wise level. The relatively small number of permutations was due to the high computational burden of the analytical approach. This would result in non-trivial margin of error for the significance levels evaluated, which needs to be taken into consideration when interpreting the results.

Bottom Line: For the candidate genes, we found strong signals for PTPN22 and SUMO4.Based on significant association evidence, we built an association network among the loci of PTPN22, PADI4, DLG5, SLC22A4, SUMO4, and CARD15.Using the BGTA algorithm, we identified genetic loci and candidate genes that were associated with RA susceptibility and association networks among them.

View Article: PubMed Central - HTML - PubMed

Affiliation: Department of Statistics, Columbia University, New York, New York 10027, USA. yding@stat.columbia.edu

ABSTRACT

Background: Rheumatoid arthritis (RA, MIM 180300) is a common and complex inflammatory disorder. The North American Rheumatoid Arthritis Consortium (NARAC) data, as part of the Genetic Analysis Workshop 15 data, consists of both genome scan and candidate gene studies on RA patients.

Results: We applied the backward genotype-trait association (BGTA) algorithm to capture marginal and gene x gene interaction effects of multiple susceptibility loci on RA disease status. A two-stage screening approach was used for the genome scan, whereas a comprehensive study of all possible subsets was conducted for the candidate genes. For the genome scan, we constructed an association network among 39 genetic loci that demonstrated strong signals, 19 of which have been reported in the RA literature. For the candidate genes, we found strong signals for PTPN22 and SUMO4. Based on significant association evidence, we built an association network among the loci of PTPN22, PADI4, DLG5, SLC22A4, SUMO4, and CARD15. To control for false positives, we used permutation tests to constrain the family-wise type I error rate to 1%.

Conclusion: Using the BGTA algorithm, we identified genetic loci and candidate genes that were associated with RA susceptibility and association networks among them. For the first time, we report possible interactions between single-nucleotide polymorphisms/genes, which may be useful for biological interpretation.

No MeSH data available.


Related in: MedlinePlus