Identification of type 2 diabetes-associated combination of SNPs using support vector machine.

Ban HJ, Heo JY, Oh KS, Park KJ - BMC Genet. (2010)

Bottom Line: Recent studies have tried to estimate the effects of multiple genes and multiple loci in genome-wide association studies. However, estimating these effects is very difficult. The support vector machine (SVM)-based feature selection method used in this research found a novel association between combinations of SNPs and T2D in a Korean population.


Affiliation: Division of Bio-Medical Informatics, Center for Genome Science, National Institute of Health, Korea Center for Disease Control and Prevention, 194, Tongil-Lo, Eunpyung-Gu, Seoul 122-701, Republic of Korea.

ABSTRACT

Background: Type 2 diabetes mellitus (T2D), a metabolic disorder characterized by insulin resistance and relative insulin deficiency, is a complex disease of major public health importance. Its incidence is rapidly increasing in developed countries. Complex diseases are caused by interactions between multiple genes and environmental factors. Most association studies aim to identify individual susceptibility markers using a simple disease model. Recent studies have tried to estimate the effects of multiple genes and multiple loci in genome-wide association studies. However, estimating these effects is very difficult. We aim to assess rules for classifying diseased and normal subjects by evaluating potential gene-gene interactions in the same or distinct biological pathways.

Results: We analyzed the importance of gene-gene interactions in T2D susceptibility by investigating 408 single nucleotide polymorphisms (SNPs) in 87 genes involved in major T2D-related pathways in 462 T2D patients and 456 healthy controls from Korean cohort studies. We evaluated the support vector machine (SVM) method for differentiating between cases and controls using SNP information in a 10-fold cross-validation test. We achieved a 65.3% prediction rate with a combination of 14 SNPs in 12 genes by using the radial basis function (RBF)-kernel SVM. Similarly, we investigated subpopulation data sets of men and women and identified different SNP combinations with prediction rates of 70.9% and 70.6%, respectively. As high-throughput technology for genome-wide SNPs improves, it is likely that a much higher prediction rate with a more biologically interesting combination of SNPs can be obtained by using this method.

Conclusions: The support vector machine (SVM)-based feature selection method used in this research found a novel association between combinations of SNPs and T2D in a Korean population.
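
As an illustration only, and not the authors' implementation, the following sketch shows two ingredients named in the Results: the RBF kernel that an SVM uses to compare the SNP profiles of two subjects, and a 10-fold partition of the 918 subjects (462 cases and 456 controls). The additive 0/1/2 genotype coding, the gamma value, the example genotypes, and the function names are all assumptions made for this example; none of them are taken from the paper.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives fold sizes that differ by at most one.
    return [indices[i::k] for i in range(k)]


# Hypothetical genotypes of two subjects at 14 SNPs, coded as
# minor-allele counts (0, 1, 2). This coding is an assumption.
subject_a = [0, 1, 2, 0, 0, 1, 1, 2, 0, 0, 1, 0, 2, 1]
subject_b = [0, 1, 1, 0, 2, 1, 0, 2, 0, 1, 1, 0, 2, 1]

# gamma = 1 / (number of features) is a common default, assumed here.
similarity = rbf_kernel(subject_a, subject_b, gamma=1.0 / 14)

# 462 cases + 456 controls = 918 subjects, split into 10 folds.
folds = k_fold_indices(462 + 456, k=10)

for test_fold in folds:
    held_out = set(test_fold)
    train = [i for fold in folds for i in fold if i not in held_out]
    # A classifier would be fit on `train` and scored on `test_fold` here.
    assert len(train) + len(test_fold) == 918
```

In a 10-fold test of this kind, each fold serves once as the held-out set, and the prediction rate is the average accuracy over the ten held-out folds.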



Figure 2: PPI network from the SNP combination of (a) the total population set, (b) the men sub-population set, and (c) the women sub-population set. For each population set, the largest PPI network was constructed using the PPI information database at http://genomenetwork.nig.ac.jp. Circles denote proteins included in the best combination of SNPs from the SVM results; squares denote proteins added to connect the circle proteins into a network.

Mentions: On the basis of the SNP combinations, we looked for biological information; one result is the protein-protein interaction (PPI) network (Figure 2), which was constructed from the SNP combinations. The SNP genotype data sets were not acquired from a fine-mapping association study; therefore, direct analysis of SNP-SNP interactions, or analysis focused on individual promoter, intron, or exon SNPs, is difficult. For this reason, we carried out the analysis at the protein (gene) level rather than the SNP level.

