Non-synonymous and synonymous coding SNPs show similar likelihood and effect size of human disease association.

Chen R, Davydov EV, Sirota M, Butte AJ - PLoS ONE (2010)

Bottom Line: We conducted a broad survey across 21,429 disease-SNP associations curated from 2,113 publications studying human genetic association, and found that nsSNPs and sSNPs shared similar likelihood and effect size for disease association. The enrichment of disease-associated SNPs around the 80th base in the first introns might provide an effective way to prioritize intronic SNPs for functional studies. We further found that the likelihood of disease association was positively associated with the effect size across different types of SNPs, and SNPs in the 3' untranslated regions, such as the microRNA binding sites, might be under-investigated.


Affiliation: Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States of America. rchen1@stanford.edu

ABSTRACT
Many DNA variants have been identified for more than 300 diseases and traits using Genome-Wide Association Studies (GWASs). Some have been validated using deep sequencing, but far fewer have been validated functionally, and those efforts have focused primarily on non-synonymous coding SNPs (nsSNPs). It is an open question whether synonymous coding SNPs (sSNPs) and other non-coding SNPs can yield odds ratios as high as those of nsSNPs. We conducted a broad survey across 21,429 disease-SNP associations curated from 2,113 publications studying human genetic association, and found that nsSNPs and sSNPs shared similar likelihood and effect size for disease association. The enrichment of disease-associated SNPs around the 80th base in the first introns might provide an effective way to prioritize intronic SNPs for functional studies. We further found that the likelihood of disease association was positively associated with the effect size across different types of SNPs, and SNPs in the 3' untranslated regions, such as the microRNA binding sites, might be under-investigated. Our results suggest that sSNPs are just as likely as nsSNPs to be involved in disease mechanisms, so we recommend that sSNPs discovered from GWAS should also be examined with functional studies.


Figure 1 (pone-0013574-g001): A curated quantitative disease-SNP association database. Starting from a list of all SNPs measured in the HapMap 3 project, we searched for their presence in all Medline abstracts, eliminating non-human studies. Significant SNP-disease associations were manually curated from the full text and reviewed in four rounds. SNP IDs were annotated using the UCSC Genome Browser for positions and functional types, and using Entrez for associated genes. Disease MeSH terms were compared with the Unified Medical Language System (UMLS) to select concept unique identifiers (CUIs).

Mentions: As previously described [7], starting from a list of Medline abstracts that contain a dbSNP ID measured in the HapMap 3 project [8], we manually curated 2,113 publications and recorded more than 100 features of the disease-SNP associations, including the disease name (e.g. coronary artery disease), specific phenotype (e.g. acute coronary syndrome in coronary artery disease), study population (e.g. Portuguese), case and control population (e.g. coronary artery disease patients vs. healthy controls), genotyping technology, major/minor alleles, odds ratio, 95% confidence interval of the odds ratio, published p-value, and genetic model (Fig. 1). By categorizing studies based on similar diseases, we manually extracted the disease Medical Subject Heading (MeSH) terms [9] and mapped them to the Concept Unique Identifiers (CUIs) in the Unified Medical Language System (UMLS) [10] to standardize disease names. We then annotated all SNPs using the UCSC Genome Browser and NCBI Entrez to retrieve the chromosome location, functional type, and associated genes for each SNP.
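As a rough illustration of how one curated record and a per-type effect-size summary might be represented, the following Python sketch may help. It is not the authors' code: the class name `Association`, its field names, the helper functions, and the choice to fold protective odds ratios (below 1) onto the risk side are all assumptions made for this example, and only a handful of the more than 100 curated features are shown.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Association:
    """One curated disease-SNP association (hypothetical field names)."""
    rsid: str                      # dbSNP ID
    disease_cui: str               # UMLS concept unique identifier
    functional_type: str           # e.g. "nonsynonymous", "synonymous", "intron"
    odds_ratio: Optional[float]    # None when the paper reports no odds ratio
    ci_lower: Optional[float]      # lower bound of the 95% confidence interval
    ci_upper: Optional[float]      # upper bound of the 95% confidence interval
    p_value: Optional[float]       # published p-value


def effect_size(odds_ratio: float) -> float:
    """Fold protective odds ratios (< 1) onto the risk side.

    Assumption of this sketch: OR = 0.5 and OR = 2.0 count as equally
    strong effects, so values below 1 are replaced by their reciprocal.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return odds_ratio if odds_ratio >= 1 else 1 / odds_ratio


def median_effect_by_type(records: Iterable[Association]) -> Dict[str, float]:
    """Median folded odds ratio per functional type, skipping missing ORs."""
    grouped = defaultdict(list)
    for rec in records:
        if rec.odds_ratio is not None:
            grouped[rec.functional_type].append(effect_size(rec.odds_ratio))
    return {ftype: median(values) for ftype, values in grouped.items()}


# Made-up placeholder records, not data from the study.
example_records = [
    Association("rs_placeholder_1", "CUI_A", "nonsynonymous", 2.0, 1.5, 2.7, 0.001),
    Association("rs_placeholder_2", "CUI_A", "nonsynonymous", 0.5, 0.4, 0.7, 0.002),
    Association("rs_placeholder_3", "CUI_B", "synonymous", 1.5, 1.1, 2.0, 0.01),
    Association("rs_placeholder_4", "CUI_B", "synonymous", None, None, None, 0.03),
]
summary = median_effect_by_type(example_records)
```

Grouping by `functional_type` mirrors the kind of comparison the survey draws between nsSNPs and sSNPs, but the actual statistical procedure used in the paper is not described in this passage and is not reproduced here.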

