Fast set-based association analysis using summary data from GWAS identifies novel gene loci for human complex traits

View Article: PubMed Central - PubMed

ABSTRACT

We propose a method (fastBAT) that performs a fast set-based association analysis for human complex traits using summary-level data from genome-wide association studies (GWAS) and linkage disequilibrium (LD) data from a reference sample with individual-level genotypes. We demonstrate using simulations and analyses of real datasets that fastBAT is more accurate and orders of magnitude faster than the prevailing methods. Using fastBAT, we analyze summary data from the latest meta-analyses of GWAS on 150,064–339,224 individuals for height, body mass index (BMI), and schizophrenia. We identify 6 novel gene loci for height, 2 for BMI, and 3 for schizophrenia at P_fastBAT < 5 × 10^−8. The gain of power is due to multiple small independent association signals at these loci (e.g. the THRB and FOXP1 loci for schizophrenia). The method is general and can be applied to GWAS data for all complex traits and diseases in humans and to such data in other species.
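
To make the idea of a set-based test from summary data concrete, the following is a minimal sketch, not the fastBAT implementation. It assumes the set statistic is the sum of squared per-SNP z-scores and that the LD matrix from a reference sample is positive definite; the null distribution is obtained by simulating correlated z-scores rather than by any analytical method. The function names (`cholesky`, `set_based_p`, `single_snp_p`) and all parameters are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def set_based_p(z_scores, ld, n_sims=5000, seed=1):
    """Empirical p-value for the sum of squared z-scores in a SNP set.

    Assumption: under the null, the z-scores are multivariate normal
    with correlation matrix `ld` (estimated from a reference sample).
    """
    rng = random.Random(seed)
    lower = cholesky(ld)
    m = len(z_scores)
    observed = sum(z * z for z in z_scores)
    hits = 0
    for _ in range(n_sims):
        noise = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # Multiply by the Cholesky factor to induce the LD correlation.
        null_z = [sum(lower[i][k] * noise[k] for k in range(i + 1))
                  for i in range(m)]
        if sum(z * z for z in null_z) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sims + 1)


def single_snp_p(z):
    """Two-sided normal p-value for one SNP's z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A simulation of this kind cannot resolve p-values near 5 × 10^−8 without an impractical number of draws, so it only illustrates the logic. For example, with ten uncorrelated SNPs of which one has z = 4 and the rest have z = 0, the set-based p-value is larger than the single-SNP p-value, because the one signal is diluted across the set.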


Figure 3: Conditional association analysis at the novel gene loci for height, BMI and schizophrenia. In the analysis of the latest GWAS data for height, BMI and schizophrenia, there are 5 gene loci at which the signal from fastBAT is orders of magnitude stronger than that of the top associated SNP from GWAS, suggesting that there are multiple independent causal variants in these regions. Shown are the results from the GCTA conditional analysis using the 1KGP-imputed HRS data as the reference for LD estimation (Methods). Shown in red are the original GWAS results, in green are the results from the conditional analysis conditioning on the top SNP (labeled as ‘Cond round 1’), and in blue are the results from the conditional analysis conditioning on the top two independent signals (‘Cond round 2’). The blue horizontal line represents the threshold p-value of 5 × 10^−8.
© Copyright Policy - open-access

Mentions: Of the novel genes identified by fastBAT using the latest GWAS data, 6 genes for height, 2 for BMI, and 3 for schizophrenia (SCZ) passed the commonly used GWAS threshold p-value (i.e. P < 5 × 10^−8). While a few of these results are likely due to sampling, e.g. P_fastBAT just passed the threshold whereas P_GWAS of the top associated SNP fell just short of it, there were genes for which P_fastBAT was orders of magnitude smaller than P_GWAS of the top associated SNP. These include THRB and FOXP1 for height, SCAMP4 for BMI, and FOXP1 and ZNF365 for SCZ. We have shown by simulations above that if there is only one causal variant at a locus, P_fastBAT is expected to be larger than P_GWAS of the top associated SNP (Supplementary Fig. 1). Hence, the gain of power for fastBAT at these 5 loci is likely due to multiple signals. We therefore performed GCTA-COJO conditional analysis in these 5 gene regions (Methods). We found that there was at least one secondary (but not genome-wide significant) signal after conditioning on the top associated SNP in each of these regions (Fig. 3). Interestingly, FOXP1 was associated with both height (P_fastBAT = 1.7 × 10^−9) and SCZ (P_fastBAT = 3.8 × 10^−12), consistent with previous evidence that de novo mutations in FOXP1 cause intellectual disability, autism, and language impairment in humans [18], that the gene expression level of FOXP1 is increased in autism patients [19], and that Foxp1 deletion impairs neuronal development and causes autistic-like behaviour in mice [20].
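
The logic of conditioning on the top associated SNP can be illustrated with a two-SNP approximation on z-scores. This is a minimal sketch and not the GCTA-COJO procedure: it assumes both SNPs were tested in the same sample, that per-SNP effects are small, and that the LD correlation `r` between the two SNPs is known from a reference sample. The function names are hypothetical.

```python
import math


def conditional_z(z_top, z_other, r):
    """Approximate z-score of a second SNP after conditioning on the top SNP.

    Assumption: z_other has expectation r * z_top if the second SNP has
    no effect of its own, so the residual is rescaled by sqrt(1 - r^2).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return (z_other - r * z_top) / math.sqrt(1.0 - r * r)


def two_sided_p(z):
    """Two-sided normal p-value for a z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

Under these assumptions, a second SNP with z = 4 that is in LD (r = 0.8) with a top SNP of z = 5 has a conditional z-score of about 0, so its marginal signal is fully explained by LD. The same SNP with r = 0 keeps its z-score of 4 and would count as an independent secondary signal.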

