Unrecognized sequence homologies may confound genome-wide association studies.

Galichon P, Mesnard L, Hertig A, Stengel B, Rondeau E - Nucleic Acids Res. (2012)

Bottom Line: Genome-wide association studies (GWAS) have become a preferred method to identify new genetic susceptibility loci. Using genetic differences between male and female subjects as a model to study the effect of one specific genomic region on the whole SNP microarray, we provide strong evidence that the use of standard methods for GWAS can be misleading. We suggest a new systematic quality control step in the biological interpretation of previous and future GWAS.


Affiliation: INSERM UMR S702, Université Pierre et Marie Curie - Paris 6, 75006 Paris, France. galichon@orange.fr

ABSTRACT
Genome-wide association studies (GWAS) have become a preferred method to identify new genetic susceptibility loci. This technique aims to elucidate the molecular etiology of common diseases, but in many cases it has identified loci with no obvious biological relevance. Herein, we show that previously unrecognized sequence homologies have caused single-nucleotide polymorphism (SNP) microarrays to incorrectly associate a phenotype with a given locus when in fact the linkage is to another, distant locus. Using genetic differences between male and female subjects as a model to study the effect of one specific genomic region on the whole SNP microarray, we provide strong evidence that the use of standard methods for GWAS can be misleading. We suggest a new systematic quality control step in the biological interpretation of previous and future GWAS.


Figure 1 (gks169-F1): BLAST alignment analysis of the flanking sequence of a sex-associated SNP (rs12372818 on chromosome 13). Two homologous sequences are present on the Y chromosome (and one on chromosome 3). The presence of the ‘A’ variant on chromosome Y is responsible for a higher frequency of the minor allele in males.
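The mechanism described in the caption can be illustrated with a toy model. The sketch below is not the authors' method: it assumes Hardy-Weinberg genotype proportions at the autosomal locus, a hypothetical number of Y-linked homologous copies that all carry the minor allele, and a hypothetical genotype caller that thresholds the fraction of minor-allele signal. The function name, the thresholds, and the allele frequency are all illustrative assumptions.

```python
def apparent_minor_allele_frequency(q, y_copies, low=1 / 3, high=2 / 3):
    """Minor allele frequency a toy genotype caller would report.

    q        -- true minor allele frequency at the autosomal locus (assumed)
    y_copies -- hypothetical number of cross-hybridizing Y-linked copies,
                all assumed to carry the minor allele (0 for females)
    low/high -- hypothetical signal-fraction thresholds used for calling
    """
    # Hardy-Weinberg probabilities for 0, 1 or 2 autosomal minor alleles.
    genotypes = ((0, (1 - q) ** 2), (1, 2 * q * (1 - q)), (2, q ** 2))
    expected_called = 0.0
    for autosomal_minor, prob in genotypes:
        # Fraction of probe signal coming from the minor allele.
        fraction = (autosomal_minor + y_copies) / (2 + y_copies)
        if fraction < low:
            called = 0  # called homozygous major
        elif fraction <= high:
            called = 1  # called heterozygous
        else:
            called = 2  # called homozygous minor
        expected_called += prob * called
    return expected_called / 2


# Illustrative values only; q = 0.2 is not taken from the article.
female_maf = apparent_minor_allele_frequency(0.2, y_copies=0)
male_maf = apparent_minor_allele_frequency(0.2, y_copies=2)
print(female_maf, male_maf)
```

Under these assumptions the female estimate equals the true frequency, while the male estimate is inflated, which is the direction of bias the caption describes.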

Mentions: Because Mendelian principles of allelic transmission do not explain the association of autosomal loci with sex, we investigated whether nucleotide sequences on sex chromosomes could hybridize to the oligonucleotide probes of autosomal SNPs in various microarrays. Analysis of 28 of the SNP-flanking sequences (i.e. one for each autosomal locus we had found associated with sex in the first step) using BLAST revealed that 21 of the 28 probes shared total or partial homology with sequences on the Y or X chromosome. All alignments of the SNP-flanking sequences and their locations on the genome can be found in Supplementary Data sets S1 and S2. Figure 1 shows the sequence alignment of a representative SNP-flanking sequence with an autosomal target sequence and with the homolog on a sex chromosome. We picked 28 random SNPs among those that were not found to be associated with sex and used them as controls. The BLAST alignment showed that 26 of 28 SNPs had flanking sequences fully specific to their theoretical location, one had one homology on another autosome, and only 1 of 28 had many weak homologies on other chromosomes, including chromosome X (Supplementary Data set S3). We then aligned all probes' flanking sequences from Data sets 1 and 2 against the X and Y chromosome sequences; the association of autosomal SNPs with sex versus homologies on sex chromosomes is represented in Figure 2 and Supplementary Figure S1. When comparing chi-square statistics of probes with homologies versus probes with no homologies on sex chromosomes, we found that some groups of probes with high homologies on sex chromosomes had a significantly stronger association with sex. Interestingly, in Data set 2 (Supplementary Figure S1), this was still true after exclusion of all SNPs showing an association with sex after Bonferroni correction.
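The kind of per-SNP test referred to above can be sketched as follows. This is a minimal illustration, assuming a 2x2 table of allele counts by sex, a Pearson chi-square test with one degree of freedom and no continuity correction, and a Bonferroni threshold of alpha divided by the number of SNPs tested. The counts, the number of tests, and the function names are hypothetical and not taken from the article, which does not specify its exact test configuration here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("degenerate table: a row or column sums to zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Hypothetical allele counts: (minor, major) in males, then in females.
chi2, p_value = chi_square_2x2(300, 700, 200, 800)
threshold = bonferroni_threshold(0.05, 500_000)  # hypothetical number of SNPs
print(chi2, p_value, p_value < threshold)
```

Running the same test on every autosomal SNP with sex as the phenotype, then checking the flanking sequences of the top hits for sex-chromosome homology, mirrors the quality control step the authors propose.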

