Genotype-based test in mapping cis-regulatory variants from allele-specific expression data.

Lefebvre JF, Vello E, Ge B, Montgomery SB, Dermitzakis ET, Pastinen T, Labuda D - PLoS ONE (2012)

Bottom Line: In this study, we introduce a genotype-based mapping test that does not require haplotype-phase inference to locate regulatory regions. The genotype-based test performed equally well with the experimental AI datasets, either from genome-wide cDNA hybridization arrays or from RNA sequencing. By avoiding the need for haplotype inference, the genotype-based test will suit AI analyses in population samples of unknown haplotype structure and will additionally facilitate the identification of cis-regulatory variants that are located far away from the regulated transcript.

Affiliation: Centre de Recherche du CHU Sainte-Justine, Université de Montréal, Montréal, Québec, Canada.

ABSTRACT
Identifying and understanding the impact of gene regulatory variation is of considerable importance in evolutionary and medical genetics; such variants are thought to be responsible for human-specific adaptation and to have an important role in genetic disease. Regulatory variation in cis is readily detected in individuals showing uneven expression of a transcript from its two allelic copies, an observation referred to as allelic imbalance (AI). Identifying individuals exhibiting AI allows mapping of regulatory DNA regions and offers the potential to identify the underlying causal genetic variant(s). However, existing mapping methods require knowledge of the haplotypes, which makes them sensitive to phasing errors. In this study, we introduce a genotype-based mapping test that does not require haplotype-phase inference to locate regulatory regions. The test relies on partitioning the genotypes of individuals exhibiting AI and of those not exhibiting AI into a 2×3 contingency table. The performance of this test in detecting linkage disequilibrium (LD) between a potential regulatory site and a SNP located in this region was examined by analyzing simulated and empirical AI datasets. In simulation experiments, the genotype-based test outperforms the haplotype-based tests as the distance separating the regulatory region from its regulated transcript increases. The genotype-based test performed equally well with the experimental AI datasets, either from genome-wide cDNA hybridization arrays or from RNA sequencing. By avoiding the need for haplotype inference, the genotype-based test will suit AI analyses in population samples of unknown haplotype structure and will additionally facilitate the identification of cis-regulatory variants that are located far away from the regulated transcript.
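The 2×3 partition described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistic is computed on the table, so the Pearson chi-square used here is an assumption, and the function names `contingency_2x3` and `chi_square_2x3` are hypothetical. Rows are AI and non-AI individuals; columns are the three unphased genotypes at a tested SNP.

```python
import math
from collections import Counter

# Column order of the table: the three unphased genotypes at the tested SNP.
GENOTYPES = ("AA", "Aa", "aa")


def contingency_2x3(snp_genotypes, has_ai):
    """Count SNP genotypes separately for AI and non-AI individuals.

    snp_genotypes: one of "AA", "Aa", "aa" per individual.
    has_ai: True where the individual shows allelic imbalance.
    """
    ai = Counter(g for g, flag in zip(snp_genotypes, has_ai) if flag)
    non_ai = Counter(g for g, flag in zip(snp_genotypes, has_ai) if not flag)
    return [[ai[g] for g in GENOTYPES], [non_ai[g] for g in GENOTYPES]]


def chi_square_2x3(table):
    """Pearson chi-square on a 2x3 table (assumed statistic, not from the source).

    Returns (statistic, p_value). The p-value uses 2 degrees of freedom, for
    which the chi-square survival function is exactly exp(-x / 2). It is only
    meaningful when the table is non-empty and all three genotype columns are
    populated.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat, math.exp(-stat / 2)


# Made-up toy data for illustration only, not taken from the study.
genotypes = ["AA", "Aa", "aa", "Aa"]
ai_flags = [False, True, True, False]
table = contingency_2x3(genotypes, ai_flags)
stat, p_value = chi_square_2x3(table)
```

Because only unphased genotype counts enter the table, no haplotype inference is needed at any step. With small or sparse counts an exact test would likely be a safer choice than the chi-square approximation.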

Figure 3 (pone-0038667-g003): Sets of possible genotypes under complete and incomplete linkage disequilibrium. Under complete LD for genealogical positions below (A), parallel (B) and above (C), there are always two genotypes characterizing AI-individuals and only one type of A-site homozygote present (AA or aa). Under equilibrium or incomplete linkage disequilibrium (D) all four haplotypes involving R and A sites are present and thus potentially all ten resulting genotypes as well.

When site R and any of the tested SNPs (A sites) are unlinked, their respective alleles will segregate randomly. In contrast, SNPs located in the vicinity of the R site, in the absence of recombination, i.e. at complete LD between these two sites, co-segregate in a characteristic fashion. With two bi-allelic sites, there are four possible mutation histories, each one leading to a characteristic haplotype trio, i.e. to a combination of three possible haplotypes depending on the tree genealogy (Figure 2). The sites are said to be in “parallel” position when the a and r mutations originate on different branches; both derived alleles, a and r, will then occur on different haplotypes. Alternatively, the A site mutation and the R site mutation occur sequentially on the same branch of the genealogy, with the A site mutating first (thus referred to as “above”) or second (“below”). Mutation histories are mutually exclusive, yet histories 2 and 3, when the sites are in parallel position, are indistinguishable at the level of haplotype trios (Figure 2). From each haplotype trio, a set of six different diploid genotypes involving the two bi-allelic sites, A and R, can potentially arise (Figure 3). In each set we find two genotypes representing Rr individuals that express the AI phenotype. Importantly, in each of these sets the distribution of the A site genotypes differs between AI-expressing individuals (Rr) and non-AI individuals (RR and rr).
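The genotype counts in this paragraph can be checked with a short enumeration. This is a sketch under stated assumptions: each haplotype is encoded as a two-character string (A-site allele, then R-site allele), lowercase marks the derived allele as in the text, and the "above" trio is taken to be AR, aR, ar (A mutates first, then R on the same branch). The function names are hypothetical.

```python
from itertools import combinations_with_replacement


def diploid_genotypes(haplotypes):
    """All unordered pairs of haplotypes, including homozygous pairs."""
    return list(combinations_with_replacement(haplotypes, 2))


def a_genotypes_by_ai(haplotypes):
    """Split the A-site genotypes by R-site status.

    Individuals heterozygous at R (Rr) are treated as AI; RR and rr as non-AI.
    Returns (ai_genotypes, non_ai_genotypes) as sets of "AA", "Aa", "aa".
    """
    ai, non_ai = set(), set()
    for h1, h2 in diploid_genotypes(haplotypes):
        a_genotype = "".join(sorted(h1[0] + h2[0]))  # uppercase sorts first: "Aa"
        r_heterozygous = h1[1] != h2[1]
        (ai if r_heterozygous else non_ai).add(a_genotype)
    return ai, non_ai


# Assumed encoding of the "above" trio under complete LD.
above_trio = ["AR", "aR", "ar"]
n_trio = len(diploid_genotypes(above_trio))                    # 6 genotypes
n_full = len(diploid_genotypes(["AR", "aR", "Ar", "ar"]))      # 10 genotypes
ai_set, non_ai_set = a_genotypes_by_ai(above_trio)
```

For the assumed "above" trio, AI individuals carry only Aa or aa at the A site, while non-AI individuals can carry any of the three genotypes. This difference in A-site genotype distribution is what a genotype-based comparison of AI and non-AI individuals can exploit.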

