A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals.

Chen J, Rozowsky J, Galeev TR, Harmanci A, Kitchen R, Bedford J, Abyzov A, Kong Y, Regan L, Gerstein M - Nat Commun (2016)

Bottom Line: Here, we provide insights into the functional effect of these variants using allele-specific behaviour. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).


Affiliation: Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA.

ABSTRACT
Large-scale sequencing in the 1000 Genomes Project has revealed multitudes of single nucleotide variants (SNVs). Here, we provide insights into the functional effect of these variants using allele-specific behaviour. This can be assessed for an individual by mapping ChIP-seq and RNA-seq reads to a personal genome, and then measuring 'allelic imbalances' between the numbers of reads mapped to the paternal and maternal chromosomes. We annotate variants associated with allele-specific binding and expression in 382 individuals by uniformly processing 1,263 functional genomics data sets, developing approaches to reduce the heterogeneity between data sets due to overdispersion and mapping bias. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. We also found SNVs for which we can anticipate allelic imbalance from the disruption of a binding motif. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).
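As a rough illustration of what measuring an 'allelic imbalance' can look like, the sketch below tests whether the reads at one heterozygous SNV split evenly between the paternal and maternal alleles. It is a minimal sketch under stated assumptions: the abstract does not name the statistical test, so a plain exact binomial test against a 0.5 ratio is used here as a common baseline. It does not model the overdispersion or mapping bias that the authors say they correct for, and the function names and read counts are hypothetical.

```python
from math import exp, lgamma, log


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials, computed in log space."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def allelic_imbalance_p(paternal: int, maternal: int, p: float = 0.5) -> float:
    """Two-sided exact binomial p-value for the paternal read count.

    Assumption: reads are independent draws, so there is no overdispersion.
    The p-value sums the probabilities of every outcome that is no more
    likely than the observed one under the balanced (p = 0.5) expectation.
    """
    n = paternal + maternal
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    observed = binomial_pmf(paternal, n, p)
    cutoff = observed * (1 + 1e-7)  # tolerance for floating-point ties
    total = sum(
        q for q in (binomial_pmf(i, n, p) for i in range(n + 1)) if q <= cutoff
    )
    return min(1.0, total)


# Hypothetical read counts at one heterozygous SNV (not data from the paper).
p_value = allelic_imbalance_p(paternal=30, maternal=10)
print(f"allelic ratio = {30 / 40:.2f}, p = {p_value:.4g}")
```

A beta-binomial null is a common way to account for overdispersion in such counts; it would replace `binomial_pmf` here, at the cost of an extra dispersion parameter that has to be estimated per data set.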


Figure 6: A considerable fraction of allele-specific variants are rare but do not form the majority. A lower proportion of allele-specific SNVs than non-allele-specific SNVs are rare, suggesting weaker selective constraints on allele-specific SNVs. The MAF spectra of ASB SNVs (green filled circles), control non-ASB SNVs (green open circles), ASE SNVs (blue filled circles) and control non-ASE SNVs (blue open circles) are plotted at a bin size of 100. The peaks are in the bin for MAF≤0.5% (corresponding to 0.005 in Figure 6). The inset zooms in on the histogram at MAF≤2.5%. The proportion of rare variants, in descending order, is ASE− > ASE+ > ASB+ > ASB− (+, allele-specific SNVs; −, control non-allele-specific SNVs). Comparing ASE+ with ASE− gives an odds ratio of 0.3 (Bonferroni-corrected hypergeometric P<2.2e−16), while comparing ASB+ with ASB− gives an odds ratio of 1.3 (P=0.2). Rare variants are therefore significantly depleted among ASE SNVs, and non-significantly enriched among ASB SNVs, relative to the respective non-allele-specific control SNVs. The significant depletion in ASE suggests that ASE SNVs are under weaker purifying selection.

Mentions: To examine selective constraints in allele-specific SNVs, we then consider the enrichment of rare variants with MAF≤0.5% (refs 3, 28). Figure 6 shows a shift of the allele frequency spectrum towards very low allele frequencies in all allele-specific and non-allele-specific SNVs, peaking at MAF≤0.5%. We limit our analyses for ASE SNVs to only those found in CDS regions and ASB SNVs to only those found within known TF motifs (among the 708 non-coding categories in Supplementary Data 6). In general, ASE SNVs are shown to have a greater enrichment of rare variants than ASB SNVs. This is probably due to the background of ASE SNVs being in genes versus ASB SNVs mostly in non-coding regions of the genome. Our results in Fig. 6 show a statistically significant lower enrichment of rare variants in ASE SNVs as compared with non-ASE SNVs (Fisher's exact test odds ratio=0.3, P<2.2e−16), but statistically insignificant higher enrichment of rare variants in ASB SNVs than non-ASB SNVs (Fisher's exact test odds ratio=1.3, P=0.2). This observation suggests that ASE variants may be under weaker selection than non-ASE variants.
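The odds ratios and P values quoted above come from comparing rare against common variant counts in two SNV sets. The sketch below shows one way such a comparison can be set up with the standard library only: it counts rare variants at the MAF≤0.5% threshold, forms the 2x2 table and computes the sample odds ratio and a two-sided Fisher's exact P value from the hypergeometric distribution. The function names, table orientation and all counts are hypothetical illustrations, not the authors' code or data.

```python
from math import comb


def rare_common_counts(mafs, threshold=0.005):
    """Split a list of minor allele frequencies into (rare, common) counts."""
    rare = sum(1 for maf in mafs if maf <= threshold)
    return rare, len(mafs) - rare


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c) for the table [[a, b], [c, d]].

    Assumed layout: rows are allele-specific and control SNVs, columns are
    rare and common, so a ratio below 1 means rare variants are depleted
    in the allele-specific set.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    cutoff = prob(a) * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= cutoff))


# Made-up counts for illustration only: (rare, common) per SNV set.
ase_rare, ase_common = 40, 160
ctl_rare, ctl_common = 300, 400
print("odds ratio:", odds_ratio(ase_rare, ase_common, ctl_rare, ctl_common))
print("P value:", fisher_exact_two_sided(ase_rare, ase_common, ctl_rare, ctl_common))
```

Exact integer arithmetic through `math.comb` avoids overflow for large tables; for genome-scale counts a library routine such as SciPy's `fisher_exact` would normally be the faster choice.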

