A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals.

Chen J, Rozowsky J, Galeev TR, Harmanci A, Kitchen R, Bedford J, Abyzov A, Kong Y, Regan L, Gerstein M - Nat Commun (2016)

Bottom Line: Here, we provide insights into the functional effect of these variants using allele-specific behaviour. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).


Affiliation: Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA.

ABSTRACT
Large-scale sequencing in the 1000 Genomes Project has revealed multitudes of single nucleotide variants (SNVs). Here, we provide insights into the functional effect of these variants using allele-specific behaviour. This can be assessed for an individual by mapping ChIP-seq and RNA-seq reads to a personal genome, and then measuring 'allelic imbalances' between the numbers of reads mapped to the paternal and maternal chromosomes. We annotate variants associated with allele-specific binding and expression in 382 individuals by uniformly processing 1,263 functional genomics data sets, developing approaches to reduce the heterogeneity between data sets due to overdispersion and mapping bias. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. We also found SNVs for which we can anticipate allelic imbalance from the disruption of a binding motif. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).
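The 'allelic imbalance' described above can be illustrated with a minimal sketch. It assumes a plain two-sided exact binomial test against a 50:50 null at a single heterozygous SNV; this is a simplification chosen for illustration and is not the authors' pipeline, which additionally corrects for overdispersion and mapping bias. The function names and the read counts in the usage note are hypothetical.

```python
from math import comb


def allelic_ratio(paternal_reads: int, maternal_reads: int) -> float:
    """Fraction of reads at a heterozygous SNV that map to the paternal allele."""
    total = paternal_reads + maternal_reads
    if total == 0:
        raise ValueError("no reads cover this SNV")
    return paternal_reads / total


def binomial_imbalance_p(paternal_reads: int, maternal_reads: int) -> float:
    """Two-sided exact binomial p-value under a 50:50 null.

    Assumption: reads are independent draws with equal probability for each
    allele. Overdispersion, which the abstract says must be handled, is
    ignored here.
    """
    n = paternal_reads + maternal_reads
    if n == 0:
        return 1.0
    observed = comb(n, paternal_reads)
    # Under p = 0.5 every outcome k has probability comb(n, k) / 2**n, so
    # "at least as extreme" means comb(n, k) <= comb(n, observed k).
    tail = sum(c for c in (comb(n, k) for k in range(n + 1)) if c <= observed)
    return tail / 2 ** n
```

For example, `binomial_imbalance_p(10, 0)` counts only the two most extreme outcomes out of 1,024, whereas a balanced site such as `binomial_imbalance_p(5, 5)` gives a p-value of 1.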


Figure 5: Population-aware 'expanded' enrichment analysis shows that some genomic regions are more inclined to allele-specific regulation. The 'expanded' analysis is performed in a population-aware manner, where each control or allele-specific SNV is counted once for each occurrence in an individual. We map variants associated with ASB (green) and ASE (blue) to various categories of genomic annotations, such as CDSs, UTRs, enhancer and promoter regions, to survey the human genome for regions more enriched in allelic behaviour. Using the control non-allele-specific SNVs as the expectation, we compute the log odds ratio for ASB and ASE SNVs separately, via Fisher's exact tests. Bonferroni-corrected: *P<0.05; **P<0.01; ***P<0.001. For each TF in AlleleDB, we also calculate the log odds ratio of ASB SNVs in promoters, providing a proxy of allele-specific regulatory role for each available TF. Genes known to be mono-allelically expressed, such as imprinted and MHC genes (CDS regions), are highly enriched for both ASB and ASE SNVs. The actual log odds ratio of ASB SNVs in imprinted genes, both ASB and ASE SNVs in immunoglobulin genes and ASE SNVs for MHC is indicated on the bar.
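The enrichment statistic in the caption can be sketched as follows, under stated assumptions. Each annotation category gives a 2x2 table: allele-specific versus control SNVs (rows) by inside versus outside the category (columns). The two-sided Fisher's exact test below sums the hypergeometric probabilities of all tables no more likely than the observed one. The natural-log base, the 0.5 pseudo-count and the function names are illustrative choices, not details given in the source.

```python
from math import comb, log


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    a: allele-specific SNVs inside the annotation, b: outside;
    c: control SNVs inside the annotation,         d: outside.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def weight(x: int) -> int:
        # Unnormalised hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    tail = sum(w for w in (weight(x) for x in range(lo, hi + 1)) if w <= observed)
    return tail / denom


def log_odds_ratio(a: int, b: int, c: int, d: int, pseudo: float = 0.5) -> float:
    """Natural-log odds ratio; the pseudo-count (an assumption) avoids division by zero."""
    return log(((a + pseudo) * (d + pseudo)) / ((b + pseudo) * (c + pseudo)))


def bonferroni(p_values: list[float]) -> list[float]:
    """Bonferroni correction: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

A positive log odds ratio would indicate that allele-specific SNVs fall inside the category more often than the control SNVs do; a negative value would indicate depletion.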


Mentions: In addition, we extend the enrichment analyses to gene elements, such as introns and promoter regions. Figure 5 and Supplementary Fig. 1 show the enrichment of allele-specific SNVs in elements closely related to a gene model, namely enhancers, promoters, CDSs, introns and untranslated regions (UTRs). For SNVs associated with ASB, we observed an enrichment in the 5′-UTRs. This is in line with an enrichment of ASB SNVs in promoters, suggesting functional roles of these variants in regulating gene expression. In promoter regions, we see variable enrichments of ASB SNVs in the peaks of particular TFs, such as POL2, SA1 (cohesin subunit) and CTCF, but depletion in others, such as PU.1 (Fig. 5, Supplementary Data 4). These differences might imply that some TFs are more likely to participate in allele-specific regulation than others. Between the two enrichment analyses, we observe more consistent trends in the odds ratios of ASB SNVs than ASE SNVs. The differences are most likely attributable to common SNVs that also behave consistently (either being allele-specific or non-allele-specific) over multiple individuals.
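The difference between the two enrichment analyses comes down to how SNVs are tallied. A minimal sketch, assuming per-individual calls arrive as `(individual, snv, category)` tuples: the 'expanded' mode counts an SNV once for each individual in which it occurs, while the alternative mode (labelled "unique" here purely for illustration) counts each SNV once overall. The function name and the identifiers in the usage note are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Call = Tuple[str, str, str]  # (individual_id, snv_id, annotation_category)


def count_snvs(calls: Iterable[Call], expanded: bool) -> Counter:
    """Tally SNVs per annotation category.

    expanded=True : one count per (individual, SNV) occurrence.
    expanded=False: one count per distinct SNV, however many individuals carry it.
    """
    counts: Counter = Counter()
    seen = set()
    for individual, snv, category in calls:
        key = (individual, snv, category) if expanded else (snv, category)
        if key in seen:
            continue  # skip repeated records of the same occurrence
        seen.add(key)
        counts[category] += 1
    return counts
```

With calls such as `[("ind1", "snvA", "promoter"), ("ind2", "snvA", "promoter")]`, the expanded tally for promoters is 2 and the unique tally is 1, which is why common SNVs carry more weight in the expanded analysis.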

