A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals.

Chen J, Rozowsky J, Galeev TR, Harmanci A, Kitchen R, Bedford J, Abyzov A, Kong Y, Regan L, Gerstein M - Nat Commun (2016)

Bottom Line: Here, we provide insights into the functional effect of these variants using allele-specific behaviour. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).


Affiliation: Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA.

ABSTRACT
Large-scale sequencing in the 1000 Genomes Project has revealed multitudes of single nucleotide variants (SNVs). Here, we provide insights into the functional effect of these variants using allele-specific behaviour. This can be assessed for an individual by mapping ChIP-seq and RNA-seq reads to a personal genome, and then measuring 'allelic imbalances' between the numbers of reads mapped to the paternal and maternal chromosomes. We annotate variants associated with allele-specific binding and expression in 382 individuals by uniformly processing 1,263 functional genomics data sets, developing approaches to reduce the heterogeneity between data sets due to overdispersion and mapping bias. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. We also found SNVs for which we can anticipate allelic imbalance from the disruption of a binding motif. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).
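As a rough illustration of what measuring an 'allelic imbalance' can look like, the sketch below tests whether the reads at one heterozygous SNV split evenly between the paternal and maternal alleles. It uses a plain exact binomial test, which is a simplified stand-in and not the authors' pipeline: it ignores the overdispersion and mapping bias that the abstract says must be corrected for. The function names and the example read counts are hypothetical.

```python
from math import comb


def allelic_ratio(paternal_reads, maternal_reads):
    """Fraction of reads at a heterozygous SNV that carry the paternal allele."""
    return paternal_reads / (paternal_reads + maternal_reads)


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null (p = 0.5 means balanced alleles).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so ties are not lost to rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: 30 paternal reads and 10 maternal reads at one SNV.
paternal, maternal = 30, 10
ratio = allelic_ratio(paternal, maternal)
p_value = binomial_two_sided_p(paternal, paternal + maternal)
print(f"allelic ratio = {ratio:.2f}, binomial p-value = {p_value:.4g}")
```

A small p-value here only suggests imbalance under the binomial assumption; with overdispersed read counts, a plain binomial null tends to overstate significance, which is why a correction of the kind the abstract describes is needed in practice.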


Figure 3: Part of the ZNF331 gene on chromosome 19, position 54,041,442-54,081,633 (hg19). (a) ASB and ASE SNVs in the allele-specific gene ZNF331. From AlleleDB, we can observe the ASB SNVs (filled red bars with the name of the TF above the bars) and ASE SNVs (filled black bars) found in each individual (row) at each genomic position (column) along the ZNF331 gene. Many of these SNVs are sparsely distributed within any single individual. By collapsing or combining information from multiple individuals, we can identify genomic regions or elements that are enriched for allele-specific activity. Unfilled black and red bars denote control SNVs, which are heterozygous SNVs that have enough reads to be tested but are non-allele-specific. (b) Two approaches to enrichment analysis are performed for each genomic element. (1) The 'expanded' enrichment is performed in a population-aware fashion, in which each occurrence of an allele-specific or control non-allele-specific SNV in each individual is counted. (2) The 'collapsed' enrichment conflates all occurrences over multiple individuals into a single unique SNV position as long as an allele-specific or accessible non-allele-specific SNV occurs in at least one individual.
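The difference between the two counting schemes in panel (b) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the `Obs` record layout, the individual labels, and the positions are hypothetical, and the rule that a position counts as allele-specific in the collapsed scheme if it is allele-specific in at least one individual is one possible reading of the caption.

```python
from collections import namedtuple

# Hypothetical record: one tested heterozygous SNV in one individual.
Obs = namedtuple("Obs", "individual chrom pos allele_specific")


def expanded_counts(observations):
    """Count every (individual, SNV) occurrence separately."""
    n_as = sum(1 for o in observations if o.allele_specific)
    return n_as, len(observations) - n_as


def collapsed_counts(observations):
    """Count each unique SNV position once across all individuals.

    Assumption: a position is allele-specific if it is allele-specific
    in at least one individual; otherwise it is a control position.
    """
    as_positions = {(o.chrom, o.pos) for o in observations if o.allele_specific}
    all_positions = {(o.chrom, o.pos) for o in observations}
    return len(as_positions), len(all_positions - as_positions)


# Hypothetical observations at two positions across three individuals.
observations = [
    Obs("ind1", "chr19", 54050000, True),
    Obs("ind2", "chr19", 54050000, True),
    Obs("ind3", "chr19", 54050000, False),
    Obs("ind1", "chr19", 54060000, False),
    Obs("ind2", "chr19", 54060000, False),
]

print("expanded  (AS, control):", expanded_counts(observations))
print("collapsed (AS, control):", collapsed_counts(observations))
```

With these inputs the expanded scheme counts 2 allele-specific and 3 control occurrences, while the collapsed scheme counts 1 allele-specific and 1 control position, which shows how a SNV shared by many individuals carries more weight in the expanded analysis.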

Mentions: We built a database, AlleleDB (http://alleledb.gersteinlab.org/), to house the annotations together with the allele-specific and accessible SNVs. AlleleDB can be downloaded as flat files, or queried by gene or genomic location and visualized directly as a track in the UCSC Genome browser (ref. 18). This enables cross-referencing of allele-specific variants with other track-based data sets and analyses, and makes the resource amenable to all functionalities of the UCSC Genome browser. Heterozygous SNVs found in the queried genomic region are colour-coded in the displayed track; Fig. 3a shows a schematic example of such a visualization, which allows ASB and ASE to be viewed together conveniently. Because the resource is built from the individuals and variants of the 1000 Genomes Project, AlleleDB also serves as an allele-specific annotation of the 1000 Genomes Project variant catalogue (Supplementary Data 1 and 2).
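A query by genomic location amounts to keeping the SNVs that fall inside a region. The sketch below parses a browser-style position string and filters a list of SNV positions against it. It does not use or assume the actual AlleleDB flat-file layout: the `(chrom, pos)` tuples are hypothetical, and treating the coordinates as 1-based and inclusive is an assumption. The region string is the ZNF331 span given in the Fig. 3 caption.

```python
def parse_region(text):
    """Parse a position string such as 'chr19:54,041,442-54,081,633'."""
    chrom, span = text.split(":")
    start, end = (int(part.replace(",", "")) for part in span.split("-"))
    return chrom, start, end


def in_region(chrom, pos, region):
    """True if (chrom, pos) lies inside the region (inclusive bounds assumed)."""
    region_chrom, start, end = region
    return chrom == region_chrom and start <= pos <= end


# ZNF331 span from the figure caption (hg19).
znf331 = parse_region("chr19:54,041,442-54,081,633")

# Hypothetical SNV positions, for illustration only.
snvs = [("chr19", 54050000), ("chr19", 54090000), ("chr1", 54050000)]
hits = [snv for snv in snvs if in_region(snv[0], snv[1], znf331)]
print(hits)
```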

