Bottom Line: Our Regulatory Build has been revamped to identify regulatory regions of interest and to efficiently highlight their activity across disparate epigenetic data sets. A number of new interfaces allow users to perform large-scale comparisons of their data against our annotations. Lastly, the WiggleTools package enables users to summarize large collections of data sets and view them as single tracks in Ensembl.
Affiliation: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Mentions: The benefits of refreshing the gene set include removing gene models built from data that have since been discarded from the public databases, adding new isoforms that are supported by new data, and annotating new or improved genomic regions. In some cases, genes that were annotated as non-coding in GRCh37.p13 are now annotated as coding in the new assembly (Figure 1a). A full Ensembl Gene set was produced on GRCh38, then merged with manual annotation from HAVANA to produce the GENCODE 20 gene set (4), which was made available in Ensembl release 76 (August 2014). Regular updates of this manual annotation are planned in the coming year. For large consortia working on human data, we recommend using the GENCODE 21 gene set, made available in Ensembl release 77 (October 2014). In addition to updating the gene set, we recalculated the pairwise whole-genome alignments from human to all other species in Ensembl, as well as our cross-species genome-wide multiple sequence alignments. The Regulation team regenerated the regulatory annotation for the new assembly from ENCODE (5) and Roadmap Epigenomics (6) data, using a new Regulatory Build process, described below.