The SUPERFAMILY 1.75 database in 2014: a doubling of data.
Bottom Line: This tree is built with genomic-scale domain annotation data as before, but is constantly updated as new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence and, in the case of whole genomes, with enrichment analysis against a taxonomically defined background.
Affiliation: Computer Science, University of Bristol, Bristol, BS8 1UB, UK. Matt.Oates@bristol.ac.uk
Mentions: In Figure 1 we present SUPERFAMILY's coverage of curated complete proteomes over the tree of all sequenced life, collapsed to the rank of Class as defined by the NCBI taxonomy. This does not include all of the species-specific annotations we serve as part of the UniProt proteomes collection, but does include collections from NCBI RefSeq (as of 13 August 2014) (12) and Ensembl (release 76) (13), as well as hundreds of complete proteomes we have acquired from various individual sources (upon publication). Of special note, SUPERFAMILY now provides assignments to the latest Human assembly, GRCh38, thanks to its recent inclusion in Ensembl. In the new Human assembly, 61% of sequences have at least one domain annotation and 44% of all amino acids are annotated with SCOP domains.
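The two coverage figures quoted for the Human assembly measure different things: the share of sequences with at least one domain, and the share of residues falling inside a domain. The sketch below shows one plausible way to compute both from a table of domain assignments. The input layout, the function names, and the toy data are assumptions made for illustration; this is not SUPERFAMILY's actual pipeline or schema.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout (an assumption, not SUPERFAMILY's schema):
# protein id -> (sequence length, [(start, end), ...]) with 1-based inclusive regions.
Region = Tuple[int, int]
Assignments = Dict[str, Tuple[int, List[Region]]]


def covered_residues(regions: List[Region]) -> int:
    """Count residues covered by at least one region, counting overlaps once."""
    total = 0
    current_end = 0  # last residue position already counted
    for start, end in sorted(regions):
        start = max(start, current_end + 1)  # skip residues already counted
        if end >= start:
            total += end - start + 1
            current_end = end
    return total


def coverage(assignments: Assignments) -> Tuple[float, float]:
    """Return (sequence coverage, residue coverage) as fractions in [0, 1].

    Assumes a non-empty proteome with at least one residue.
    """
    n_sequences = len(assignments)
    n_annotated = sum(1 for _, regions in assignments.values() if regions)
    total_residues = sum(length for length, _ in assignments.values())
    annotated_residues = sum(
        covered_residues(regions) for _, regions in assignments.values()
    )
    return n_annotated / n_sequences, annotated_residues / total_residues


# Toy proteome (made-up numbers, for illustration only).
toy: Assignments = {
    "P1": (100, [(1, 50)]),
    "P2": (200, []),                    # no domain assigned
    "P3": (100, [(10, 40), (30, 59)]),  # two overlapping assignments
}
seq_cov, res_cov = coverage(toy)
print(f"sequence coverage: {seq_cov:.0%}, residue coverage: {res_cov:.0%}")
```

Residue coverage is normally the lower of the two numbers, because an annotated sequence usually still contains linkers and unassigned regions outside its domains.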