diArk--the database for eukaryotic genome and transcriptome assemblies in 2014.
Bottom Line: Eukaryotic genomes are the basis for understanding the complexity of life from populations to the molecular level. Recent technological innovations have revolutionized the speed of data generation, enabling the sequencing of eukaryotic genomes and transcriptomes within days. In this new version of the database we have also integrated species for which transcriptome assemblies are available, and we provide more analyses of assemblies.
Affiliation: Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, 37085, Germany firstname.lastname@example.org.
Mentions: The most noticeable innovation from v.2 to v.3 is diArk's integration of RNA-seq data. The first nonhuman transcriptome assemblies were submitted to and released by NCBI in late 2012. Since then, not only has the diversity of sequenced species increased rapidly (Figure 3), but so has the number of species with transcriptome assemblies generated for different developmental stages and/or organs. Given the low cost of transcriptome sequencing compared with genome sequencing, the number of species with available transcriptome assemblies will surpass the number of species with sequenced genomes in the near future. Several large-scale projects have already been announced and are expected to release their data this or next year, such as The 1000 plants (oneKP or 1KP) initiative (https://sites.google.com/a/ualberta.ca/onekp/), the Marine Microbial Eukaryote Transcriptome Sequencing project (24) and the Fish-T1K project (http://www.fisht1k.org/). Interestingly, there is not much overlap between species with transcriptome assemblies and species with genome assemblies (Figure 3). One reason is that RNA-seq data are still rarely generated for species for which genome assemblies have been produced, and when they are generated, they have been used to assist in genome annotation or to generate expression profiles rather than to produce independent transcriptome assemblies. In addition, many scientific questions can be answered adequately, and more quickly, with transcriptome data.