GBshape: a genome browser database for DNA shape annotations.
Bottom Line: Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species.
Affiliation: Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA.
Mentions: The core of our database is a high-throughput prediction platform (Figure 1) that we developed to generate DNA shape data for storage in GBshape. Whole-genome sequence files (in FASTA format) for multiple species are processed by the high-throughput prediction programs DNAshape (14) and ORChID2 (18), which are embedded in the GBshape platform. The platform was designed to be extensible through plug-ins for other whole-genome annotation programs (Figure 1). The output of each prediction program is converted to the bigWig data format, which can be displayed in a genome browser. The platform was developed in C++ and runs on a high-performance computing cluster (HPCC).
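The flow described above (FASTA input, a pluggable per-nucleotide predictor, track output) can be illustrated with a minimal sketch. This is not the GBshape C++ implementation: the registry `PREDICTORS`, the `register` decorator, and the `gc_window` feature are hypothetical names introduced only for illustration, and `gc_window` is a toy stand-in, not DNAshape or ORChID2. The sketch writes fixedStep wiggle text, on the assumption that a separate tool (for example the UCSC wigToBigWig utility) performs the final conversion to bigWig.

```python
import io
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical plug-in registry: name -> function returning one score per base.
Predictor = Callable[[str], List[float]]
PREDICTORS: Dict[str, Predictor] = {}


def register(name: str) -> Callable[[Predictor], Predictor]:
    """Decorator that adds a predictor to the (hypothetical) registry."""
    def wrap(fn: Predictor) -> Predictor:
        PREDICTORS[name] = fn
        return fn
    return wrap


@register("gc_window")
def gc_window(seq: str, k: int = 5) -> List[float]:
    """Toy feature: GC fraction in a window of up to k bases around each
    position. A placeholder only; it is NOT DNAshape or ORChID2."""
    half = k // 2
    s = seq.upper()
    scores = []
    for i in range(len(s)):
        window = s[max(0, i - half): i + half + 1]
        scores.append(sum(c in "GC" for c in window) / len(window))
    return scores


def read_fasta(handle: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence name, sequence) pairs from a FASTA text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first token of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def to_wiggle(records: Iterable[Tuple[str, str]], predictor: Predictor) -> str:
    """Render per-base scores as fixedStep wiggle text (1-based start)."""
    out = []
    for chrom, seq in records:
        out.append(f"fixedStep chrom={chrom} start=1 step=1")
        out.extend(f"{value:.3f}" for value in predictor(seq))
    return "\n".join(out) + "\n"


# Usage with an in-memory FASTA; a real run would stream genome files.
fasta = ">chr1 test\nACGT\nGG\n>chr2\nAT\n"
records = list(read_fasta(io.StringIO(fasta)))
wig = to_wiggle(records, PREDICTORS["gc_window"])
```

Keeping predictors behind a common "sequence in, one value per base out" signature is one simple way to obtain the plug-in behaviour the text describes: a new annotation program only needs to be registered under a name, and the track-writing step stays unchanged.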