UniPROBE, update 2015: new tools and content for the online database of protein-binding microarray data on protein-DNA interactions.
Bottom Line: The Universal PBM Resource for Oligonucleotide Binding Evaluation (UniPROBE) serves as a convenient source of information on published data generated using universal protein-binding microarray (PBM) technology, which provides in vitro data about the relative DNA-binding preferences of transcription factors for all possible sequence variants of length k ('k-mers'). The database displays important information about the proteins and presents their DNA-binding specificity data as k-mers, position weight matrices and graphical sequence logos. This update documents the growth of UniPROBE since the last update 4 years ago and introduces a variety of new features and tools, including: a new streamlined pipeline that facilitates data deposition by universal PBM data generators in the research community; a tool that generates putative nonbinding (i.e. negative control) DNA sequences for one or more proteins; and novel motifs obtained by analyzing the PBM data using the BEEML-PBM algorithm for motif inference.
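To make the phrase "all possible sequence variants of length k" concrete, the following minimal Python sketch enumerates every DNA k-mer over the alphabet A, C, G, T. It is an illustration only and not part of UniPROBE; the function name `all_kmers` is hypothetical, and the sketch does not model how PBM data are measured or scored.

```python
from itertools import product

DNA_ALPHABET = "ACGT"


def all_kmers(k):
    """Return every DNA sequence of length k, in lexicographic order.

    Illustrative helper (hypothetical name); there are 4**k such sequences.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # product(...) yields tuples of characters; join each into a string.
    return ["".join(chars) for chars in product(DNA_ALPHABET, repeat=k)]


kmers = all_kmers(3)
print(len(kmers))  # 64, i.e. 4**3
print(kmers[:4])   # ['AAA', 'AAC', 'AAG', 'AAT']
```

The number of k-mers grows as 4^k, so the list is small for short k but expands quickly as k increases.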
Affiliation: Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Bioinformatics Graduate Program, Northeastern University, Boston, MA 02115, USA.
Mentions: Figure 1A shows the main page for this pipeline, which also outlines the control flow of the deposition for users. In the first five steps, the user enters information about the proteins involved in their study into the database. The most convenient way to do this is to prepare an appropriately formatted spreadsheet file (for steps 2, 4 and 5; see Figure 1B); alternatively, users who prefer can enter the information one entry at a time using an HTML form. Currently, the user must prepare a folder containing all of the data files they wish to make public. Instructions for data file preparation are given (and are also provided in Supplementary Text 1), and several helper scripts are available for download to aid the process. The user then uploads the folder to the UniPROBE server as a zip file. The remaining steps fully integrate the data files into the web interface, including constructing sequence logos for each protein and making all the data easily searchable and available for download. The UniPROBE administrator then finalizes the deposition by verifying that the data were inserted properly and moving the new data into the public version of the web site. Data depositors may contact the UniPROBE administrator to specify a release date for prepublication data submissions.
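The step of packaging a prepared data folder as a zip file could be done with any archiving tool. The sketch below uses only the Python standard library. It is not one of the scripts distributed by UniPROBE: the function name `zip_deposition_folder` and the folder names in the usage note are hypothetical, and the required file layout inside the folder is described in the deposition instructions, not here.

```python
import zipfile
from pathlib import Path


def zip_deposition_folder(folder, archive_path):
    """Pack every file under `folder` into a zip archive.

    Hypothetical helper for illustration. The top-level folder name is
    kept inside the archive so that it unpacks into a single directory.
    """
    folder = Path(folder).resolve()
    if not folder.is_dir():
        raise NotADirectoryError(str(folder))
    archive_path = Path(archive_path).resolve()

    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            # Skip directories, and skip the archive itself if it was
            # created inside the folder being packed.
            if path.is_file() and path != archive_path:
                arcname = path.relative_to(folder.parent).as_posix()
                zf.write(path, arcname=arcname)
    return archive_path
```

For example, `zip_deposition_folder("my_pbm_study", "my_pbm_study.zip")` would produce an archive whose entries all begin with `my_pbm_study/`; both names here are placeholders.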