Computational learning on specificity-determining residue-nucleotide interactions.
Bottom Line: Taking into account both sides (protein and DNA), we propose and describe a computational study for learning the specificity-determining residue-nucleotide interactions of different known DNA-binding domain families. The proposed learning models are compared comprehensively to state-of-the-art models, demonstrating their competitive learning performance. In addition, we propose and describe two applications which demonstrate how the learnt models can provide meaningful insights into protein-DNA interactions across different DNA-binding families.
Affiliation: Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong email@example.com.
Mentions: For each DBD family, we wrote network scripts that send the testing DBD sequences to the BindN, BindN+ and DISIS web-servers and obtain their predictions using the suggested default settings. Briefly, BindN is a support vector machine classifier using physicochemical sequence features (6). BindN+ is an extension of BindN which also takes into account evolutionary information (29). DISIS is also a support vector machine classifier, which considers evolutionary information, predicted secondary structure information and neighboring residue information (28). The Receiver Operating Characteristic (ROC) and precision-recall (PRC) curves for all DBD families are plotted in Figure 2 and Supplementary Figure S4. It can be observed that our proposed method using protein-only features (ours) and the variant using both protein and DNA features (ours-both) have a competitive edge over the other sequence-based methods at low false positive rates.
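To make the ROC and PRC comparison concrete, the following is a minimal sketch of how such curves, and the area under the ROC curve restricted to low false positive rates, could be computed from per-residue predictions. It is not the authors' evaluation code. The function names (`roc_points`, `pr_points`, `partial_auc`), the label convention (1 for a DNA-binding residue, 0 otherwise) and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _sweep(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[int, int]]:
    """Return cumulative (TP, FP) counts as the threshold is lowered.

    One entry is emitted per distinct score, so tied scores are handled
    as a single threshold step.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    counts = []
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            counts.append((tp, fp))
    return counts


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """Return (false positive rate, true positive rate) points."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    return [(0.0, 0.0)] + [(fp / neg, tp / pos) for tp, fp in _sweep(labels, scores)]


def pr_points(labels: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """Return (recall, precision) points."""
    pos = sum(labels)
    if pos == 0:
        raise ValueError("need at least one positive label")
    return [(tp / pos, tp / (tp + fp)) for tp, fp in _sweep(labels, scores)]


def partial_auc(points: Sequence[Point], max_fpr: float) -> float:
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example (made-up values): six residues, three of them binding.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
roc = roc_points(labels, scores)
print("full AUC:", partial_auc(roc, 1.0))
print("AUC up to FPR 0.1:", partial_auc(roc, 0.1))
print("PR points:", pr_points(labels, scores))
```

Restricting the area to a small `max_fpr` is one way to quantify an advantage "at low false positive rates". The cutoff of 0.1 used here is an arbitrary illustrative choice, not a value taken from the paper.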