Biochemical functional predictions for protein structures of unknown or uncertain function.

Mills CL, Beuning PJ, Ondrechen MJ - Comput Struct Biotechnol J (2015)

Bottom Line: In addition, the combination of different types of methods can help obtain more information and better predictions of function for proteins of unknown function. Global initiatives, including the Enzyme Function Initiative (EFI), COMputational BRidges to EXperiments (COMBREX), and the Critical Assessment of Function Annotation (CAFA), are evaluating and testing the different approaches to predicting the function of proteins of unknown function. These initiatives and global collaborations will increase the capability and reliability of methods to predict biochemical function computationally and will add substantial value to the current volume of structural genomics data by reducing the number of absent or inaccurate functional annotations.


Affiliation: Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, United States.

ABSTRACT
With the exponential growth in the determination of protein sequences and structures via genome sequencing and structural genomics efforts, there is a growing need for reliable computational methods to determine the biochemical function of these proteins. This paper reviews the efforts to address the challenge of annotating the function at the molecular level of uncharacterized proteins. While sequence- and three-dimensional-structure-based methods for protein function prediction have been reviewed previously, the recent trends in local structure-based methods have received less attention. These local structure-based methods are the primary focus of this review. Computational methods have been developed to predict the residues important for catalysis and the local spatial arrangements of these residues can be used to identify protein function. In addition, the combination of different types of methods can help obtain more information and better predictions of function for proteins of unknown function. Global initiatives, including the Enzyme Function Initiative (EFI), COMputational BRidges to EXperiments (COMBREX), and the Critical Assessment of Function Annotation (CAFA), are evaluating and testing the different approaches to predicting the function of proteins of unknown function. These initiatives and global collaborations will increase the capability and reliability of methods to predict biochemical function computationally and will add substantial value to the current volume of structural genomics data by reducing the number of absent or inaccurate functional annotations.



Schematic diagram outlining the different methods utilized in ProFunc. HMM: Hidden Markov Model; SSM: Secondary Structure Matching; HTH: Helix–Turn–Helix.
© Copyright Policy - CC BY


ProFunc [82] is a metaserver that combines sequence-based, global structure-based, and local structure-based methods to obtain a set of function predictions from which one might seek consensus. First, the protein of unknown function is analyzed by numerous sequence searches, shown on the left-hand side in Fig. 3. BLAST [20,21] analysis scans both the PDB and UniProt and uses multiple sequence alignment to determine sequence similarities and detect sequence motifs [83]. Gene neighbors are also examined based on the query protein's predicted location within the genome, because genes located near each other are often functionally related [82].

Next, structure-based analyses are performed on the protein of interest. This involves searching a number of databases for global folds or local structures that are similar to the query protein. Surfnet, mentioned in the above section, is one of these databases. Another, secondary structure matching (SSM) [84], evaluates the secondary structure elements (SSEs) of the query protein and compares them to the SSEs of the protein structures within its database. The algorithm retrieves the strongest matches and superimposes each of them on the query protein to give a root mean square deviation (RMSD), a single number by which the results can be compared.

Finally, ProFunc utilizes other servers to search for 3D templates of proteins with known binding sites. These binding sites may be simple active sites, for which the residues important for catalysis are known [85], or ligand-binding sites, for which both the residues important for catalysis and the natural ligand or substrate are known. In some cases, the databases can also compare DNA-binding sites and motifs known to be associated with binding DNA.
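To make the role of the RMSD concrete, the following is a minimal sketch of how a single RMSD value could be computed and used to rank structural matches. It is not the SSM or ProFunc implementation. It assumes the two structures have already been superimposed and that their atoms (here, C-alpha atoms) are paired one-to-one; the superposition step itself is not shown. The function name `rmsd`, the hit names, and all coordinates are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(query: Sequence[Coord], match: Sequence[Coord]) -> float:
    """Root mean square deviation between paired atom coordinates.

    Assumes both structures are already superimposed and that
    query[i] corresponds to match[i].
    """
    if len(query) != len(match):
        raise ValueError("coordinate lists must pair atoms one-to-one")
    if not query:
        raise ValueError("at least one atom pair is required")
    # Sum of squared distances over all paired atoms.
    squared = sum(
        (qx - mx) ** 2 + (qy - my) ** 2 + (qz - mz) ** 2
        for (qx, qy, qz), (mx, my, mz) in zip(query, match)
    )
    return math.sqrt(squared / len(query))


# Hypothetical C-alpha coordinates in angstroms (illustrative only).
query_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
hits = {
    "hit_a": [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)],
    "hit_b": [(0.0, 0.0, 3.0), (3.8, 0.0, 3.0), (7.6, 0.0, 3.0)],
}

# Rank database hits by RMSD to the query; lower means more similar.
ranked = sorted(
    ((name, rmsd(query_ca, coords)) for name, coords in hits.items()),
    key=lambda item: item[1],
)

for name, value in ranked:
    print(f"{name}: RMSD = {value:.2f}")
```

Because every match is reduced to the same scalar, hits from structures of different size or fold can be sorted on one scale, which is the sense in which the RMSD gives a common number for comparing results.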

