Efficient conformational space exploration in ab initio protein folding simulation.

Ullah A, Ahmed N, Pappu SD, Shatabda S, Ullah AZ, Rahman MS - R Soc Open Sci (2015)

Bottom Line: However, these energy functions are not very informative for search algorithms and fail to distinguish the types of amino acid interactions that contribute largely to the energy function from those that do not. As a result, search algorithms frequently get trapped in local minima. The number of objective function evaluations in a single run of the algorithm is used as a comparison metric to demonstrate efficiency.


Affiliation: AℓEDA Group, Department of CSE, BUET, ECE Building, Dhaka 1205, Bangladesh; Department of CSE, Independent University, Bangladesh, Dhaka 1229, Bangladesh.

ABSTRACT
Ab initio protein folding simulation largely depends on knowledge-based energy functions that are derived from known protein structures using statistical methods. These knowledge-based energy functions provide us with a good approximation of real protein energetics. However, these energy functions are not very informative for search algorithms and fail to distinguish the types of amino acid interactions that contribute largely to the energy function from those that do not. As a result, search algorithms frequently get trapped in local minima. On the other hand, the hydrophobic-polar (HP) model considers hydrophobic interactions only. The simplified nature of the HP energy function limits it to a low-resolution model. In this paper, we present a strategy to derive a non-uniformly scaled version of the real 20×20 pairwise energy function. The non-uniform scaling helps tackle the difficulty faced by a real energy function, whereas the integration of 20×20 pairwise information overcomes the limitations faced by the HP energy function. Here, we have applied the derived energy function with a genetic algorithm on discrete lattices. On a standard set of benchmark protein sequences, our approach significantly outperforms the state-of-the-art methods for similar models. Our approach has been able to explore regions of the conformational space which all the previous methods have failed to explore. The effectiveness of the derived energy function is presented by showing qualitative differences and similarities of the sampled structures to the native structures. The number of objective function evaluations in a single run of the algorithm is used as a comparison metric to demonstrate efficiency.
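
To make the contrast between the HP model and a pairwise energy function concrete, the following is a minimal sketch of a contact-based lattice energy. It is an illustration under stated assumptions, not the authors' implementation: the simple cubic lattice, the function names `contact_energy` and `hp_pair_energy`, and the convention of -1 per hydrophobic contact are all assumptions introduced here. A full 20×20 pairwise function would be supplied as a different `pair_energy` callable; no values from the paper are reproduced.

```python
from itertools import combinations


def contact_energy(sequence, coords, pair_energy):
    """Sum pair_energy(a, b) over residues that touch on the lattice.

    Assumption: a simple cubic lattice, where two residues are in contact
    when their integer coordinates differ by exactly one unit step.
    Residues adjacent in the chain are skipped, since they are always
    neighbours and carry no information about the fold.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    total = 0.0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2:
            continue  # chain neighbours do not count as contacts
        step = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
        if step == 1:
            total += pair_energy(sequence[i], sequence[j])
    return total


def hp_pair_energy(a, b):
    """HP-model rule (common convention): only H-H contacts contribute."""
    return -1.0 if a == "H" and b == "H" else 0.0


# Hypothetical 4-residue chain folded around a unit square.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
energy = contact_energy("HPPH", square, hp_pair_energy)
```

In this sketch, swapping `hp_pair_energy` for a lookup into a 20×20 table changes the energy model without touching the contact search, which is the sense in which the pairwise function carries more information than the HP rule.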




RSOS150238F3: Differences of best energy obtained by using derived energy function and mixed model [12].

Mentions: Furthermore, we also report the improvement of our approach over the algorithm of Rashid et al. [12] based on the average and best energy levels recorded after 30 min, 1 h and 2 h in figure 2 and in figure 3, respectively. In these figures we plot the improvement achieved by our algorithms for different run times for each of the proteins. We clearly see that even our 30 min energy values are lower than the values achieved by Rashid et al. [12]. We also show our energy levels at the 2 h cut-off time to illustrate that our approach can achieve further improvement if the cut-off time is extended. Three distinct linear trend lines with downward slope suggest that our approach performs better as the dimensionality, or size, of the protein sequence increases. Exact energy values used to generate figures 2 and 3 are presented in the electronic supplementary material, table S2. For reference, in figures 2–7, each linear trend line is the result of a linear regression over the dataset of the corresponding time interval. Each line is based on the equation y = mx + c, where, for an ordinary least-squares fit, $m = \frac{n\sum xy - \sum x \sum y}{n\sum x^{2} - (\sum x)^{2}}$ and $c = \frac{\sum y - m\sum x}{n}$. Here n is the number of data points, the x values are the numbers of amino acids and the y values are the differences of energies or ratios of the exploration metric.
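
The trend lines described above can be sketched as an ordinary least-squares fit of y = mx + c. This is a minimal illustration that assumes the standard closed-form expressions for the slope and intercept; the function name `fit_trend_line` and the sample points are hypothetical and are not data from the paper.

```python
def fit_trend_line(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c.

    xs: sequence lengths (number of amino acids), one per protein.
    ys: energy differences (or exploration-metric ratios) for one
        time interval, in the same order as xs.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x * sum_x
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = (n * sum_xy - sum_x * sum_y) / denom
    c = (sum_y - m * sum_x) / n
    return m, c


# Made-up points lying exactly on y = 2x - 1, used only to check the fit.
slope, intercept = fit_trend_line([1, 2, 3], [1, 3, 5])
```

A negative slope from such a fit, with sequence length on the x axis and energy difference on the y axis, is what the text reads as the approach gaining ground on larger proteins.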

