Quantifying epistatic interactions among the components constituting the protein translation system.

Matsuura T, Kazuta Y, Aita T, Adachi J, Yomo T - Mol. Syst. Biol. (2009)

Bottom Line: Analyses of the data measured using various combinations of component concentrations indicated that the contributions of larger than 2-body inter-component epistatic interactions are negligible, despite the presence of larger than 2-body physical interactions. These findings allowed the prediction of protein synthesis activity at various combinations of component concentrations from a small number of samples, the principle of which is applicable to analysis and optimization of other biological systems. Moreover, the average ratio of 2- to 1-body terms was estimated to be as small as 0.1, implying high adaptability and evolvability of the protein translation system.


Affiliation: Department of Bioinformatics Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan.

ABSTRACT
In principle, the accumulation of knowledge regarding the molecular basis of biological systems should allow the development of large-scale kinetic models of their functions. However, the development of such models requires vast numbers of parameters, which are difficult to obtain in practice. Here, we used an in vitro translation system, consisting of 69 defined components, to quantify the epistatic interactions among changes in component concentrations through Bahadur expansion, thereby obtaining a coarse-grained model of protein synthesis activity. Analyses of the data measured using various combinations of component concentrations indicated that the contributions of larger than 2-body inter-component epistatic interactions are negligible, despite the presence of larger than 2-body physical interactions. These findings allowed the prediction of protein synthesis activity at various combinations of component concentrations from a small number of samples, the principle of which is applicable to analysis and optimization of other biological systems. Moreover, the average ratio of 2- to 1-body terms was estimated to be as small as 0.1, implying high adaptability and evolvability of the protein translation system.

Figure 6: Predicting the activity from a small number of samples. (A) Correlation between the predicted and experimental data (Figure 4A), using '111111' as the reference sequence to obtain the Bahadur coefficients up to the 2nd order. $R^2 = 0.933$ was obtained. (B) The rank order plot (or cumulative frequency distribution) of the 64 $R^2$ values, obtained from the correlation between the experimental and predicted data calculated using each of the 64 reference sequences. Predicted data were calculated by 2nd (black circles) or 1st (gray circles) order truncation of equation (m4). The gray dashed line and the bold line show the average and s.d., respectively, of the $R^2$ values obtained when the 2nd order prediction strategy was applied to each of 100 data sets in which the sequence–activity relationship was shuffled randomly.

Mentions: As biological systems consist of vast numbers of components, it would be useful to be able to predict activity values under vast numbers of conditions with different combinations of component concentrations (Yin and Carter, 1996; Young et al, 1997; Arita et al, 2002; Benos et al, 2002; Chester et al, 2004; Wiedemann et al, 2004). The absence of larger than 2-body inter-component interactions means that activity values of the in vitro translation system can be predicted by estimating the Bahadur coefficients up to the 2nd order. To estimate these for a binary sequence of length $n$, the activities of at least ${}_nC_0 + {}_nC_1 + {}_nC_2 = 0.5 \times (2 + n + n^2)$ sequences are needed. Once these coefficients are obtained, it is possible to predict the results for all other possible sequences, of which there are $2^n - 0.5 \times (2 + n + n^2)$. As an example, we tested the predictability using the data in which fluorescence intensity is defined by a binary sequence of length six (Figure 4A). In this case, at least 22 experimental data points are needed to estimate the 2nd order Bahadur coefficients for prediction of the other 42 ($= 2^6 - 22$) results. A typical scheme for choosing the 22 data points (and sequences) is as follows. First, pick a reference sequence (e.g., '111111'), then all possible single-point mutants ('211111', '121111', …, '111112'), and then all double-point mutants ('221111', '212111', …, '111122'). Note that although the selection strategy often follows the theory of the design of experiments (Fisher, 1966), our simple scheme was sufficient for accurate prediction as described below. Using the 22 sequence–activity relationships, the Bahadur coefficients up to the 2nd order can be estimated using equation (m4) (Materials and methods), which then allows prediction of the remaining 42 samples. Figure 6A shows the correlation between the experimental and predicted data using '111111' as the reference sequence; the prediction showed good agreement with the experimental data.
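The counting argument and the selection scheme above can be sketched in a few lines of Python. This is only an illustration: the function names `n_training` and `training_set` are hypothetical and do not come from the paper; the sequence alphabet '1'/'2' follows the examples in the text.

```python
from itertools import combinations
from math import comb


def n_training(n):
    """Minimum number of sequences needed for a 2nd order fit: nC0 + nC1 + nC2."""
    return comb(n, 0) + comb(n, 1) + comb(n, 2)


def training_set(reference):
    """Reference sequence plus all its single- and double-point mutants.

    Sequences are strings over the two states '1' and '2', as in the text.
    """
    flip = {"1": "2", "2": "1"}
    seqs = [reference]
    for k in (1, 2):  # number of mutated positions
        for positions in combinations(range(len(reference)), k):
            s = list(reference)
            for p in positions:
                s[p] = flip[s[p]]
            seqs.append("".join(s))
    return seqs


n = 6
train = training_set("1" * n)
n_needed = n_training(n)          # equals 0.5 * (2 + n + n**2)
n_predicted = 2 ** n - n_needed   # sequences left to predict
print(n_needed, n_predicted, train[:3])
```

For n = 6 this gives 22 training sequences and 64 − 22 = 42 sequences left to predict, matching the numbers in the text.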
Figure 6B shows $R^2$ values calculated similarly using each of the 64 sequences as the reference sequence. This rank order plot shows that the $R^2$ value was >0.8 in 57 of 64 cases; thus, high $R^2$ values could be obtained with approximately 90% probability. Such high $R^2$ values were not obtained using the same prediction with 1st order truncation, indicating the necessity of 2nd order coefficients for accurate prediction. Furthermore, when the strategy of 2nd order truncation was applied to the prediction of data sets in which the sequence–activity relationship was shuffled randomly, we obtained an average $R^2$ value of 0.025, indicating the necessity of considering up to 2-body interactions for accurate prediction. The methodology presented here is effective for prediction and optimization of other biological systems, particularly if their higher order epistatic interactions are estimated to be negligible, as in the protein translation system.
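The prediction step can be illustrated with a minimal sketch on synthetic data. Several things here are assumptions and not the authors' procedure: equation (m4) is not reproduced in this excerpt, so the sketch instead fits a 2nd order model expressed relative to the reference sequence, which spans the same function space as a Bahadur expansion truncated at 2nd order. The landscape `true_activity`, its coefficients `h` and `J`, and the function names are hypothetical; the 0.1 scale on the 2-body terms is an arbitrary choice. Because the synthetic landscape contains no terms above 2nd order, the 42 held-out values should be recovered up to floating-point error.

```python
from itertools import combinations, product
import random


def mutated(seq, ref):
    """Positions (ascending) at which seq differs from the reference."""
    return tuple(i for i, (a, b) in enumerate(zip(seq, ref)) if a != b)


def fit_second_order(activity, ref):
    """Fit 0th, 1st and 2nd order terms relative to ref.

    activity maps sequence -> measured value and must contain the reference
    plus all of its single- and double-point mutants.
    """
    by_mut = {mutated(s, ref): v for s, v in activity.items()}
    n = len(ref)
    c0 = by_mut[()]
    c1 = {i: by_mut[(i,)] - c0 for i in range(n)}
    c2 = {(i, j): by_mut[(i, j)] - c0 - c1[i] - c1[j]
          for i, j in combinations(range(n), 2)}
    return c0, c1, c2


def predict(seq, ref, coeffs):
    """Sum the fitted terms over the mutated positions and pairs of seq."""
    c0, c1, c2 = coeffs
    m = mutated(seq, ref)
    return c0 + sum(c1[i] for i in m) + sum(c2[p] for p in combinations(m, 2))


# Hypothetical synthetic landscape with 1- and 2-body terms only.
rng = random.Random(0)
n, ref = 6, "111111"
h = [rng.uniform(-1, 1) for _ in range(n)]
J = {p: 0.1 * rng.uniform(-1, 1) for p in combinations(range(n), 2)}


def true_activity(seq):
    s = [1 if ch == "1" else -1 for ch in seq]
    return (5.0 + sum(h[i] * s[i] for i in range(n))
            + sum(J[i, j] * s[i] * s[j] for (i, j) in J))


all_seqs = ["".join(p) for p in product("12", repeat=n)]
train = [s for s in all_seqs if len(mutated(s, ref)) <= 2]
held_out = [s for s in all_seqs if len(mutated(s, ref)) > 2]

coeffs = fit_second_order({s: true_activity(s) for s in train}, ref)
max_err = max(abs(predict(s, ref, coeffs) - true_activity(s)) for s in held_out)
print(len(train), len(held_out), max_err)
```

On real measurements, noise and any higher order interactions would make the held-out predictions inexact, which is what the $R^2$ values in Figure 6 quantify.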

