Nonidentifiability of the source of intrinsic noise in gene expression from single-burst data.

Ingram PJ, Stumpf MP, Stark J - PLoS Comput. Biol. (2008)

Bottom Line: We then prove that, irrespective of the details of the model, the burst size distribution is always geometric and hence determined by a single parameter. Many different combinations of the biochemical rates for the constituent reactions of both transcription and translation will therefore lead to the same experimentally observed burst size distribution. We explore methods of inferring these values when additional types of experimental data are available.


Affiliation: Department of Mathematics, Imperial College London, London, United Kingdom. piers.ingram@imperial.ac.uk

ABSTRACT
Over the last few years, experimental data on the fluctuations in gene activity between individual cells and within the same cell over time have confirmed that gene expression is a "noisy" process. This variation is in part due to the small number of molecules taking part in some of the key reactions that are involved in gene expression. One of the consequences of this is that protein production often occurs in bursts, each due to a single promoter or transcription factor binding event. Recently, the distribution of the number of proteins produced in such bursts has been experimentally measured, offering a unique opportunity to study the relative importance of different sources of noise in gene expression. Here, we provide a derivation of the theoretical probability distribution of these bursts for a wide variety of different models of gene expression. We show that there is a good fit between our theoretical distribution and that obtained from two different published experimental datasets. We then prove that, irrespective of the details of the model, the burst size distribution is always geometric and hence determined by a single parameter. Many different combinations of the biochemical rates for the constituent reactions of both transcription and translation will therefore lead to the same experimentally observed burst size distribution. It is thus impossible to identify different sources of fluctuations purely from protein burst size data or to use such data to estimate all of the model parameters. We explore methods of inferring these values when additional types of experimental data are available.
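The single-parameter claim can be illustrated with a toy case. The sketch below assumes the simplest possible picture, which is only one member of the class of models the abstract refers to: one mRNA is translated at rate `k_translate` and degraded at rate `k_degrade`, so the number of proteins made before degradation is geometric with parameter q = k_translate / (k_translate + k_degrade). The function names and rate values are hypothetical and chosen for illustration only.

```python
import random


def sample_burst(k_translate, k_degrade, rng):
    """Count proteins made by one mRNA before that mRNA is degraded.

    At each step the next event is a translation with probability
    k_translate / (k_translate + k_degrade); otherwise the mRNA degrades.
    """
    p_translate = k_translate / (k_translate + k_degrade)
    proteins = 0
    while rng.random() < p_translate:
        proteins += 1
    return proteins


def geometric_pmf(n, k_translate, k_degrade):
    """P(burst size = n) in the toy model: q**n * (1 - q)."""
    q = k_translate / (k_translate + k_degrade)
    return q ** n * (1.0 - q)


def mean_burst(k_translate, k_degrade, samples=20000, seed=0):
    """Monte Carlo estimate of the mean burst size."""
    rng = random.Random(seed)
    total = sum(sample_burst(k_translate, k_degrade, rng) for _ in range(samples))
    return total / samples


if __name__ == "__main__":
    slow = (0.1, 0.02)  # hypothetical rates in s^-1
    fast = (1.0, 0.2)   # ten times faster, same ratio
    for n in range(4):
        print(n, geometric_pmf(n, *slow), geometric_pmf(n, *fast))
    print(mean_burst(*slow), mean_burst(*fast))
```

Both rate pairs give the same burst size distribution because only the ratio of the two rates enters q. In this toy case, burst size data alone therefore cannot separate the translation rate from the mRNA degradation rate.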


Figure 6 (pcbi-1000192-g006): Parameter estimation results with fixed protein degradation rate. The results of 10,000 runs of the Nelder-Mead maximisation of the log-likelihood for the parameters α0, α1, β1, and β2, with α2 determined by the relationship in Equation 6, and with the protein degradation rate β3 set to 2.77×10⁻⁴, consistent with a half-life for β-gal of 60 mins. The panels in column A show the estimates of the values of the parameters and the percentage of times the Nelder-Mead algorithm converged to those values. The panels in column B are scattergrams of the values of the parameter estimates against the value of the log-likelihood. Each simulation is run 10,000 times to simulate a population of 10,000 cells, and each simulation is run for 5000 reaction steps. The starting values for the optimisation routine are: α0 = 0.01 s⁻¹, α1 = 0.02 s⁻¹, β1 = 0.1 s⁻¹, and β2 = 0.007 s⁻¹, and are based on previous simulation studies [16].
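For first-order decay, a half-life t½ corresponds to a rate constant ln 2 / t½, whereas a mean lifetime τ corresponds to 1 / τ. The short sketch below evaluates both conversions for 60 minutes, assuming first-order protein degradation; the function names are illustrative and not taken from the article.

```python
import math


def rate_from_half_life(half_life_s):
    """First-order rate constant (s^-1) for a given half-life in seconds."""
    return math.log(2.0) / half_life_s


def rate_from_mean_lifetime(mean_lifetime_s):
    """First-order rate constant (s^-1) for a given mean lifetime in seconds."""
    return 1.0 / mean_lifetime_s


if __name__ == "__main__":
    one_hour = 60 * 60  # seconds
    print(f"ln2 / t_half = {rate_from_half_life(one_hour):.3e} s^-1")
    print(f"1 / tau      = {rate_from_mean_lifetime(one_hour):.3e} s^-1")
```

The two conversions give roughly 1.925×10⁻⁴ s⁻¹ and 2.778×10⁻⁴ s⁻¹. Up to truncation, the caption's 2.77×10⁻⁴ matches the 1 / τ conversion, while the 1.92×10⁻⁴ s⁻¹ quoted in the accompanying text matches ln 2 / t½.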

Mentions: We therefore fixed β3 to 1.92×10⁻⁴ s⁻¹, corresponding to a protein half-life of one hour, and then used the same method as described above to estimate the other parameters α0, α1, α2, β1, and β2. We ran 10,000 simulations, as a relatively low number of runs converged (23.37%), with the others becoming trapped in a region with physically unrealistic (negative) reaction rates, and a log-likelihood of L≈−2100. Of the runs which converged, 2057 (88%) converged to a local maximum at L≈−9150, while 279 (12%) converged to the presumed global maximum at L≈−1100. The results for the runs which converged can be seen in Figure 6, whilst summary statistics for the runs which converged to the presumed global maximum are presented in Table 2.
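Repeated Nelder-Mead runs can settle on different maxima because the method is a local search. The sketch below illustrates this under loud assumptions: `toy_neg_loglik` is a hypothetical two-parameter objective with one global and one local minimum (standing in for a negative log-likelihood), and `nelder_mead` is a simplified pure-Python variant written for this example. Neither is the authors' model or implementation.

```python
import random


def toy_neg_loglik(v):
    """Hypothetical objective with two minima (two likelihood maxima)."""
    x, y = v
    return (x * x - 1.0) ** 2 + 0.3 * x + y * y


def nelder_mead(f, start, step=0.1, tol=1e-12, max_iter=500):
    """Simplified Nelder-Mead minimiser (reflect, expand, contract, shrink)."""
    n = len(start)
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflected = [2.0 * c - w for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            expanded = [3.0 * c - 2.0 * w for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = [0.5 * (c + w) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [0.5 * (b + p) for b, p in zip(best, v)] for v in simplex[1:]
                ]
    return min(simplex, key=f)


def multistart(n_runs=50, seed=1):
    """Run the optimiser from random starts and tally which minimum is reached."""
    rng = random.Random(seed)
    counts = {"global": 0, "local": 0}
    for _ in range(n_runs):
        start = [rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)]
        x, _y = nelder_mead(toy_neg_loglik, start)
        # In this toy objective the global minimum lies at x < 0.
        counts["global" if x < 0.0 else "local"] += 1
    return counts


if __name__ == "__main__":
    print(multistart())
```

Tallying the end points of many runs, as `multistart` does, is the same bookkeeping that yields the percentages of runs reaching each maximum reported in the text.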

