Hierarchical models in the brain.

Friston K - PLoS Comput. Biol. (2008)

Bottom Line: This means that a single model and optimisation scheme can be used to invert a wide range of models. We present the model and a brief review of its inversion to disclose the relationships among, apparently, diverse generative models of empirical data. We then show that this inversion can be formulated as a simple neural network and may provide a useful metaphor for inference and learning in the brain.

View Article: PubMed Central - PubMed

Affiliation: The Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom. k.friston@fil.ion.ucl.ac.uk

ABSTRACT
This paper describes a general model that subsumes many parametric models for continuous data. The model comprises hidden layers of state-space or dynamic causal models, arranged so that the output of one provides input to another. The ensuing hierarchy furnishes a model for many types of data, of arbitrary complexity. Special cases range from the general linear model for static data to generalised convolution models, with system noise, for nonlinear time-series analysis. Crucially, all of these models can be inverted using exactly the same scheme, namely, dynamic expectation maximization. This means that a single model and optimisation scheme can be used to invert a wide range of models. We present the model and a brief review of its inversion to disclose the relationships among, apparently, diverse generative models of empirical data. We then show that this inversion can be formulated as a simple neural network and may provide a useful metaphor for inference and learning in the brain.
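To make the hierarchical construction concrete, here is a minimal Python sketch, not taken from the paper: two linear state-space levels are stacked so that the output of the higher level acts as the cause driving the lower level, whose hidden states are then mixed into four observed channels. All matrices, noise levels and the function names f and g are illustrative placeholders.

```python
import numpy as np

# Minimal sketch of a two-level hierarchical dynamic model (illustrative only).
# Each level has hidden states x with dynamics f and an output g;
# the output of level 2 serves as the cause (input) of level 1.

dt, T = 1.0, 32
rng = np.random.default_rng(0)

def f(x, v, A, B):            # state equation: dx/dt = A x + B v
    return A @ x + B @ v

def g(x, C):                  # observation equation: y = C x
    return C @ x

# Level 2: one slowly decaying state driven by an exogenous cause
A2, B2, C2 = np.array([[-0.25]]), np.array([[1.0]]), np.array([[1.0]])
# Level 1: two coupled states driven by the output of level 2
A1 = np.array([[-0.5,  1.0],
               [-0.5, -0.5]])
B1 = np.array([[1.0], [0.0]])
C1 = rng.standard_normal((4, 2)) * 0.3    # four response channels

x2, x1 = np.zeros(1), np.zeros(2)
Y = np.zeros((T, 4))
for t in range(T):
    v2 = np.array([np.exp(-0.25 * (t - 12) ** 2)])    # cause at the top level
    x2 = x2 + dt * f(x2, v2, A2, B2)                   # Euler step, level 2
    v1 = g(x2, C2)                                     # level-2 output = level-1 cause
    x1 = x1 + dt * f(x1, v1, A1, B1)                   # Euler step, level 1
    Y[t] = g(x1, C1) + 0.05 * rng.standard_normal(4)   # noisy observations

print(Y.shape)    # (32, 4): four simulated data channels
```

In the sense described in the abstract, removing the hidden states and their dynamics so that each level is just a linear mapping from causes to outputs collapses this construction to a static general linear model.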


pcbi-1000211-g005: This schematic shows the linear convolution model used in the subsequent figure in terms of a directed Bayesian graph. In this model, a simple Gaussian ‘bump’ function acts as a cause to perturb two coupled hidden states. Their dynamics are then projected to four response variables, whose time-courses are cartooned on the left. This figure also summarises the architecture of the implicit inversion scheme (right), in which precision-weighted prediction errors drive the conditional modes to optimise variational action. Critically, the prediction errors propagate their effects up the hierarchy (c.f., Bayesian belief propagation or message passing), whereas the predictions are passed down the hierarchy. This sort of scheme can be implemented easily in neural networks (see last section and [5] for a neurobiological treatment). This generative model uses a single cause v(1), two dynamic states and four outputs y1,…,y4. The lines denote the dependencies of the variables on each other, summarised by the equations (in this example both the equations were simple linear mappings). This is effectively a linear convolution model, mapping one cause to four outputs, which form the inputs to the recognition model (solid arrow). The inputs to the four data or sensory channels are also shown as an image in the insert.
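The ascending-error/descending-prediction arrangement described in this caption can be caricatured, for a single static level, as a gradient descent of the conditional mode on precision-weighted prediction error. The sketch below is a deliberately simplified stand-in rather than the dynamic expectation maximization scheme of the paper; the mapping G, the precisions and the step size are arbitrary choices for illustration.

```python
import numpy as np

# Caricature of precision-weighted prediction-error minimisation for one
# static level: y = G v + noise. The estimate mu of the cause v descends
# the gradient of the precision-weighted squared prediction error plus a
# shrinkage (prior) term.
rng = np.random.default_rng(1)
G = 0.5 * rng.standard_normal((4, 1))      # generative mapping (arbitrary)
pi_y, pi_v = 4.0, 1.0                      # data and prior precisions (arbitrary)
v_true = np.array([0.8])
y = G @ v_true + 0.1 * rng.standard_normal(4)

mu = np.zeros(1)                            # conditional mode (initial guess)
for _ in range(500):
    eps_y = y - G @ mu                      # ascending (sensory) prediction error
    eps_v = mu - 0.0                        # error on the prior expectation (zero)
    # gradient step on the free-energy-like objective with respect to mu
    dmu = G.T @ (pi_y * eps_y) - pi_v * eps_v
    mu = mu + 0.02 * dmu

print(mu[0])                                # close to the shrinkage (MAP) estimate
```

The prior term acts exactly like the shrinkage priors mentioned in the excerpt below: it pulls the conditional mode towards zero in proportion to the prior precision.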

Mentions: In this model, causes or inputs perturb the hidden states, which decay exponentially to produce an output that is a linear mixture of hidden states. Our example used a single input, two hidden states and four outputs. To generate data, we used a deterministic Gaussian bump function input v(1) = exp(−¼(t−12)²) and the parameters given in Equation 50. During inversion, the cause is unknown and was subject to mildly informative (zero mean and unit precision) shrinkage priors. We also treated two of the parameters as unknown: one parameter from the observation function (the first) and one from the state equation (the second). These parameters had true values of 0.125 and −0.5, respectively, and uninformative shrinkage priors. The priors on the hyperparameters, sometimes referred to as hyperpriors, were similarly uninformative. These Gaussian hyperpriors effectively place lognormal hyperpriors on the precisions (strictly speaking, this invalidates the assumption of a linear hyperparameterisation, but the effects are numerically small), because the precisions scale as exp(λz) and exp(λw). Figure 5 shows a schematic of the generative model and the implicit recognition scheme based on prediction errors. This scheme can be regarded as a message-passing scheme, which is considered in more depth in the next section.
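For concreteness, the following sketch simulates a linear convolution model with the same shape as the example described above: one Gaussian bump cause, two coupled hidden states that decay exponentially, and four outputs formed as a noisy linear mixture of the states. The coupling and observation matrices are placeholders rather than the values of Equation 50 (only the entries 0.125 and −0.5 echo the parameters mentioned in the text), and the log-precision variables are included only to make the exp(λ) scaling of the precisions explicit.

```python
import numpy as np

# Linear convolution model of the same shape as the example:
# one cause v, two coupled hidden states x, four outputs y.
# Parameter values below are placeholders, not those of Equation 50.
T, dt = 32, 1.0
rng = np.random.default_rng(2)

A = np.array([[-0.5,  1.0],       # state coupling; -0.5 echoes the state-equation
              [-0.5, -0.5]])      # parameter mentioned in the text (illustrative)
B = np.array([[1.0], [0.0]])      # how the cause enters the states
C = np.array([[0.125, 0.16],      # observation mixture; 0.125 echoes the first
              [0.25,  0.09],      # observation-function parameter (illustrative)
              [0.12,  0.30],
              [0.05,  0.20]])

lam_z, lam_w = 4.0, 8.0            # log-precisions; precisions scale as exp(lambda)
sd_z = np.exp(-lam_z / 2)          # observation-noise s.d. = precision**(-1/2)
sd_w = np.exp(-lam_w / 2)          # state-noise s.d.

x = np.zeros(2)
Y = np.zeros((T, 4))
for t in range(T):
    v = np.exp(-0.25 * (t - 12) ** 2)                  # Gaussian bump cause
    w = sd_w * rng.standard_normal(2)                  # system (state) noise
    x = x + dt * (A @ x + B.ravel() * v + w)           # Euler step of the state equation
    Y[t] = C @ x + sd_z * rng.standard_normal(4)       # noisy linear mixture of states

print(Y.shape)                                          # (32, 4): four sensory channels
```

Inverting such a simulation would recover the bump-like cause, the two flagged parameters and the log-precisions, subject to the shrinkage priors and hyperpriors described in the excerpt.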

