Reinforcement learning for adaptive threshold control of restorative brain-computer interfaces: a Bayesian simulation.

Bauer R, Gharabaghi A - Front Neurosci (2015)

Bottom Line: For each feedback iteration, we first determined the thresholds that result in minimal action entropy and maximal instructional efficiency. We then used the resulting vector for the simulation of continuous threshold adaptation. Finally, on the basis of information theory, we provided an explanation for the achieved benefits of adaptive threshold setting.

View Article: PubMed Central - PubMed

Affiliation: Division of Functional and Restorative Neurosurgery and Division of Translational Neurosurgery, Department of Neurosurgery, Eberhard Karls University Tuebingen, Tuebingen, Germany; Neuroprosthetics Research Group, Werner Reichardt Centre for Integrative Neuroscience, Eberhard Karls University Tuebingen, Tuebingen, Germany.

ABSTRACT
Restorative brain-computer interfaces (BCI) are increasingly used to provide feedback of neuronal states in a bid to normalize pathological brain activity and achieve behavioral gains. However, patients and healthy subjects alike often show a large variability, or even inability, of brain self-regulation for BCI control, known as BCI illiteracy. Although current co-adaptive algorithms are powerful for assistive BCIs, their inherent class switching clashes with the operant conditioning goal of restorative BCIs. Moreover, due to the treatment rationale, the classifier of restorative BCIs usually has a constrained feature space, thus limiting the possibility of classifier adaptation. In this context, we applied a Bayesian model of neurofeedback and reinforcement learning for different threshold selection strategies to study the impact of threshold adaptation of a linear classifier on optimizing restorative BCIs. For each feedback iteration, we first determined the thresholds that result in minimal action entropy and maximal instructional efficiency. We then used the resulting vector for the simulation of continuous threshold adaptation. We could thus show that threshold adaptation can improve reinforcement learning, particularly in cases of BCI illiteracy. Finally, on the basis of information theory, we provided an explanation for the achieved benefits of adaptive threshold setting.
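
The abstract's first simulation step (choosing, at each feedback iteration, the threshold that yields minimal action entropy) can be illustrated with a short Python sketch. This is not the authors' model: the logistic reward-probability curves, their parameters, and the softmax action policy below are illustrative assumptions only, loosely patterned on Figure 1B.

import numpy as np

# Illustrative sketch only; parameters and functional forms are assumptions,
# not the simulation of Bauer and Gharabaghi (2015).

def logistic(theta, midpoint, slope):
    # Reward probability as a decreasing logistic function of the threshold.
    return 1.0 / (1.0 + np.exp(slope * (theta - midpoint)))

def softmax_p_trained(q_trained, q_false, beta=5.0):
    # Probability of choosing the trained action under a softmax policy,
    # assuming the learner's value estimates equal the true reward probabilities.
    z = np.exp(beta * np.array([q_trained, q_false]))
    return z[0] / z.sum()

def action_entropy(p_trained):
    # Shannon entropy (bits) of the two-action choice distribution.
    p = np.clip(np.array([p_trained, 1.0 - p_trained]), 1e-12, 1.0)
    return float(-(p * np.log2(p)).sum())

# Sweep candidate thresholds and keep the one with minimal action entropy.
thetas = np.linspace(-2.0, 2.0, 201)
entropies = []
for theta in thetas:
    p_r_trained = logistic(theta, midpoint=0.5, slope=3.0)   # assumed curve for aT
    p_r_false = logistic(theta, midpoint=-0.5, slope=3.0)    # assumed curve for aF
    entropies.append(action_entropy(softmax_p_trained(p_r_trained, p_r_false)))

best_theta = thetas[int(np.argmin(entropies))]
print(f"threshold with minimal action entropy: {best_theta:.2f}")

Under these assumptions, the action entropy is lowest where the two reward curves are furthest apart, i.e., where the feedback discriminates most sharply between the trained and the false action.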

No MeSH data available.



Figure 1: (A) Depiction of the state-action element fundamental to any neurofeedback environment based on linear discrimination. At any state, the subject selects one of two actions (aF, aT), which move the state one step in opposite directions (aF: false action; aT: trained action). (B) Probability of reward for a given action (blue: aT; red: aF) as a function of the threshold θ. The dot markers indicate the reward probabilities at different thresholds, acquired from a real dataset (a right-handed female subject performing a neurofeedback task based on motor imagery-related β-modulation over sensorimotor regions with contingent haptic feedback, identical to the task described in Vukelić et al., 2014). The red and blue traces are logistic functions fitted to the raw data.
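
The logistic fits mentioned in the caption correspond to a standard curve-fitting step. The snippet below is a hypothetical sketch using scipy.optimize.curve_fit on made-up (threshold, reward probability) pairs; it does not use the authors' data or code.

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical measurements: reward probability for the trained action at
# several thresholds (values invented purely for illustration).
thetas = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
p_reward = np.array([0.95, 0.85, 0.60, 0.30, 0.10])

def logistic(theta, midpoint, slope):
    # Two-parameter logistic curve, decreasing as the threshold is raised.
    return 1.0 / (1.0 + np.exp(slope * (theta - midpoint)))

# Fit the curve to the raw points, analogous to the traces in Figure 1B.
params, _ = curve_fit(logistic, thetas, p_reward, p0=(0.0, 1.0))
print(f"fitted midpoint = {params[0]:.2f}, slope = {params[1]:.2f}")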

Mentions: In addition, the state space is usually not discrete but continuous. By including a parameter (δ) for the step size of one action, a continuous state space can be modeled. Assuming that the step size is equal for both actions but taken in opposite directions, the current state position (σ) in this continuum can be calculated from the number of times the trained (n) and the false (m) action have been chosen, i.e., σ = nδ − mδ. The trained action moves the subject one step toward the trained state, whereas the false action moves the subject one step toward the false state (see Figure 1A). This enables us to set a threshold (θ) in the state continuum to determine the probability of reward for the trained action P(r|aT) and for the false action P(r|aF).
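
As one concrete reading of this state-action model, the following sketch simulates the continuum with an assumed noisy thresholding rule for reward delivery; the step size, noise level, threshold, and choice probability are illustrative values only and are not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

delta = 0.1      # step size of one action (assumed)
theta = 0.3      # threshold in the state continuum (assumed)
noise_sd = 0.5   # trial-to-trial variability of the measured state (assumed)

def step(sigma, trained):
    # Trained action moves one step toward the trained state, false action
    # one step toward the false state.
    return sigma + delta if trained else sigma - delta

def rewarded(sigma):
    # Assumed rule: reward whenever the noisy state reading exceeds theta.
    return sigma + rng.normal(0.0, noise_sd) > theta

sigma = 0.0
n_trained = 0
n_false = 0
for _ in range(100):
    trained = rng.random() < 0.6   # assumed 60% rate of trained actions
    sigma = step(sigma, trained)
    if trained:
        n_trained += 1
    else:
        n_false += 1
    _ = rewarded(sigma)            # feedback delivered (or withheld) on this trial

# The accumulated state equals nδ − mδ, as in the text above.
assert np.isclose(sigma, (n_trained - n_false) * delta)
print(f"sigma = {sigma:.2f} after {n_trained} trained and {n_false} false actions")

Averaging the rewarded() outcomes over many trials at a fixed state would trace out a sigmoidal reward-probability curve of the kind shown in Figure 1B.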

