A maximum entropy test for evaluating higher-order correlations in spike counts.

Onken A, Dragoi V, Obermayer K - PLoS Comput. Biol. (2012)

Bottom Line: Applying our test to artificial data shows that the effects of higher-order correlations on these divergence measures can be detected even when the number of samples is small. These results demonstrate that higher-order correlations can matter when estimating information theoretic quantities in V1. They also show that our test is able to detect their presence in typical in-vivo data sets, where the number of samples is too small to estimate higher-order correlations directly.


Affiliation: Technische Universität Berlin, Berlin, Germany. arno.onken@unige.ch

ABSTRACT
Evaluating the importance of higher-order correlations of neural spike counts has been notoriously hard. A large number of samples is typically required in order to estimate higher-order correlations and the resulting information theoretic quantities. In typical electrophysiology data sets with many experimental conditions, however, the number of samples in each condition is rather small. Here we describe a method that makes it possible to quantify evidence for higher-order correlations in exactly these cases. We construct a family of reference distributions: maximum entropy distributions, which are constrained only by marginals and by linear correlations as quantified by the Pearson correlation coefficient. We devise a Monte Carlo goodness-of-fit test, which tests, for a given divergence measure of interest, whether the experimental data lead to the rejection of the hypothesis that they were generated by one of the reference distributions. Applying our test to artificial data shows that the effects of higher-order correlations on these divergence measures can be detected even when the number of samples is small. Subsequently, we apply our method to spike count data which were recorded with multielectrode arrays from the primary visual cortex of anesthetized cat during an adaptation experiment. Using mutual information as a divergence measure we find that there are spike count bin sizes at which the maximum entropy hypothesis can be rejected for a substantial number of neuronal pairs. These results demonstrate that higher-order correlations can matter when estimating information theoretic quantities in V1. They also show that our test is able to detect their presence in typical in-vivo data sets, where the number of samples is too small to estimate higher-order correlations directly.
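The general shape of a Monte Carlo goodness-of-fit test can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the procedure of the paper. It uses a single fixed reference distribution (two independent Poisson counts with rate 3, matching 30 Hz in 100 ms bins) instead of the full maximum entropy family, and a plug-in entropy difference as the divergence measure. All function names (`poisson_sample`, `plugin_entropy`, `monte_carlo_p_value`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter


def poisson_sample(rate, rng):
    """Draw one Poisson variate (Knuth's multiplication method, fine for small rates)."""
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def poisson_entropy_bits(rate, max_count=60):
    """Entropy in bits of a Poisson distribution, summed over a truncated support."""
    pmf = [math.exp(-rate) * rate**k / math.factorial(k) for k in range(max_count)]
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def plugin_entropy(sample):
    """Plug-in entropy estimate in bits from the empirical frequencies of a sample."""
    n = len(sample)
    return -sum(c / n * math.log2(c / n) for c in Counter(sample).values())


def monte_carlo_p_value(data, draw_reference, statistic, n_mc, rng):
    """Share of surrogate data sets whose statistic is at least as large as the observed one."""
    observed = statistic(data)
    exceed = 0
    for _ in range(n_mc):
        # Each surrogate has the same size as the data, so estimator bias is matched.
        surrogate = [draw_reference(rng) for _ in range(len(data))]
        if statistic(surrogate) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_mc + 1)


RATE = 3.0  # assumed: 30 Hz firing rate in 100 ms bins
H_REF = 2 * poisson_entropy_bits(RATE)  # joint entropy of two independent counts


def draw_independent_pair(rng):
    """Stand-in reference distribution: two independent Poisson counts."""
    return (poisson_sample(RATE, rng), poisson_sample(RATE, rng))


def entropy_difference(sample):
    """Divergence measure: absolute gap between sample entropy and reference entropy."""
    return abs(plugin_entropy(sample) - H_REF)


rng = random.Random(1)
# Candidate data with a dependence the reference cannot produce: identical counts.
dependent_data = [(k, k) for k in (poisson_sample(RATE, rng) for _ in range(50))]
p_dependent = monte_carlo_p_value(dependent_data, draw_independent_pair,
                                  entropy_difference, n_mc=200, rng=rng)
```

A small `p_dependent` would lead to rejection of the reference hypothesis. Because surrogates are drawn with the same sample size as the data, the bias of the plug-in entropy estimate affects both sides of the comparison equally, which is what lets such a test work with few samples. Testing against a whole family of reference distributions additionally requires maximizing the p-value over that family, which is not shown here.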


Figure 2. Evaluation of the maximum entropy test on artificial data. (A) Probability mass functions of the maximum entropy distribution (left) and the Gaussian copula based distribution (right) of the mixture distribution M1 with a linear correlation coefficient of […]. The Poisson marginals are plotted along the axes. (B) Same as A but for the mixture distribution M2 with a linear correlation coefficient of […]. (C) Percent rejections of the ME hypothesis using the entropy difference as the divergence measure. Significance level was […]. Rejection rates were estimated over 100 tests. Different lines correspond to different numbers of samples drawn from the candidate distribution: 10 (red dotted line), 50 (green dash-dotted line), 100 (blue dashed line), and 200 (black solid line). (Left) Results for the ME family for varying correlation coefficient. (Center) Results for distributions from the M1 family for varying mixture parameter (cf. Figure 2 A). (Right) Same for M2 (cf. Figure 2 B). Poisson rate was λ = 3 for all candidate distributions (corresponding to 30 Hz and 100 ms bins). Simulated annealing [45] was applied to maximize the p-value (cf. Text S1). The number of Monte Carlo samples was 1000.

Mentions: The maximum entropy distributions ME served as reference distributions and were constructed according to Equations 6–9 with varying correlation coefficient. In order to investigate the power of the test we constructed two distribution families M1 and M2, which included higher-order correlations. These families are based on so-called copulas, which allow us to construct multivariate distributions with Poisson marginals and higher-order correlations [23], [36]. The families M1 and M2 consisted of two components:

$$P_{M}(x_1, x_2) = (1 - z)\, P_{ME}(x_1, x_2) + z\, P_{CG}(x_1, x_2) \tag{21}$$

where $z$ is a mixture parameter and $x_1, x_2$ are spike counts. The two components were defined as a maximum entropy distribution of the family ME ($P_{ME}$, cf. Equations 6–9 with Poisson marginals as in Equation 20 and Figure 2 A, left, and 2 B, left) and a copula-based distribution $P_{CG}$ (cf. Figure 2 A, right, for M1 and 2 B, right, for M2), which was a mixture distribution by itself and which showed significant higher-order correlations [23]:

$$P_{CG}(x_1, x_2) = F_{CG}(x_1, x_2) - F_{CG}(x_1 - 1, x_2) - F_{CG}(x_1, x_2 - 1) + F_{CG}(x_1 - 1, x_2 - 1) \tag{22}$$

The cumulative distribution function (CDF) $F_{CG}$ is defined as

$$F_{CG}(x_1, x_2) = C_{CG}\big(F_{X_1}(x_1), F_{X_2}(x_2)\big) \tag{23}$$

where the CDFs of the Poisson marginals are defined as

$$F_{X_i}(x) = \sum_{k=0}^{\lfloor x \rfloor} \frac{\lambda_i^{k}}{k!}\, e^{-\lambda_i} \tag{24}$$

and $\lambda_i$ is the rate parameter of element $i$. Equations 22 and 23 hold for every copula-based distribution with discrete marginals. Linear and higher-order correlations are specified by the applied copula. The copula $C_{CG}$ of the model is defined as a Gaussian mixture copula (Equation 25). The bivariate Gaussian copula family is defined as

$$C_{\theta}(u, v) = \phi_{\theta}\big(\phi^{-1}(u), \phi^{-1}(v)\big) \tag{26}$$

where $\phi_{\theta}$ is the CDF of the bivariate zero-mean unit-variance normal distribution with correlation coefficient $\theta$ and $\phi^{-1}$ is the inverse of the CDF of the univariate zero-mean unit-variance Gaussian distribution.
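The construction in Equations 22 to 24 and 26 can be illustrated with a short sketch: a bivariate count distribution with Poisson marginals whose dependence comes from a single Gaussian copula. This is an illustration under stated assumptions rather than the model of the paper. It uses one Gaussian copula instead of the Gaussian mixture copula of Equation 25, and it evaluates the bivariate normal CDF by simple midpoint integration. The function names and the parameter names `rate1`, `rate2`, and `rho` are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # univariate zero-mean unit-variance Gaussian


def poisson_cdf(x, rate):
    """CDF of a Poisson marginal (Equation 24); zero for negative arguments."""
    if x < 0:
        return 0.0
    total = sum(rate**k * math.exp(-rate) / math.factorial(k) for k in range(int(x) + 1))
    return min(1.0, total)


def bivariate_normal_cdf(a, b, rho, steps=400):
    """P(X <= a, Y <= b) for a standard bivariate normal with correlation rho.

    Integrates pdf(x) * Phi((b - rho * x) / sqrt(1 - rho^2)) over x in [-8, a]
    with the midpoint rule. Requires |rho| < 1.
    """
    lower = -8.0
    if a <= lower:
        return 0.0
    a = min(a, 8.0)  # mass beyond 8 standard deviations is negligible
    scale = math.sqrt(1.0 - rho * rho)
    h = (a - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        total += STD_NORMAL.pdf(x) * STD_NORMAL.cdf((b - rho * x) / scale)
    return total * h


def gaussian_copula(u, v, rho):
    """Bivariate Gaussian copula (Equation 26) evaluated at (u, v)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    # Clamp so that inv_cdf stays finite when a marginal CDF rounds to 1.0.
    eps = 1e-12
    u, v = min(u, 1.0 - eps), min(v, 1.0 - eps)
    return bivariate_normal_cdf(STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v), rho)


def copula_pmf(x1, x2, rate1, rate2, rho):
    """Probability of the count pair (x1, x2) from differences of the joint CDF (Equations 22, 23)."""
    def joint_cdf(a, b):
        return gaussian_copula(poisson_cdf(a, rate1), poisson_cdf(b, rate2), rho)

    return (joint_cdf(x1, x2) - joint_cdf(x1 - 1, x2)
            - joint_cdf(x1, x2 - 1) + joint_cdf(x1 - 1, x2 - 1))


# Example: probability of observing 3 spikes in both elements at rate 3 per bin.
p_correlated = copula_pmf(3, 3, 3.0, 3.0, rho=0.5)
p_independent = copula_pmf(3, 3, 3.0, 3.0, rho=0.0)
```

With `rho=0.0` the copula reduces to independence, so `copula_pmf` should agree with the product of the two Poisson probabilities up to integration error. A mixture copula would replace `gaussian_copula` with a weighted sum of Gaussian copulas with different correlation parameters, leaving the rest of the construction unchanged.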

