Probabilistic estimation of microarray data reliability and underlying gene expression.

Bilke S, Breslin T, Sigvardsson M - BMC Bioinformatics (2003)

Bottom Line: The availability of high-throughput methods for measurement of mRNA concentrations makes the reliability of conclusions drawn from the data and global quality control of samples and hybridization important issues. The proposed method is effective in determining differential gene expression and sample reliability in replicated microarray data. Already at two discrete expression levels in each sample, it gives a good explanation of the data and is comparable to standard techniques.


Affiliation: Complex Systems Division, Department of Theoretical Physics, University of Lund, Sölvegatan 14A, SE-223 62 Lund, Sweden. sven@thep.lu.se

ABSTRACT

Background: The availability of high-throughput methods for measurement of mRNA concentrations makes the reliability of conclusions drawn from the data and global quality control of samples and hybridization important issues. We address these issues with an information-theoretic approach, applied to discretized expression values in replicated gene expression data.

Results: Our approach yields a quantitative measure of two important parameter classes. First, the probability P(σ|S) that a gene is in the biological state σ in a certain variety, given its observed expression S in the samples of that variety. Second, sample-specific error probabilities, which serve as consistency indicators of the measured samples of each variety. The method and its limitations are tested on gene expression data for developing murine B-cells, with a t-test used as reference. On a set of known genes it performs better than the t-test despite the crude discretization into only two expression levels. The consistency indicators, i.e. the error probabilities, correlate well with variations in the biological material and thus prove efficient.

Conclusions: The proposed method is effective in determining differential gene expression and sample reliability in replicated microarray data. Even with only two discrete expression levels in each sample, it gives a good explanation of the data and is comparable to standard techniques.


Figure 1: Schematic diagram illustrating the transition from underlying to observed distributions of states, in the case of m = 4 samples. The underlying distribution on the left-hand side can be described by the probabilities for each underlying state, P(σ1), P(σ0), and P(σr) (see text). This distribution is then distorted by sample-specific errors, resulting in an experimentally observed distribution, depicted on the right-hand side.

Mentions: In this work we take an information theoretic point of view to estimate this probability: The information of interest, the state σ, is "transmitted" in a noisy measurement process and potentially distorted (Figure 1). Using Bayes' theorem, the desired conditional probability Eq. (1) can be expressed as:
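
As a generic illustration, and not necessarily the form used in the paper, Bayes' theorem for these quantities reads P(σ|S) = P(S|σ) P(σ) / Σ_σ' P(S|σ') P(σ'). The Python sketch below evaluates it under assumptions that are ours: only the two states σ1 and σ0 are modelled (σr is left out), each sample i misreports the state independently with a known probability `error_probs[i]`, and the error is symmetric between the two states. The names `posterior_on`, `observed`, `error_probs` and `prior_on` are hypothetical. The paper estimates the error probabilities from the data, whereas here they are simply given.

```python
from math import prod


def posterior_on(observed, error_probs, prior_on=0.5):
    """Return P(sigma1 | S) for one gene under an independent-flip noise model.

    observed    -- list of 0/1 discretized expression calls, one per sample
    error_probs -- assumed probability that each sample misreports the state
    prior_on    -- assumed prior probability P(sigma1) of the "on" state
    """
    if len(observed) != len(error_probs):
        raise ValueError("need one error probability per sample")

    # Likelihood P(S | sigma1): a call of 1 is correct, a call of 0 is an error.
    like_on = prod((1 - e) if s == 1 else e
                   for s, e in zip(observed, error_probs))
    # Likelihood P(S | sigma0): a call of 0 is correct, a call of 1 is an error.
    like_off = prod((1 - e) if s == 0 else e
                    for s, e in zip(observed, error_probs))

    # Denominator of Bayes' theorem: total probability of the observation S.
    evidence = prior_on * like_on + (1 - prior_on) * like_off
    if evidence == 0:
        raise ValueError("observation is impossible under the given errors")
    return prior_on * like_on / evidence


if __name__ == "__main__":
    # m = 4 samples, three calls "on" and one "off", 10% error in each sample.
    print(posterior_on([1, 1, 1, 0], [0.1, 0.1, 0.1, 0.1]))
```

In this toy model, raising the error probability of the dissenting sample pushes the posterior closer to 1. This matches the intuition that an unreliable sample should carry less weight.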

