Removal of AU bias from microarray mRNA expression data enhances computational identification of active microRNAs.

Elkon R, Agami R - PLoS Comput. Biol. (2008)

Bottom Line: Elucidation of regulatory roles played by microRNAs (miRs) in various biological networks is one of the greatest challenges of present molecular and computational biology. The integrated analysis of gene expression data and 3'-UTR sequences holds great promise for being an effective means to systematically delineate active miRs in different biological processes. Applying such an integrated analysis, we uncovered a striking relationship between 3'-UTR AU content and gene response in numerous microarray datasets.


Affiliation: Division of Gene Regulation, The Netherlands Cancer Institute, Amsterdam, The Netherlands.

ABSTRACT
Elucidation of regulatory roles played by microRNAs (miRs) in various biological networks is one of the greatest challenges of present molecular and computational biology. The integrated analysis of gene expression data and 3'-UTR sequences holds great promise for being an effective means to systematically delineate active miRs in different biological processes. Applying such an integrated analysis, we uncovered a striking relationship between 3'-UTR AU content and gene response in numerous microarray datasets. We show that this relationship is secondary to a general bias that links gene response and probe AU content and reflects the fact that in the majority of current arrays probes are selected from target transcript 3'-UTRs. Therefore, removal of this bias, which is in order in any analysis of microarray datasets, is of crucial importance when integrating expression data and 3'-UTR sequences to identify regulatory elements embedded in this region. We developed visualization and normalization schemes for the detection and removal of such AU biases and demonstrate that their application to microarray data significantly enhances the computational identification of active miRs. Our results substantiate that, after removal of AU biases, mRNA expression profiles contain ample information which allows in silico detection of miRs that are active in physiological conditions.
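The abstract does not spell out the normalization scheme, so the following is only a minimal sketch of one way an AU bias could be removed. It assumes that each gene has a log expression ratio and an AU fraction (computed from its probe or 3'-UTR sequence), groups genes into equal-sized AU-content bins, and subtracts each bin's median log ratio. The function names `au_content` and `au_normalize`, the `n_bins` parameter, and the binned-median approach itself are illustrative assumptions, not the authors' published procedure.

```python
from statistics import median


def au_content(seq):
    """Fraction of A/U bases in a sequence (T is counted as U for DNA input)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AUT" for base in seq) / len(seq)


def au_normalize(log_ratios, au_fractions, n_bins=10):
    """Centre log ratios within bins of similar AU content.

    Hypothetical scheme: genes are sorted by AU fraction, split into
    n_bins groups of (nearly) equal size, and the median log ratio of
    each group is subtracted from its members.
    """
    if len(log_ratios) != len(au_fractions):
        raise ValueError("inputs must have the same length")
    n = len(log_ratios)
    order = sorted(range(n), key=lambda i: au_fractions[i])
    n_bins = max(1, min(n_bins, n))
    normalized = [0.0] * n
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not members:
            continue
        centre = median(log_ratios[i] for i in members)
        for i in members:
            normalized[i] = log_ratios[i] - centre
    return normalized
```

If the response depends only on AU content, the normalized values should show no remaining trend when plotted against AU fraction, which is the kind of before/after comparison the visualization scheme is meant to support.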

Figure 6. AU bias in the miR-155 dataset. Relationship between 3′-UTR AU content and gene response in the dataset that compared gene expression profiles between miR-155-deficient and control Th2 cells. (A) Without AU normalization. (B) After applying AU normalization to the dataset. Plots were generated as described in the legend to Figure 1.

Mentions: We next searched for an expression dataset that would serve as a positive test case; that is, a dataset that contains known miR signals. We preferred physiologically relevant datasets over ones in which miRs were over-expressed, because over-expression often yields miR levels far above physiological ones. (Statistical searches for active miRs applied to several datasets that profiled cells over-expressing specific miRs readily detected the correct signals both without and after AU normalization; data not shown.) A recent study that compared expression profiles between stimulated T-cells derived from miR-155-deficient and control mice met this requirement [23]. As in many other datasets, we observed a strong AU bias in this dataset and removed it using the AU normalization (Figure 6). Without AU normalization, the statistical tests identified eleven significant miR families; the true hit (miR-155) was only the third most significant (Table 2). (Note that five of the six most significant miRs falsely identified on the negative dataset were also detected in this positive dataset; compare Tables 1 and 2.) Here too, permutation tests found, in most cases, random seeds whose significance scores were similar to those obtained by the original seeds (Table 2). In sharp contrast, after AU normalization, only the true miR (miR-155) was detected, and its statistical significance was substantially improved (Table 2). Importantly, none of the permuted seeds derived from the seed of miR-155 obtained a statistically significant score.
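The permuted-seed control described above can be sketched as follows. This excerpt does not state which statistical test was used, so the hypergeometric enrichment test here is an assumption, as are the function names `permuted_seeds`, `hypergeom_tail`, and `seed_enrichment`. The idea is to score a seed site by how often it occurs in the 3'-UTRs of responding genes, then score shuffled versions of the same site; a genuine signal should not be matched by its permutations.

```python
import random
from math import comb


def permuted_seeds(seed, n, rng=None):
    """Return up to n distinct shuffles of `seed`, excluding the seed itself."""
    rng = rng or random.Random(0)  # fixed default seed for reproducibility
    found = set()
    for _ in range(n * 50):  # bounded number of attempts
        if len(found) >= n:
            break
        letters = list(seed)
        rng.shuffle(letters)
        candidate = "".join(letters)
        if candidate != seed:
            found.add(candidate)
    return sorted(found)


def hypergeom_tail(k, n_hits, n_responders, n_total):
    """P(X >= k) when n_responders genes are drawn from n_total genes,
    n_hits of which carry the site (assumed test, not the authors')."""
    upper = min(n_hits, n_responders)
    numerator = sum(
        comb(n_hits, i) * comb(n_total - n_hits, n_responders - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_responders)


def seed_enrichment(site, utrs, responders):
    """Enrichment p-value of `site` among responders.

    utrs: dict mapping gene name -> 3'-UTR sequence.
    responders: set of gene names that changed expression.
    """
    carriers = {gene for gene, seq in utrs.items() if site in seq}
    measured = responders & utrs.keys()
    overlap = len(carriers & measured)
    return hypergeom_tail(overlap, len(carriers), len(measured), len(utrs))
```

In use, one would compare `seed_enrichment(site, utrs, responders)` for the real site against the same score for each string returned by `permuted_seeds(site, n)`, both before and after AU normalization of the expression data that defines `responders`.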

