Removal of AU bias from microarray mRNA expression data enhances computational identification of active microRNAs.

Elkon R, Agami R - PLoS Comput. Biol. (2008)

Bottom Line: Elucidation of regulatory roles played by microRNAs (miRs) in various biological networks is one of the greatest challenges of present molecular and computational biology. The integrated analysis of gene expression data and 3'-UTR sequences holds great promise for being an effective means to systematically delineate active miRs in different biological processes. Applying such an integrated analysis, we uncovered a striking relationship between 3'-UTR AU content and gene response in numerous microarray datasets.

Affiliation: Division of Gene Regulation, The Netherlands Cancer Institute, Amsterdam, The Netherlands.

ABSTRACT
Elucidation of regulatory roles played by microRNAs (miRs) in various biological networks is one of the greatest challenges of present molecular and computational biology. The integrated analysis of gene expression data and 3'-UTR sequences holds great promise for being an effective means to systematically delineate active miRs in different biological processes. Applying such an integrated analysis, we uncovered a striking relationship between 3'-UTR AU content and gene response in numerous microarray datasets. We show that this relationship is secondary to a general bias that links gene response and probe AU content and reflects the fact that in the majority of current arrays probes are selected from target transcript 3'-UTRs. Therefore, removal of this bias, which is in order in any analysis of microarray datasets, is of crucial importance when integrating expression data and 3'-UTR sequences to identify regulatory elements embedded in this region. We developed visualization and normalization schemes for the detection and removal of such AU biases and demonstrate that their application to microarray data significantly enhances the computational identification of active miRs. Our results substantiate that, after removal of AU biases, mRNA expression profiles contain ample information which allows in silico detection of miRs that are active in physiological conditions.
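
The abstract does not spell out the normalization scheme, so the following is only a minimal sketch of one simple way to remove a trend between probe AU content and log fold-change, not the authors' method. It groups genes into fixed-width bins of probe AU content and subtracts each bin's median log ratio. The names `remove_au_trend`, `log_ratios`, `probe_au`, and `n_bins` are hypothetical.

```python
from collections import defaultdict
from statistics import median


def remove_au_trend(log_ratios, probe_au, n_bins=10):
    """Subtract the median log ratio of each AU-content bin from its genes.

    log_ratios: dict mapping gene -> log fold-change (assumed input format)
    probe_au:   dict mapping gene -> probe AU fraction in [0, 1]
    Returns a new dict of corrected log ratios; genes missing from
    either input are skipped.
    """
    bins = defaultdict(list)
    for gene, au in probe_au.items():
        if gene in log_ratios:
            # Clamp the index so that au == 1.0 falls in the last bin.
            bins[min(int(au * n_bins), n_bins - 1)].append(gene)

    corrected = {}
    for members in bins.values():
        centre = median(log_ratios[g] for g in members)
        for g in members:
            corrected[g] = log_ratios[g] - centre
    return corrected
```

After such a correction, the median log ratio within every AU bin is zero by construction, so any remaining association between AU content and response cannot come from a bin-level shift. A smooth fit (for example, a local regression) could replace the fixed bins; that choice is left open here.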

Figure 2. Strong relationship between 3′-UTR AU content and gene response detected in a comparison between technical replicates. The figure shows the relationship between 3′-UTR AU content and gene fold-change in a comparison between two chips hybridized with identical universal reference RNA pools. The plot was generated as described in the legend to Figure 1. A highly significant relationship between 3′-UTR AU content and gene response was detected in this technical comparison (p value = 8.1 × 10^−84 for the comparison between the bottom and top 5% ‘responding’ genes), pointing to a major AU bias in microarray measurements.

Mentions: The strength of the relationship between 3′-UTR AU content and gene response in the HPC dataset prompted us to search for such trends in other datasets. Surprisingly, we found such relationships, with similarly high statistical significance, in numerous microarray datasets (data not shown). More suspicious still, we observed the relationship even when we compared different control samples within a dataset. This led us to question whether the relationship observed between 3′-UTR AU content and gene response reflects a true biological regulatory mechanism or is instead the result of a technical artifact in microarray measurements. We found a definitive answer to this question by analyzing a technical dataset published by van Ruissen et al. [20]. This dataset profiled a universal reference RNA pool on two independent oligonucleotide chips (Affymetrix HGU133A). Comparing the data from these two arrays, which measure identical, artificial RNA pools, we again found a striking relationship between 3′-UTR AU content and difference in gene expression level (Figure 2), pointing to a major AU bias in microarray measurements. This AU response bias is not specific to a particular data preprocessing method: it was present in data processed with each of three different preprocessing and normalization schemes, namely RMA, GCRMA, and MAS5 (Figure S2). In this technical dataset, we detected no preference for A or U in the bias, and no major 3′-UTR length bias (Figure S3).
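
The comparison described above, between the AU content of the bottom and top 5% 'responding' genes, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names, the dictionary inputs, and the choice to report a simple difference of means are all hypothetical. The statistical test behind the reported p value is not named in this section, so none is included.

```python
from statistics import mean


def au_content(utr):
    """Fraction of A and U bases in a sequence (T is counted as U)."""
    seq = utr.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(base) for base in "AUT") / len(seq)


def tail_au_difference(log_fold_changes, utrs, tail=0.05):
    """Mean 3'-UTR AU content of the top tail minus that of the bottom tail.

    log_fold_changes: dict mapping gene -> log ratio between the two arrays
    utrs:             dict mapping gene -> 3'-UTR sequence
    tail:             fraction of genes taken from each end (0.05 = 5%)
    """
    # Rank the genes present in both inputs from most down to most up.
    genes = sorted(
        (g for g in log_fold_changes if g in utrs),
        key=log_fold_changes.get,
    )
    k = max(1, int(len(genes) * tail))
    bottom = [au_content(utrs[g]) for g in genes[:k]]
    top = [au_content(utrs[g]) for g in genes[-k:]]
    return mean(top) - mean(bottom)
```

For two technical replicates of the same RNA pool, a value far from zero would be consistent with the kind of AU bias reported here, since no biological difference between the samples is expected.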

