Removal of AU bias from microarray mRNA expression data enhances computational identification of active microRNAs.

Elkon R, Agami R - PLoS Comput. Biol. (2008)

Bottom Line: Elucidation of regulatory roles played by microRNAs (miRs) in various biological networks is one of the greatest challenges of present molecular and computational biology. The integrated analysis of gene expression data and 3'-UTR sequences holds great promise for being an effective means to systematically delineate active miRs in different biological processes. Applying such an integrated analysis, we uncovered a striking relationship between 3'-UTR AU content and gene response in numerous microarray datasets.


Affiliation: Division of Gene Regulation, The Netherlands Cancer Institute, Amsterdam, The Netherlands.

ABSTRACT
Elucidation of regulatory roles played by microRNAs (miRs) in various biological networks is one of the greatest challenges of present molecular and computational biology. The integrated analysis of gene expression data and 3'-UTR sequences holds great promise for being an effective means to systematically delineate active miRs in different biological processes. Applying such an integrated analysis, we uncovered a striking relationship between 3'-UTR AU content and gene response in numerous microarray datasets. We show that this relationship is secondary to a general bias that links gene response and probe AU content and reflects the fact that in the majority of current arrays probes are selected from target transcript 3'-UTRs. Therefore, removal of this bias, which is in order in any analysis of microarray datasets, is of crucial importance when integrating expression data and 3'-UTR sequences to identify regulatory elements embedded in this region. We developed visualization and normalization schemes for the detection and removal of such AU biases and demonstrate that their application to microarray data significantly enhances the computational identification of active miRs. Our results substantiate that, after removal of AU biases, mRNA expression profiles contain ample information which allows in silico detection of miRs that are active in physiological conditions.

Figure 1: Relationship between 3′-UTR AU content and gene response during HPC differentiation. Expression profiles were measured at several time points after stimulation of HPC differentiation into megakaryocytes. To visualize the relationships between 3′-UTR AU content and gene response, the genes were sorted for each time point according to their fold of repression/induction relative to the expression level at t0, and the mean 3′-UTR AU content was calculated in a sliding window that encompassed in each step 5% of the genes included in the analysis. (At each step the sliding window was moved to the right by 5% of its size.) Each plot corresponds to the time point indicated above it. Genes are sorted on the X-axis according to their response, from the most repressed genes at the left to the most induced genes at the right. The Y-axis represents the mean 3′-UTR AU content calculated on each sliding window. The p value above each plot is for the comparison (Wilcoxon test) between the 3′-UTR AU content of the top 5% (most strongly up-regulated) and bottom 5% (most strongly down-regulated) genes at the corresponding time point. Note the striking relationship between 3′-UTR AU content and gene response at the 16 hr time point.
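
The sliding-window procedure described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `sliding_window_means` and its parameters are hypothetical, and the inputs are assumed to be one fold-change value and one precomputed 3′-UTR AU fraction per gene.

```python
def sliding_window_means(fold_changes, au_values, window_frac=0.05, step_frac=0.05):
    """Mean AU content in a window sliding over genes sorted by response.

    fold_changes: per-gene response relative to t0 (assumed input).
    au_values:    per-gene 3'-UTR AU fraction, in the same gene order.
    window_frac:  window size as a fraction of all genes (5% in the caption).
    step_frac:    step size as a fraction of the window size (5% in the caption).
    """
    if len(fold_changes) != len(au_values):
        raise ValueError("fold_changes and au_values must have the same length")

    # Sort genes from most repressed (left) to most induced (right).
    order = sorted(range(len(fold_changes)), key=lambda i: fold_changes[i])
    sorted_au = [au_values[i] for i in order]

    n = len(sorted_au)
    window = max(1, round(n * window_frac))   # genes per window
    step = max(1, round(window * step_frac))  # genes moved per step

    means = []
    for start in range(0, n - window + 1, step):
        chunk = sorted_au[start:start + window]
        means.append(sum(chunk) / window)
    return means
```

Plotting the returned list against window position gives one curve per time point; a curve that falls from left to right corresponds to down-regulated genes having more AU-rich 3′-UTRs than up-regulated ones.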

Mentions: Therefore, as a first step in the analysis of the HPC dataset, we checked whether a global 3′-UTR base composition trend is associated with the multi-lineage differentiation. We detected a very strong correlation between 3′-UTR base composition and gene response at several time points in this dataset. For example, there was an exceptionally strong relationship between AU content and gene response at the 16 hr time point after induction of HPC differentiation into megakaryocytes: 3′-UTRs of down-regulated genes were significantly more AU-rich than those of up-regulated ones (Figure 1). (The mean 3′-UTR AU contents of the 5% most down-regulated and the 5% most up-regulated genes were 60.6% and 52.7%, respectively; p < 10^−99, Wilcoxon test.) The other three lineages in this dataset displayed similarly strong trends (Figure S1).
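
The comparison of the two 5% tails quoted above can be illustrated with a small sketch. The helper names `au_content` and `tail_means` are hypothetical, and the inputs (one fold change and one 3′-UTR sequence per gene) are assumptions for illustration; this is not the authors' implementation, and it computes only the tail means, not the test statistic.

```python
def au_content(utr):
    """Fraction of A and U bases in a 3'-UTR (T is counted as U for DNA input)."""
    seq = utr.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(base) for base in "AUT") / len(seq)


def tail_means(fold_changes, utrs, frac=0.05):
    """Mean AU content of the most down- and most up-regulated genes.

    Returns (mean_down, mean_up) for the bottom and top `frac` of genes
    when ranked by fold change.
    """
    if len(fold_changes) != len(utrs):
        raise ValueError("fold_changes and utrs must have the same length")

    # Rank genes from most repressed to most induced.
    ranked = sorted(zip(fold_changes, utrs), key=lambda pair: pair[0])
    k = max(1, round(len(ranked) * frac))  # genes per tail

    down = [au_content(u) for _, u in ranked[:k]]
    up = [au_content(u) for _, u in ranked[-k:]]
    return sum(down) / k, sum(up) / k
```

To attach a p value to the difference, the two lists of per-gene AU fractions could be passed to a rank-sum test such as `scipy.stats.ranksums`, which is one common implementation of the Wilcoxon rank-sum test; whether the authors used that variant is not stated here.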

