Effect of RNA quality on transcript intensity levels in microarray analysis of human post-mortem brain tissues.

Popova T, Mennerich D, Weith A, Quast K - BMC Genomics (2008)

Bottom Line: Systematic components affect the expressed transcripts by introducing irrelevant gene correlations and can strongly influence the results of the main experiment. A linear model correcting the effect of RNA quality on measured intensities was introduced. Basic conclusions for data analysis in expression profiling studies are as follows: 1) testing for RNA quality dependency should be included in the preprocessing of the data; 2) investigating inter-gene correlation without regard to RNA quality effects could be misleading; 3) data normalization procedures relying on housekeeping genes either do not influence the correlation structure (if 3'-end intensities are used) or increase it for negatively correlated transcripts (if 5'-end or median intensities are included in the normalization procedure); 4) sample sets should be matched with regard to RNA quality; 5) RMA preprocessing is more sensitive to the RNA quality effect than MAS 5.0.


Affiliation: Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, Biberach an der Riss, Germany. tatiana.popova@boehringer-ingelheim.com

ABSTRACT

Background: Large-scale gene expression analysis of post-mortem brain tissue offers unique opportunities for investigating genetic mechanisms of psychiatric and neurodegenerative disorders. On the other hand, the microarray data analysis associated with these studies is a challenging task. In this publication we address the issue of low RNA quality data and corresponding data analysis strategies.

Results: A detailed analysis of the effects of post-chip RNA quality on the measured abundance of transcripts is presented. Overall, Affymetrix GeneChip data (HG-U133_AB and HG-U133_Plus_2.0) derived from ten different brain regions were investigated. Post-chip RNA quality, assessed by the 5'/3' ratio of housekeeping genes, was found to introduce well-pronounced systematic noise into the measured transcript expression levels. According to this study, RNA quality effects have: 1) a "random" component, which is introduced by the technology, and 2) a systematic component, which depends on the features of the transcripts and probes. Random components mainly account for numerous negative correlations of low-abundance transcripts. These negative correlations are not reproducible and are mainly introduced by an increased relative level of noise. Three major contributors to the systematic noise component were identified: the first is the probe set distribution, the second is the length of mRNA species, and the third is the stability of mRNA species. Positive correlations reflect the 5'-end to 3'-end direction of mRNA degradation, whereas negative correlations result from the compensatory increase in stable and 3'-end probed transcripts. Systematic components affect the expressed transcripts by introducing irrelevant gene correlations and can strongly influence the results of the main experiment. A linear model correcting the effect of RNA quality on measured intensities was introduced. In addition, the contribution of a number of pre-mortem and post-mortem attributes to the overall detected RNA quality effect was investigated. Brain pH, duration of agonal stage, post-mortem interval before sampling and donor's age at death, within the limits considered, were found to make no significant contribution.

Conclusion: Basic conclusions for data analysis in expression profiling studies are as follows: 1) testing for RNA quality dependency should be included in the preprocessing of the data; 2) investigating inter-gene correlation without regard to RNA quality effects could be misleading; 3) data normalization procedures relying on housekeeping genes either do not influence the correlation structure (if 3'-end intensities are used) or increase it for negatively correlated transcripts (if 5'-end or median intensities are included in the normalization procedure); 4) sample sets should be matched with regard to RNA quality; 5) RMA preprocessing is more sensitive to the RNA quality effect than MAS 5.0.
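
The abstract states that a linear model was used to correct measured intensities for RNA quality, but it does not give the model's form. The following is therefore only a minimal sketch under stated assumptions, not the authors' implementation: it assumes one transcript's log-scale intensities across samples, a per-sample quality indicator such as the 5'/3' ratio of a housekeeping gene, and a simple least-squares line whose slope is removed from the data. The names `fit_line` and `correct_for_quality` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx  # assumes the quality values are not all identical
    return mean_y - slope * mean_x, slope


def correct_for_quality(log_intensity, quality_ratio):
    """Remove the fitted linear trend in quality_ratio from one transcript.

    Assumption (not from the source): the correction subtracts
    slope * (quality - mean quality), so the transcript's mean level
    across samples is preserved.
    """
    _, slope = fit_line(quality_ratio, log_intensity)
    mean_q = sum(quality_ratio) / len(quality_ratio)
    return [v - slope * (q - mean_q)
            for v, q in zip(log_intensity, quality_ratio)]


# Toy data: intensity falls by exactly one unit per unit of the quality ratio.
quality_ratio = [1.0, 2.0, 3.0, 4.0]
log_intensity = [10.0, 9.0, 8.0, 7.0]
corrected = correct_for_quality(log_intensity, quality_ratio)
# In this toy case every corrected value equals the original mean, 8.5.
```

Refitting the line to the corrected values gives a slope of zero, which is one simple way to check that the quality dependency has been removed for that transcript.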

Figure 3: Number of significant correlations to beta actin ratio within chip set. Number of transcripts that show significant (p < 0.05) correlation to the beta actin ratio in all considered sample sets for two chip platforms and two normalizations: a) HG-U133_Plus2.0 chip, b) HG-U133_AB chip. The horizontal line indicates the number of significant hits expected by chance: 5% of 54613 ≅ 2731 for the HG-U133_Plus2.0 chip and 5% of 44792 ≅ 2240 for the HG-U133_AB chip.

Mentions: Figure 3 shows the number of transcripts exhibiting expression profiles significantly (p ≤ 0.05) correlated to the beta actin ratio in all sample sets under consideration. In all considered cases the number of RNA quality dependent transcripts is higher than expected by chance (the horizontal line in Figure 3 corresponds to 5% of all transcripts represented on the chip). The fact that up to 30% of all transcripts display expression profiles correlated to the beta actin ratio implies that RNA quality can act as a major source of the previously reported correlation structure in microarray data [28]. RMA-normalized data seem to be more sensitive to the RNA quality effect than MAS 5.0 data (Figure 3).
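
The comparison behind Figure 3 (counting transcripts whose profile correlates with the beta actin ratio and comparing that count with 5% of the probe sets on the chip) can be sketched as follows. This is an illustrative sketch only: the source does not say which correlation test was used, so Pearson correlation with a Fisher z-transform normal approximation for the two-sided p-value is an assumption, and all function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # assumes neither input is constant


def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z-transform (normal approximation).

    Assumption: chosen here for simplicity; requires n > 3 samples.
    """
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def count_quality_dependent(profiles, quality_ratio, alpha=0.05):
    """Count transcripts significantly correlated with the quality ratio."""
    n = len(quality_ratio)
    return sum(
        1 for profile in profiles
        if approx_p_value(pearson_r(profile, quality_ratio), n) <= alpha
    )


def expected_by_chance(n_probe_sets, alpha=0.05):
    """Number of significant hits expected by chance at level alpha."""
    return round(n_probe_sets * alpha)


# Chance levels quoted in the Figure 3 caption.
plus2_chance = expected_by_chance(54613)  # 2731
ab_chance = expected_by_chance(44792)     # 2240

# Toy example: one quality-dependent profile, one uncorrelated profile.
quality_ratio = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
profiles = [
    [2.0 * q + 1.0 for q in quality_ratio],
    [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0],
]
n_dependent = count_quality_dependent(profiles, quality_ratio)
```

On real data, `profiles` would hold one expression profile per probe set across the samples of a set, and the resulting count would be compared with `expected_by_chance` for the chip in question.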

