The Influence of the Global Gene Expression Shift on Downstream Analyses.

Xu Q, Zhang X - PLoS ONE (2016)

Bottom Line: Most existing gene expression data were generated without considering this possibility and are therefore at risk of having produced unreliable results if such a global shift effect exists in the data. To evaluate this risk, we conducted a systematic study of the possible influence of the global gene expression shift effect on differential expression analysis and on molecular classification analysis. Classification accuracy is not sensitive to the shift and can actually benefit from it, but the genes selected for classification can be greatly affected.

Affiliation: MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing, China.

ABSTRACT
Most studies based on gene expression analysis rest on the assumption that the total abundance of RNA is roughly the same in different cells. But experiments have shown that changes in the expression of some master regulators, such as c-MYC, can cause a global shift in the expression of almost all genes in some cell types, such as cancer cells. Such a shift violates this assumption and can lead to wrong or biased conclusions in standard data analysis practices, such as detection of differentially expressed (DE) genes and molecular classification of tumors based on gene expression. Most existing gene expression data were generated without considering this possibility and are therefore at risk of having produced unreliable results if such a global shift effect exists in the data. To evaluate this risk, we conducted a systematic study of the possible influence of the global gene expression shift effect on differential expression analysis and on molecular classification analysis. We collected data with a known global shift effect, generated data simulating different situations of the effect from a wide collection of real gene expression data, and conducted comparative studies of representative existing methods. We observed that some DE analysis methods are more tolerant of the global shift, while others are very sensitive to it. Classification accuracy is not sensitive to the shift and can actually benefit from it, but the genes selected for classification can be greatly affected.
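
To make the violated assumption concrete, here is a minimal sketch in Python using made-up numbers, not data from the paper. It assumes three hypothetical genes whose absolute per-cell abundance all increases, and compares log2 fold changes computed after total-abundance normalization with fold changes computed on the absolute scale (a stand-in for what spike-in controls are meant to recover). The function name `log2_fold_changes` and all values are illustrative assumptions.

```python
import math


def log2_fold_changes(control, treated, scale_control=1.0, scale_treated=1.0):
    """Per-gene log2 fold change after dividing each sample by its scale factor."""
    return {
        gene: math.log2(
            (treated[gene] / scale_treated) / (control[gene] / scale_control)
        )
        for gene in control
    }


# Hypothetical absolute transcript abundances per cell (made-up numbers).
# Every gene is higher in the treated cell, i.e. a global upward shift.
control = {"geneA": 100.0, "geneB": 100.0, "geneC": 100.0}
treated = {"geneA": 400.0, "geneB": 150.0, "geneC": 200.0}

# Total-abundance normalization: each sample is scaled to the same total,
# which silently assumes both cells contain the same amount of RNA.
lfc_total = log2_fold_changes(
    control,
    treated,
    scale_control=sum(control.values()),
    scale_treated=sum(treated.values()),
)

# Stand-in for spike-in normalization: we assume the values are already on
# an absolute per-cell scale, so no rescaling is applied.
lfc_absolute = log2_fold_changes(control, treated)

for gene in control:
    print(gene, round(lfc_total[gene], 3), round(lfc_absolute[gene], 3))
```

In this toy setting, genes that increase less than the overall total appear down-regulated after total-abundance normalization even though their absolute abundance went up.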

pone.0153903.g002: Overlap proportions of differentially expressed genes detected by fold-change in Loven et al.’s data, with and without correction of the global shift effect. (A) Up-regulated DE genes. (B) Down-regulated DE genes. The x-axis is the number of top genes taken from the up-regulated or down-regulated DE gene list. The y-axis is the overlap proportion of those top genes.

Mentions: We compared the overlap proportions of the top 50, 100, …, 1000 genes in the up-regulated and down-regulated DE gene lists obtained from the data corrected with spike-in controls and from the uncorrected data. The results for fold-change are shown in Fig 2. The overlap proportion of the up-regulated genes is always high, whereas the overlap proportion of the down-regulated genes is high for the top few genes but decreases rapidly further down the list. This is consistent with the understanding that, for up-regulated genes, the global shift makes them more up-regulated and therefore does not change their order much. Many of the apparently down-regulated genes, however, are actually also up-regulated, and their apparent down-regulation is due to improper normalization. Therefore, correcting the global shift causes a big change in the list of down-regulated genes.
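
The comparison described above can be sketched as follows. This is an illustrative Python sketch, not the authors' code: the helper names `rank_by_fold_change` and `top_k_overlap` are hypothetical, the log fold changes are made-up values for six genes, and the cutoffs are shortened from 50, 100, …, 1000 to 1, 2, 3 so the toy lists are long enough.

```python
def rank_by_fold_change(log_fc, direction="up"):
    """Order genes by log fold change.

    'up' puts the largest values first; 'down' puts the smallest first.
    """
    return sorted(log_fc, key=log_fc.get, reverse=(direction == "up"))


def top_k_overlap(ranked_a, ranked_b, ks):
    """For each k, the fraction of the top-k genes shared by two ranked lists."""
    proportions = {}
    for k in ks:
        shared = set(ranked_a[:k]) & set(ranked_b[:k])
        proportions[k] = len(shared) / k
    return proportions


# Made-up log2 fold changes for the same genes, computed from spike-in
# corrected data and from uncorrected data (hypothetical values).
corrected = {"g1": 3.0, "g2": 2.5, "g3": 1.0, "g4": 0.4, "g5": -0.2, "g6": -1.5}
uncorrected = {"g1": 1.8, "g2": 1.6, "g3": -0.6, "g4": -0.3, "g5": -1.0, "g6": -2.4}

ks = [1, 2, 3]
up_overlap = top_k_overlap(
    rank_by_fold_change(corrected, "up"),
    rank_by_fold_change(uncorrected, "up"),
    ks,
)
down_overlap = top_k_overlap(
    rank_by_fold_change(corrected, "down"),
    rank_by_fold_change(uncorrected, "down"),
    ks,
)
print("up:", up_overlap)
print("down:", down_overlap)
```

Plotting the returned proportions against k for each direction would give curves of the kind described for Fig 2. The toy values here only exercise the functions and are not meant to reproduce the paper's result.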

