The Influence of the Global Gene Expression Shift on Downstream Analyses.

Xu Q, Zhang X - PLoS ONE (2016)

Bottom Line: Most existing gene expression data were generated without considering this possibility, and are therefore at the risk of having produced unreliable results if such global shift effect exists in the data. To evaluate this risk, we conducted a systematic study on the possible influence of the global gene expression shift effect on differential expression analysis and on molecular classification analysis. Classification accuracy is not sensitive to the shift and actually can benefit from it, but genes selected for the classification can be greatly affected.


Affiliation: MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing, China.

ABSTRACT
Most studies based on gene expression analysis assume that the total abundance of RNA is roughly the same in different cells. However, experiments have shown that changes in the expression of some master regulators, such as c-MYC, can cause a global shift in the expression of almost all genes in some cell types, such as cancer cells. Such a shift violates this assumption and can lead to wrong or biased conclusions in standard data analysis practices, such as the detection of differentially expressed (DE) genes and the molecular classification of tumors based on gene expression. Most existing gene expression data were generated without considering this possibility, and are therefore at risk of having produced unreliable results if a global shift effect exists in the data. To evaluate this risk, we conducted a systematic study of the possible influence of the global gene expression shift effect on differential expression analysis and on molecular classification analysis. We collected data with a known global shift effect, generated data simulating different situations of the effect from a wide collection of real gene expression data, and conducted comparative studies on representative existing methods. We observed that some DE analysis methods are more tolerant to the global shift, while others are very sensitive to it. Classification accuracy is not sensitive to the shift and can actually benefit from it, but the genes selected for classification can be greatly affected.


Fig 4 (pone.0153903.g004). Overlap proportions of differentially expressed genes detected by fold-change, SAM and t-test from the data with simulated global shift effects, averaged over the 20 datasets. (A) All DE genes, ranked by overall differential expression; (B) Up-regulated DE genes; (C) Down-regulated DE genes. The settings are the same as in Fig 2.


Mentions: For each of the 20 simulated datasets, we applied the same analysis to the sister datasets and compared the overlap between the top 50, 100, …, 500 genes in the complete lists of differentially expressed genes, and also in the separate lists of up-regulated and down-regulated DE genes. All the results are provided in the S1 File. We averaged the overlap proportions over the 20 datasets for the fold-change, t-test and SAM methods. Fig 4 shows the results.
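
The overlap comparison described above can be illustrated with a minimal sketch. It assumes that the overlap proportion at a cutoff k is the number of genes shared by the two top-k lists divided by k; the text does not spell out the exact definition, so this is an assumption. The function names, gene identifiers and cutoffs below are hypothetical and chosen only for illustration.

```python
from statistics import mean


def top_k_overlap(ranked_a, ranked_b, k):
    """Share of genes common to the top-k of two ranked gene lists.

    Assumption: overlap proportion = |top_k(A) & top_k(B)| / k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(ranked_a[:k]) & set(ranked_b[:k])) / k


def mean_overlap_curve(dataset_pairs, cutoffs):
    """Average the top-k overlap over several (list_a, list_b) pairs, per cutoff."""
    return {
        k: mean(top_k_overlap(a, b, k) for a, b in dataset_pairs)
        for k in cutoffs
    }


# Toy ranked lists (hypothetical gene IDs), standing in for DE rankings
# obtained from a dataset and from its counterpart with a simulated shift.
original = ["g1", "g2", "g3", "g4", "g5", "g6"]
shifted = ["g2", "g1", "g5", "g7", "g3", "g8"]

pairs = [(original, shifted), (original, original)]
curve = mean_overlap_curve(pairs, cutoffs=[2, 4])
print(curve)
```

For the cutoffs in the text, `cutoffs=range(50, 501, 50)` would give the top 50, 100, …, 500 lists. The same function can be applied separately to up-regulated and down-regulated rankings.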

