Identifying and removing the cell-cycle effect from single-cell RNA-Sequencing data


ABSTRACT

Single-cell RNA-Sequencing (scRNA-Seq) is a revolutionary technique for discovering and describing cell types in heterogeneous tissues, yet its measurement of expression often suffers from large systematic bias. A major source of this bias is the cell cycle, which introduces large within-cell-type heterogeneity that can obscure the differences in expression between cell types. The current method for removing the cell-cycle effect is unable to effectively identify this effect and has a high risk of removing other biological components of interest, compromising downstream analysis. We present ccRemover, a new method that reliably identifies the cell-cycle effect and removes it. ccRemover preserves other biological signals of interest in the data and thus can serve as an important pre-processing step for many scRNA-Seq data analyses. The effectiveness of ccRemover is demonstrated using simulation data and three real scRNA-Seq datasets, where it boosts the performance of existing clustering algorithms in distinguishing between cell types.


Figure 5: Heat maps of gene expression in the lung adenocarcinoma dataset. The cell-cycle genes were chosen from the top ranked cell-cycle genes on Cyclebase and are ordered by their cell-cycle peak time. The cells were ordered based on a hierarchical clustering of the original data and the order is the same for each heat map. (a) Original data. The blocks of similar expression indicate cells at a similar cell-cycle time point, indicating the presence of cell-cycle effects. (b) scLVM corrected data. The blocks of similar expression have been reduced but are still apparent. The color of the heat map is more balanced as the range of the expression levels is reduced after they have been corrected. (c) ccRemover corrected data. The obvious blocks have been removed from the corrected dataset.
© Copyright Policy - open-access

Further analysis was carried out to determine if this is the case. Figure 5 displays heat maps of the expression of the top ranked cell-cycle genes from Cyclebase [43]. The cell-cycle genes displayed in the heat map are ordered by the time point of the cell cycle at which their expression peaks. If the cell-cycle effect exists, there should be blocks of similar expression levels, and these blocks should not span every row from the first to the last, because the genes do not reach their peak expression at the same time point of the cell cycle. In the original data (Fig. 5a), such blocks are clearly visible, and the most prominent one is marked with a blue box. For the scLVM corrected data the blocks are less apparent but still present (Fig. 5b), indicating that the cell-cycle effect has been only partially removed. For the ccRemover corrected data (Fig. 5c), no easily visible blocks remain, indicating that ccRemover has effectively removed the cell-cycle effect from this dataset. For both the scLVM and ccRemover corrected data the range of expression for the cell-cycle genes is reduced, so the heat map colors show less variation.
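The row and column ordering behind such heat maps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the gene names, peak times, and expression values are made up, and the cell ordering uses a simple stand-in (grouping cells by their most highly expressed gene) instead of the hierarchical clustering used for the figure. The point it shows is that the cell order is computed once on the original data and then reused for every matrix being compared.

```python
from typing import Dict, List, Sequence

Expr = Dict[str, Sequence[float]]  # gene name -> expression value per cell


def order_genes_by_peak(peak_time: Dict[str, float]) -> List[str]:
    """Sort genes by the cell-cycle time point at which their expression peaks."""
    return sorted(peak_time, key=lambda g: peak_time[g])


def cell_order_by_dominant_gene(expr: Expr, gene_order: Sequence[str]) -> List[int]:
    """Stand-in for hierarchical clustering (an assumption of this sketch):
    group cells by the row position of their most highly expressed gene."""
    n_cells = len(expr[gene_order[0]])

    def dominant(cell: int) -> int:
        return max(range(len(gene_order)), key=lambda i: expr[gene_order[i]][cell])

    return sorted(range(n_cells), key=lambda c: (dominant(c), c))


def reorder(expr: Expr, gene_order: Sequence[str], cell_order: Sequence[int]) -> List[List[float]]:
    """Return the heat-map matrix: rows in gene order, columns in cell order."""
    return [[expr[g][c] for c in cell_order] for g in gene_order]


# Hypothetical peak times and expression values for three genes in four cells.
peak_time = {"geneA": 70.0, "geneB": 10.0, "geneC": 40.0}
original = {
    "geneA": [0.1, 2.0, 0.2, 1.9],
    "geneB": [2.1, 0.0, 1.8, 0.1],
    "geneC": [0.3, 0.2, 0.1, 0.4],
}
corrected = {
    "geneA": [0.5, 0.6, 0.5, 0.6],
    "geneB": [0.6, 0.5, 0.6, 0.5],
    "geneC": [0.3, 0.2, 0.1, 0.4],
}

gene_order = order_genes_by_peak(peak_time)
# The cell order comes from the original data only and is reused below.
cell_order = cell_order_by_dominant_gene(original, gene_order)

heat_original = reorder(original, gene_order, cell_order)
heat_corrected = reorder(corrected, gene_order, cell_order)

for name, heat in (("original", heat_original), ("corrected", heat_corrected)):
    print(name)
    for gene, row in zip(gene_order, heat):
        print(f"  {gene}: {row}")
```

In the toy original matrix, the early-peaking and late-peaking genes form complementary blocks of high expression across the reordered cells, while the flattened toy "corrected" values show no such blocks under the same ordering.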

