Gene expression signature of cigarette smoking and its role in lung adenocarcinoma development and survival.

Landi MT, Dracheva T, Rotunno M, Figueroa JD, Liu H, Dasgupta A, Mann FE, Fukuoka J, Hames M, Bergen AW, Murphy SE, Yang P, Pesatori AC, Consonni D, Bertazzi PA, Wacholder S, Shih JH, Caporaso NE, Jen J - PLoS ONE (2008)

Bottom Line: ANOVA analysis adjusted for potential confounders, multiple testing procedure, Gene Set Enrichment Analysis, and GO-functional classification were conducted for gene selection. NEK2 (p<0.001) and TTK (p = 0.002) expression in the noninvolved lung tissue was also associated with a 3-fold increased risk of mortality from lung adenocarcinoma in smokers. These genes are candidate targets for chemoprevention and treatment of lung cancer in smokers.

Affiliation: Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Bethesda, Maryland, USA.

ABSTRACT

Background: Tobacco smoking is responsible for over 90% of lung cancer cases, and yet the precise molecular alterations induced by smoking in lung that develop into cancer and impact survival have remained obscure.

Methodology/principal findings: We performed gene expression analysis using HG-U133A Affymetrix chips on 135 fresh frozen tissue samples of adenocarcinoma and paired noninvolved lung tissue from current, former and never smokers, with biochemically validated smoking information. ANOVA analysis adjusted for potential confounders, multiple testing procedure, Gene Set Enrichment Analysis, and GO-functional classification were conducted for gene selection. Results were confirmed in independent adenocarcinoma and non-tumor tissues from two studies. We identified a gene expression signature characteristic of smoking that includes cell cycle genes, particularly those involved in the mitotic spindle formation (e.g., NEK2, TTK, PRC1). Expression of these genes strongly differentiated both smokers from non-smokers in lung tumors and early stage tumor tissue from non-tumor tissue (p<0.001 and fold-change >1.5, for each comparison), consistent with an important role for this pathway in lung carcinogenesis induced by smoking. These changes persisted many years after smoking cessation. NEK2 (p<0.001) and TTK (p = 0.002) expression in the noninvolved lung tissue was also associated with a 3-fold increased risk of mortality from lung adenocarcinoma in smokers.

Conclusions/significance: Our work provides insight into the smoking-related mechanisms of lung neoplasia, and shows that the very mitotic genes known to be involved in cancer development are induced by smoking and affect survival. These genes are candidate targets for chemoprevention and treatment of lung cancer in smokers.


Figure 1 (pone-0001651-g001): Comparison of gene expression differentiating current from never smokers (C/N) and gene expression differentiating former from never smokers (F/N) in early stage tumor tissue (T) using Gene Set Enrichment Analysis (GSEA).

Left: The Running Enrichment Score (y axis) is calculated by walking down the entire list of probes from the Affymetrix HG-U133A chip (numbered from 1 to 22,283 on the x axis), ordered by the ANOVA coefficients divided by their standard errors from the Former/Never (F/N) smoking comparison. This running-sum statistic increases when a given probe is in the Current/Never (C/N) Gene Set of interest and decreases when the probe is not in the C/N Gene Set, with the magnitude of the increment depending on the strength of the correlation between the probe and the F/N comparison. The Enrichment Score (ES) is the maximum deviation of the Running Enrichment Score from zero encountered in the random walk and reflects the degree to which the Gene Set is overrepresented at the extremes (top or bottom) of the entire ranked probe list. We report results for two different C/N Gene Sets: on the top, the 64 up-regulated probes, with ES = 0.87, and, on the bottom, the 98 down-regulated probes, with ES = −0.90. A leading edge subset of the Gene Set is defined as those probes in the Gene Set that appear in the ranked probe list at, or before, the point where the running sum reaches its maximum deviation from zero. The leading edge for the Gene Set of C/N up-regulated probes contains 58 of the 64 probes, and the leading edge for the Gene Set of down-regulated probes contains 90 of the 98 probes. This confirms that among the 64 up-regulated probes from the C/N comparison, 58 are also found in the F/N comparison; and among the 98 down-regulated probes from the C/N comparison, 90 are also found in the F/N comparison.

Right: Distributions of ES values created using a permutation procedure for (top) the Gene Set of up-regulated probes in C/N and (bottom) the Gene Set of down-regulated probes in C/N. These distributions are used to calculate the statistical significance (nominal p-value) of the observed ES values (p-value<0.002 in both cases).
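
The running-sum statistic, leading edge, and permutation p-value described in the caption can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the weighted running sum commonly used in GSEA (hit increments proportional to the absolute ranking statistic, equal miss decrements), takes the leading edge for a negative ES as the set members at or after the trough, and builds the null distribution by shuffling set membership, which may differ from the permutation scheme used in the study. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np

def running_enrichment(scores, in_set, weight=1.0):
    """Running enrichment score for probes already sorted by `scores` (descending).

    Assumes at least one set member with a non-zero score and at least one non-member.
    Returns the running sum, the ES (maximum deviation from zero), and its index.
    """
    scores = np.asarray(scores, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    # Hits step up in proportion to |score|**weight; misses step down equally.
    hit_w = np.where(in_set, np.abs(scores) ** weight, 0.0)
    step = hit_w / hit_w.sum() - (~in_set) / (~in_set).sum()
    running = np.cumsum(step)
    peak = int(np.argmax(np.abs(running)))
    return running, float(running[peak]), peak

def leading_edge(in_set, es, peak):
    """Indices of set members at/before the peak (ES >= 0) or at/after it (ES < 0)."""
    idx = np.flatnonzero(np.asarray(in_set, dtype=bool))
    return idx[idx <= peak] if es >= 0 else idx[idx >= peak]

def permutation_pvalue(scores, in_set, n_perm=1000, seed=0):
    """Nominal p-value from a null built by shuffling set membership (an assumption)."""
    rng = np.random.default_rng(seed)
    in_set = np.asarray(in_set, dtype=bool)
    es_obs = running_enrichment(scores, in_set)[1]
    null = np.array([running_enrichment(scores, rng.permutation(in_set))[1]
                     for _ in range(n_perm)])
    # Compare only against null scores with the same sign as the observed ES.
    same_sign = null[np.sign(null) == np.sign(es_obs)]
    p = (np.sum(np.abs(same_sign) >= abs(es_obs)) + 1) / (same_sign.size + 1)
    return es_obs, float(p)

# Toy data (hypothetical): 2,000 probes ranked by a t-like statistic,
# with a 40-probe set concentrated among the 100 top-ranked probes.
rng = np.random.default_rng(1)
n_probes = 2000
scores = np.sort(rng.normal(size=n_probes))[::-1]
in_set = np.zeros(n_probes, dtype=bool)
in_set[rng.choice(100, size=40, replace=False)] = True

running, es, peak = running_enrichment(scores, in_set)
edge = leading_edge(in_set, es, peak)
es_obs, p = permutation_pvalue(scores, in_set, n_perm=200)
print(f"ES = {es:.2f}, leading edge = {edge.size} of {in_set.sum()} probes, p = {p:.3f}")
```

In the setting of the figure, `scores` would correspond to the F/N ANOVA coefficients divided by their standard errors for all 22,283 probes, and `in_set` would mark the 64 up-regulated (or 98 down-regulated) C/N probes.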

Mentions: To verify whether the C/N smoking signature in the tumor was also present in former smokers, we compared the C/N and F/N signatures in T and found 26 probes (22 down- and 4 up-regulated, representing 21 genes) that differentiated both C/N and F/N using stringent selection criteria (Appendix S2E). Some of these genes, e.g., STOM, SSX2IP, TRPC6, APLP2 (2 probes), and DHRS7, exhibited a persistent alteration even in subjects (n = 6) who quit smoking more than 20 years before the study. The GSEA analysis showed that among the 64 up- and 98 down-regulated probes found in the C/N comparison in T, 58 and 90 probes, representing 50 up- and 73 down-regulated genes, were also up- and down-regulated, respectively, in the F/N smoking comparison (p<0.001, Fig. 1, and Appendix S2F, S2G). All cell cycle genes that differentiated C/N were also altered in F/N, although less prominently (Table 2), indicating that alterations of these genes persist following smoking cessation. Importantly, the mitosis/cell cycle genes identified in C/N and F/N also differentiated the early stage tumor from the non-tumor tissue samples (T/NT, paired analysis) (Table 2), while pack-years of cigarette smoking, a composite index of intensity and duration that does not consider the time when smoking occurred, were not associated with gene expression in either T or NT.

