Gene-expression patterns in peripheral blood classify familial breast cancer susceptibility.

Piccolo SR, Andrulis IL, Cohen AL, Conner T, Moos PJ, Spira AE, Buys SS, Johnson WE, Bild AH - BMC Med Genomics (2015)

Bottom Line: We used the support vector machines algorithm to differentiate between patients who developed FBC and those who did not. Our Ontario cohort resulted in similarly high levels of accuracy (AUC = 0.73 [0.59-0.86]), thus providing external validation of our findings. Individuals deemed to have "high" risk by our model would have an estimated 2.4 times greater odds of developing familial breast cancer than individuals deemed to have "low" risk.


Affiliation: Department of Pharmacology and Toxicology, University of Utah, Salt Lake City, UT, USA. stephen_piccolo@byu.edu.

ABSTRACT

Background: Women with a family history of breast cancer face considerable uncertainty about whether to pursue standard screening, intensive screening, or prophylactic surgery. Accurate and individualized risk-estimation approaches may help these women make more informed decisions. Although highly penetrant genetic variants have been associated with familial breast cancer (FBC) risk, many individuals do not carry these variants, and many carriers never develop breast cancer. Common risk variants have a relatively modest effect on risk and show limited potential for predicting FBC development. As an alternative, we hypothesized that additional genomic data types, such as gene-expression levels, which can reflect genetic and epigenetic variation, could contribute to classifying a person's risk status. Specifically, we aimed to identify common patterns in gene-expression levels across individuals who develop FBC.

Methods: We profiled peripheral blood mononuclear cells from women with a family history of breast cancer (with or without a germline BRCA1/2 variant) and from controls. We used the support vector machines algorithm to differentiate between patients who developed FBC and those who did not. Our study used two independent datasets, a training set of 124 women from Utah (USA) and an external validation (test) set from Ontario (Canada) of 73 women (197 total). We controlled for expression variation associated with clinical, demographic, and treatment variables as well as lymphocyte markers.

Results: Our multigene biomarker provided accurate, individual-level estimates of FBC occurrence for the Utah cohort (AUC = 0.76 [0.67-0.84]). Even at their lower confidence bounds, these accuracy estimates meet or exceed estimates from alternative approaches. Our Ontario cohort resulted in similarly high levels of accuracy (AUC = 0.73 [0.59-0.86]), thus providing external validation of our findings. Individuals deemed to have "high" risk by our model would have an estimated 2.4 times greater odds of developing familial breast cancer than individuals deemed to have "low" risk.
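To make the odds comparison concrete, the following minimal Python sketch computes an odds ratio from a 2x2 table of predicted risk group against observed outcome. The counts are hypothetical and chosen only for illustration; they are not the study's data, and the function name `odds_ratio` is our own.

```python
def odds_ratio(high_cases, high_noncases, low_cases, low_noncases):
    """Odds of the outcome in the "high" group divided by odds in the "low" group."""
    odds_high = high_cases / high_noncases  # cases per non-case among "high" risk
    odds_low = low_cases / low_noncases     # cases per non-case among "low" risk
    return odds_high / odds_low


# Hypothetical counts (NOT taken from the study):
# "high" risk: 24 developed FBC, 20 did not; "low" risk: 15 developed FBC, 30 did not.
example = odds_ratio(24, 20, 15, 30)
print(round(example, 2))
```

With these made-up counts the odds are 1.2 in the "high" group and 0.5 in the "low" group, so the ratio is about 2.4. Note that an odds ratio is not the same as a ratio of probabilities.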

Conclusions: Together, these findings suggest that gene-expression levels in peripheral blood cells reflect genomic variation associated with breast cancer risk and that such data have potential to be used as a non-invasive biomarker for familial breast cancer risk.



Fig. 1: Predictions of familial breast cancer status in two independent cohorts. (a) In a cross-validated design, we predicted familial breast cancer status for 124 women from Utah. This cohort included women who did or did not have a family history (FH) of breast cancer, who did or did not carry a BRCA1 or BRCA2 mutation (BRCAX if not), and who had or had not developed breast cancer. The “Genomic model score” values represent probabilistic predictions made by the support vector machines algorithm. Higher values indicate a higher probability that a given individual had developed familial breast cancer. These scores were much higher for individuals who had a family history of breast cancer and developed a breast tumor, irrespective of BRCA1/BRCA2 mutation status. (b) In a training/testing design, we predicted whether individuals in the independent Ontario cohort had developed familial breast cancer. The support vector machines algorithm was trained on the full Utah data set. Again, the scores were considerably higher for women with a family history of breast cancer who had developed a breast tumor.

Mentions: We filtered the genome-wide PBMC gene-expression data by identifying genes whose expression best differentiated individuals who developed FBC from controls (see Methods). Controls were of three types: individuals with a family history of breast cancer who themselves did not develop breast cancer by age 55 or greater; individuals with no family history of breast cancer who also did not develop breast cancer; and individuals with no family history of breast cancer who did develop (sporadic) breast cancer. We then used expression values for those genes to predict FBC status for each individual using the SVM algorithm [37]. Two cohorts of samples were used for this study: a cohort from Utah and an independent cohort from Ontario; both included high-risk unaffected and affected women (see Methods for cohort details). Initially, we evaluated this approach in the Utah cohort via ten-fold cross-validation. Our gene expression-based estimates of FBC development were consistently higher for women from FBC families who had developed cancer than for any subset of controls (Fig. 1a), attaining an AUC value of 0.76 (95% CI = 0.67-0.85). Similar levels of accuracy were attained for women who carried a BRCA1/2 mutation as for women with a family history of breast cancer but with no known BRCA1/2 mutation (termed BRCAX) (Fig. 1a; Additional file 1: Table S1). Even at the lower confidence bounds, these AUC values are competitive with results observed in previous studies that used alternative approaches [7–9]. To further evaluate this result, we randomly permuted the class labels and observed that the biomarker’s accuracy was highly significant (p = 0.001). We also repeated cross-validation 1,000 times on the Utah data and observed that, on average, the best prediction results were attained using 250 genes; however, accuracy was consistently high, independent of gene number (Fig. 2).
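The AUC and the label-permutation check described above can be illustrated with a short Python sketch using only the standard library. It is a simplified illustration, not the authors' pipeline: the excerpt does not spell out the exact permutation procedure, so this version shuffles the labels against a fixed set of scores, whereas a full analysis might repeat gene selection and model training for each permutation. The scores and labels are toy values, and the function names are our own.

```python
import random


def auc(scores, labels):
    """Fraction of (case, control) pairs in which the case has the higher score.

    Ties count as half a win. Labels are 1 for cases and 0 for controls.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def permutation_p_value(scores, labels, n_permutations=1000, seed=0):
    """Share of label shufflings whose AUC is at least the observed AUC.

    The +1 in numerator and denominator counts the observed labelling itself,
    so the estimate can never be exactly zero.
    """
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    at_least_as_high = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if auc(scores, shuffled) >= observed:
            at_least_as_high += 1
    return (at_least_as_high + 1) / (n_permutations + 1)


# Toy model scores and true labels (illustrative only, not study data).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]

print("AUC:", auc(toy_scores, toy_labels))
print("permutation p-value:", permutation_p_value(toy_scores, toy_labels))
```

With 1,000 permutations and the +1 correction, the smallest attainable p-value is 1/1001, which is roughly 0.001.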

