Identification of Circulating Tumor DNA for the Early Detection of Small-cell Lung Cancer


ABSTRACT

Circulating tumor DNA (ctDNA) is emerging as a key potential biomarker for post-diagnosis surveillance, but it may also play a crucial role in the detection of pre-clinical cancer. Small-cell lung cancer (SCLC) is an excellent candidate for early detection, given that there are no successful therapeutic options for late-stage disease and that it displays almost universal inactivation of TP53. We assessed the presence of TP53 mutations in the cell-free DNA (cfDNA) extracted from the plasma of 51 SCLC cases and 123 non-cancer controls. We identified mutations using a pipeline specifically designed to accurately detect variants at very low allelic fractions. We detected TP53 mutations in the cfDNA of 49% of SCLC patients and 11.4% of non-cancer controls. When stratifying the 51 initial SCLC cases by stage, TP53 mutations were detected in the cfDNA of 35.7% of early-stage and 54.1% of late-stage SCLC patients. The result in the controls was replicated in an independent series of 102 non-cancer controls, 10.8% of whom carried TP53 mutations. The detection of TP53 mutations in 11% of the 225 non-cancer controls suggests that somatic mutations in cfDNA are common among individuals without any cancer diagnosis, which poses serious challenges for the development of ctDNA screening tests.


Characteristics of TP53 mutations in cases and controls. (a) Two examples of variants called using Needlestack's regression model of sequencing error. Each dot represents a sequenced library (two dots per sample) colored according to its phred-scaled q-value. The black regression line shows the estimated sequencing-error rate along with the 99% confidence interval (black dotted lines) containing samples. Colored-dotted lines correspond to the limits of regions defined for different significance q-value thresholds. Both technical duplicates appear as outliers from the regression (in red), and are therefore classified as carrying the given mutation; (b) Percentage of TP53 mutated samples in the cfDNA of Russian cases and controls, and replication controls; (c) Distribution of TP53 mutations found in SCLC tumors (George et al., 2015) and in our series of cases and controls across the different p53 protein domains; (d) Type of mutations and functional impact of missense ones based on the IARC TP53 database: F (functional), PF (partially functional), NF (non-functional); (e) Percentage of allelic fractions of the TP53 mutations detected in the cfDNA of Russian cases and controls, and replication controls. The whiskers represent the minimum and maximum values.
© Copyright Policy - CC BY-NC-ND

Mentions: For the calling of variants we used Needlestack, a recently developed ultra-sensitive variant caller, which estimates the distribution of sequencing errors across multiple samples to reliably identify variants present in very low proportion (https://github.com/IARCbioinfo/needlestack) (unpublished data; Delhomme et al.). Unlike most existing algorithms, Needlestack can deal with both single nucleotide variants (SNVs) and short indels. At each position and for each candidate variant, sequencing errors are modeled using a robust negative binomial regression (Aeberhard et al., 2014), with a linear link and a zero intercept. True variants are outliers from this error model (Fig. 1a). The robust estimator of the over-dispersion parameter avoids bias due to these outliers (Aeberhard et al., 2014). For each sample a p-value against the hypothesis of being a sequencing error is calculated, and further transformed into a q-value using the Benjamini and Hochberg false-discovery rate control method (Benjamini and Hochberg, 1995). Q-values are reported as a Phred-scale quality score, $Q = -10 \log_{10}(q\text{-value})$, and we used a threshold of $Q > 50$ to call variants. For each variant, we also calculated the relative variant strand bias defined by:
$$\mathrm{RVSB} = \frac{\max(AO_p \, DP_m,\; AO_m \, DP_p)}{AO_p \, DP_m + AO_m \, DP_p}$$
where $DP$ and $AO$ denote respectively the total number of reads and the number of reads matching the candidate variant, with the subscripts $p$ and $m$ referring to the forward and reverse strands respectively. In the complete absence of strand bias, $AO_p / AO_m = DP_p / DP_m$ and $\mathrm{RVSB} = 0.5$, whereas for a completely biased variant, $\mathrm{RVSB} = 1$. We filtered out variants with $\mathrm{RVSB} > 0.85$.
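The post-regression filtering steps described above (Benjamini-Hochberg adjustment, Phred scaling, and the strand-bias filter) can be illustrated with a minimal Python sketch. This is not Needlestack's code: the function names (`benjamini_hochberg`, `phred_q`, `rvsb`, `passes_filters`) are hypothetical, the p-values are assumed to come from an error model that is not reproduced here, and returning `None` when there are no variant reads is an assumption of this sketch rather than something the text specifies.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values, in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q_values[i] = running_min
    return q_values


def phred_q(q_value):
    """Phred-scaled score Q = -10 * log10(q-value)."""
    if q_value <= 0.0:
        return math.inf  # a q-value of zero maps to an unbounded score
    return -10.0 * math.log10(q_value)


def rvsb(ao_p, ao_m, dp_p, dp_m):
    """Relative variant strand bias from forward (p) and reverse (m) counts.

    ao_* are reads matching the candidate variant, dp_* are total reads.
    """
    a = ao_p * dp_m
    b = ao_m * dp_p
    if a + b == 0:
        return None  # assumption: undefined when no variant reads exist
    return max(a, b) / (a + b)


def passes_filters(q_value, ao_p, ao_m, dp_p, dp_m,
                   min_q=50.0, max_rvsb=0.85):
    """Apply the two thresholds quoted in the text: Q > 50 and RVSB <= 0.85."""
    bias = rvsb(ao_p, ao_m, dp_p, dp_m)
    if bias is None:
        return False
    return phred_q(q_value) > min_q and bias <= max_rvsb
```

With balanced strands, for example `rvsb(10, 10, 1000, 1000)`, the two products are equal and the ratio is 0.5. When all variant reads fall on one strand, as in `rvsb(20, 0, 1000, 1000)`, the ratio is 1. Both match the limiting cases stated in the text.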

