Identification of genetic markers with synergistic survival effect in cancer.

Louhimo R, Laakso M, Heikkinen T, Laitinen S, Manninen P, Rogojin V, Miettinen M, Blomqvist C, Liu J, Nevanlinna H, Hautaniemi S - BMC Syst Biol (2013)

Bottom Line: The identification of synergetic functioning SNPs on genome-scale is a computationally daunting task and requires advanced algorithms. We introduce a novel algorithm, Geninter, to identify SNPs that have synergetic effect on survival of cancer patients. Our results show that Geninter outperforms the logrank test and is able to identify SNP-pairs with synergetic impact on survival.


ABSTRACT

Background: Cancers are complex diseases arising from accumulated genetic mutations that disrupt intracellular signaling networks. While several predisposing genetic mutations have been found, these individual mutations account only for a small fraction of cancer incidence and mortality. With large-scale measurement technologies, such as single nucleotide polymorphism (SNP) microarrays, it is now possible to identify combinatorial effects that have significant impact on cancer patient survival.

Results: The identification of synergetic functioning SNPs on genome-scale is a computationally daunting task and requires advanced algorithms. We introduce a novel algorithm, Geninter, to identify SNPs that have synergetic effect on survival of cancer patients. Using a large breast cancer cohort we generate a simulator that allows assessing reliability and accuracy of Geninter and logrank test, which is a standard statistical method to integrate genetic and survival data.

Conclusions: Our results show that Geninter outperforms the logrank test and is able to identify SNP-pairs with synergetic impact on survival.



Figure 4: Effect of population size on rank distribution. 780 marker combinations have been evaluated for each distribution. The black dashed curve is the hypothetical distribution. The boxes on the right of each set of curves indicate the ratio of affected and non-affected markers for each curve. If the ratio is 0.5, every marker combination has some induced survival effect. In the left panel population size is 10,000, in the middle panel 1,000, and in the right panel 100.

Mentions: The logrank test is a well-established statistical method for associating a SNP with survival. We tested both Geninter and the logrank test on simulated data in which the ground truth is known. Based on simulations of the effect of population size on rank distribution (Figure 4), we estimated the background rank distribution from a simulated cohort of 1,000 samples and used the estimated distribution to compute p-values for the ranks. We applied the false discovery rate (FDR) procedure for the multiple hypothesis correction of the p-values [21]. We verified that the simulated distribution is similar to one calculated from a larger run with real data (data not shown). We further varied the size of our marker set between 40 and 140 markers. The marker set in the simulation was capped at 140 markers because the resulting roughly 10,000 pairwise combinations can still be analyzed without a high-performance cluster. Our simulator allows controlling the true positives, i.e., the marker pairs whose survival times were drawn from the logarithmic distribution.
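The workflow described above (count the marker pairs, turn each pair's score into a p-value against a background distribution, then correct for multiple testing) can be illustrated with a minimal Python sketch. It is not the Geninter implementation. The function names `empirical_p_values` and `benjamini_hochberg` are hypothetical, the sketch assumes that a larger score means a stronger survival effect, and it assumes the Benjamini-Hochberg step-up procedure as the FDR method, which the text above does not specify.

```python
from bisect import bisect_left
from math import comb


def empirical_p_values(observed, background):
    """One-sided empirical p-values against a background distribution.

    Assumption: larger scores indicate a stronger effect, so the p-value
    is the share of background scores that are >= the observed score.
    """
    bg = sorted(background)
    n = len(bg)
    pvals = []
    for score in observed:
        n_ge = n - bisect_left(bg, score)    # background values >= score
        pvals.append((n_ge + 1) / (n + 1))   # add-one keeps p above zero
    return pvals


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Number of unordered marker pairs for the smallest and largest marker sets.
pairs_small = comb(40, 2)    # 780, matching the count in the Figure 4 caption
pairs_large = comb(140, 2)   # 9730, i.e. roughly 10,000 combinations

if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the paper.
    background = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    scores = [9.5, 5, 0.5]
    p = empirical_p_values(scores, background)
    q = benjamini_hochberg(p)
    print(pairs_small, pairs_large)
    print(p, q)
```

A marker pair would then be called significant when its adjusted p-value falls below the chosen FDR threshold. The add-one correction in `empirical_p_values` is a common convention for finite background samples and is likewise an assumption of this sketch.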

