Evolutionary algorithms for the selection of single nucleotide polymorphisms.

Hubley RM, Zitzler E, Roach JC - BMC Bioinformatics (2003)

Bottom Line: The choice of subset is influenced by many factors, including estimated or known reliability of the SNP, biochemical factors, intellectual property, cost, and effectiveness of the subset for mapping genes or identifying disease loci. Evolutionary algorithms provide flexibility with respect to the problem formulation if a problem description evolves or changes. Results are produced as a trade-off front, allowing the user to make informed decisions when prioritizing factors.

Affiliation: Institute for Systems Biology, Seattle, WA, USA. rhubley@systemsbiology.org

ABSTRACT

Background: Large databases of single nucleotide polymorphisms (SNPs) are available for use in genomics studies. Typically, investigators must choose a subset of SNPs from these databases to employ in their studies. The choice of subset is influenced by many factors, including estimated or known reliability of the SNP, biochemical factors, intellectual property, cost, and effectiveness of the subset for mapping genes or identifying disease loci. We present an evolutionary algorithm for multiobjective SNP selection.

Results: We implemented a modified version of the Strength-Pareto Evolutionary Algorithm (SPEA2) in Java. Our implementation, Multiobjective Analyzer for Genetic Marker Acquisition (MAGMA), approximates the set of optimal trade-off solutions for large problems in minutes. This set is very useful for the design of large studies, including those oriented towards disease identification, genetic mapping, population studies, and haplotype-block elucidation.
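To make the ranking idea behind SPEA2 concrete, the following is a minimal sketch of the standard SPEA2 strength and raw-fitness calculation. It is not MAGMA's code: the article describes a modified SPEA2, and the function names, the toy population, and the assumption that both objectives are maximized are illustrative only. The density term that full SPEA2 adds to the raw fitness is omitted.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (assumption: all objectives are maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def spea2_raw_fitness(points: List[Sequence[float]]) -> List[int]:
    """Standard SPEA2 raw fitness; 0 means the point is nondominated."""
    n = len(points)
    # Strength S(i): how many members of the population point i dominates.
    strength = [
        sum(dominates(points[i], points[j]) for j in range(n)) for i in range(n)
    ]
    # Raw fitness R(i): sum of the strengths of every point that dominates i.
    return [
        sum(strength[j] for j in range(n) if dominates(points[j], points[i]))
        for i in range(n)
    ]


# Hypothetical (f1, f2) values for four candidate SNP subsets.
population = [(1.0, 0.2), (0.5, 0.8), (0.4, 0.7), (0.2, 0.1)]
raw = spea2_raw_fitness(population)
print(raw)  # [0, 0, 2, 4]
```

The two points with raw fitness 0 form the current trade-off front; lower values are better, so selection favors them.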

Conclusion: Evolutionary algorithms are particularly suited for optimization problems that involve multiple objectives and a complex search space on which exact methods such as exhaustive enumeration cannot be applied. They provide flexibility with respect to the problem formulation if a problem description evolves or changes. Results are produced as a trade-off front, allowing the user to make informed decisions when prioritizing factors. MAGMA is open source and available at http://snp-magma.sourceforge.net. Evolutionary algorithms are well suited for many other applications in genomics.

Figure 5: Enumeration Benchmark. We exhaustively enumerated all solutions for SNP selection from a library of twenty SNPs. We then tested MAGMA's ability to identify the optimal front on this same set of SNPs. Within 130 generations, MAGMA discovered the entire Pareto-optimal front. Enumeration required seven hours on a desktop machine; MAGMA computed the 130 generations in ninety seconds. The two objective functions, f1 (inversely proportional to the number of SNPs in the solution) and f2 ("coverage"), are expressed in arbitrary units.
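The exhaustive side of this benchmark can be illustrated on a toy scale. The sketch below enumerates every non-empty subset of a four-SNP library and keeps the nondominated ones. The library, the bin-based definition of coverage, and f1 = 1/(number of SNPs) are hypothetical stand-ins chosen for illustration; the caption only says that f1 is inversely proportional to subset size and that f2 measures coverage.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical toy library: SNP name -> genomic bins that SNP is taken to cover.
LIBRARY: Dict[str, FrozenSet[int]] = {
    "snp1": frozenset({0, 1}),
    "snp2": frozenset({1, 2}),
    "snp3": frozenset({3}),
    "snp4": frozenset({0, 1, 2}),
}

Scored = Tuple[Tuple[str, ...], Tuple[float, float]]


def objectives(subset: Tuple[str, ...]) -> Tuple[float, float]:
    """Toy objectives (both maximized): f1 = 1/size, f2 = number of bins covered."""
    covered = set().union(*(LIBRARY[name] for name in subset))
    return 1.0 / len(subset), float(len(covered))


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if a is no worse than b on both objectives and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def enumerate_front() -> List[Scored]:
    """Score every non-empty subset, then keep those that no other subset dominates."""
    names = sorted(LIBRARY)
    scored: List[Scored] = [
        (subset, objectives(subset))
        for size in range(1, len(names) + 1)
        for subset in combinations(names, size)
    ]
    return [
        (subset, f) for subset, f in scored
        if not any(dominates(g, f) for _, g in scored)
    ]


front = enumerate_front()
for subset, (f1, f2) in front:
    print(subset, f1, f2)
```

This pairwise filter compares every subset against every other, so it is only practical for toy libraries; it is meant to show what "the entire Pareto-optimal front" refers to, not how the benchmark was run.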

Mentions: The first twenty SNPs from a much larger library of SNPs derived from the MHC were chosen as a representative library. We attempted exhaustive enumeration on larger libraries but were foiled by the excessive time required for computation. For this benchmark, serial enumeration required seven hours and ten minutes; MAGMA finished its computations in one minute and twenty-eight seconds (Figure 5).
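As a rough sense of scale, suppose each of the twenty SNPs is simply included in or excluded from a solution (an assumption; the encoding is not stated in this excerpt). The size of the search space and the speedup implied by the reported timings then work out as:

```latex
\[
2^{20} = 1{,}048{,}576 \ \text{candidate subsets},
\qquad
\frac{7\,\mathrm{h}\ 10\,\mathrm{min}}{1\,\mathrm{min}\ 28\,\mathrm{s}}
= \frac{25{,}800\ \mathrm{s}}{88\ \mathrm{s}}
\approx 293 .
\]
```

Under the same assumption, each additional SNP in the library doubles the number of subsets, which is consistent with enumeration becoming impractical on larger libraries.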