Limits...
Evolutionary algorithms for the selection of single nucleotide polymorphisms.

Hubley RM, Zitzler E, Roach JC - BMC Bioinformatics (2003)

Bottom Line: The choice of subset is influenced by many factors, including estimated or known reliability of the SNP, biochemical factors, intellectual property, cost, and effectiveness of the subset for mapping genes or identifying disease loci.They provide flexibility with respect to the problem formulation if a problem description evolves or changes.Results are produced as a trade-off front, allowing the user to make informed decisions when prioritizing factors.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institute for Systems Biology, Seattle, WA, USA. rhubley@systemsbiology.org

ABSTRACT

Background: Large databases of single nucleotide polymorphisms (SNPs) are available for use in genomics studies. Typically, investigators must choose a subset of SNPs from these databases to employ in their studies. The choice of subset is influenced by many factors, including estimated or known reliability of the SNP, biochemical factors, intellectual property, cost, and effectiveness of the subset for mapping genes or identifying disease loci. We present an evolutionary algorithm for multiobjective SNP selection.

Results: We implemented a modified version of the Strength-Pareto Evolutionary Algorithm (SPEA2) in Java. Our implementation, Multiobjective Analyzer for Genetic Marker Acquisition (MAGMA), approximates the set of optimal trade-off solutions for large problems in minutes. This set is very useful for the design of large studies, including those oriented towards disease identification, genetic mapping, population studies, and haplotype-block elucidation.

Conclusion: Evolutionary algorithms are particularly suited for optimization problems that involve multiple objectives and a complex search space on which exact methods such as exhaustive enumeration cannot be applied. They provide flexibility with respect to the problem formulation if a problem description evolves or changes. Results are produced as a trade-off front, allowing the user to make informed decisions when prioritizing factors. MAGMA is open source and available at http://snp-magma.sourceforge.net. Evolutionary algorithms are well suited for many other applications in genomics.

Show MeSH
Pareto-Optimal Front. The two axes of the objectives form the objective space, which is two dimensional for the problems described in this paper. The positions of solutions on the Pareto-optimal front are shown in blue, connected by dashed pink. Each Pareto-optimal solution dominates all solutions to its lower left; for example, point A dominates points B-E, as bounded by the dashed black lines. Non-dominated solutions are shown as turquoise circles; dominated solutions are shown as olive circles. Strength is shown adjacent to each solution. Raw fitness is shown in parentheses. Because no raw fitnesses are identical, other than those on the front, density would have no impact on survival for this population as long as the archive size was at least five. Diamonds indicate positions in objective space of solutions not yet discovered by the algorithm, but that would be non-dominated once discovered.
© Copyright Policy
Related In: Results  -  Collection


getmorefigures.php?uid=PMC183839&req=5

Figure 3: Pareto-Optimal Front. The two axes of the objectives form the objective space, which is two dimensional for the problems described in this paper. The positions of solutions on the Pareto-optimal front are shown in blue, connected by dashed pink. Each Pareto-optimal solution dominates all solutions to its lower left; for example, point A dominates points B-E, as bounded by the dashed black lines. Non-dominated solutions are shown as turquoise circles; dominated solutions are shown as olive circles. Strength is shown adjacent to each solution. Raw fitness is shown in parentheses. Because no raw fitnesses are identical, other than those on the front, density would have no impact on survival for this population as long as the archive size was at least five. Diamonds indicate positions in objective space of solutions not yet discovered by the algorithm, but that would be non-dominated once discovered.

Mentions: Solutions to multiobjective problems are seldom unique. For example, if the two objective functions relate respectively to cost and performance, then the set of solutions will contain solutions with increasing performance and increasing cost. No solution, however, will have less performance for more cost. This set of alternative, optimal solutions, also known as the Pareto-optimal set, is thus a trade-off front (Figure 3). Typically, the user of the algorithm will examine the trade-off front produced by the algorithm and subjectively choose a single solution. This solution could be the best solution below a fixed cost, but often is chosen at a point of diminishing returns. The ability to identify points of diminishing returns is a major advantage to multiobjective optimization. Combining multiple objectives into a single objective in order to simplify a problem forces a decision to be made about the relative values of the different objectives before the trade-off front may be known.


Evolutionary algorithms for the selection of single nucleotide polymorphisms.

Hubley RM, Zitzler E, Roach JC - BMC Bioinformatics (2003)

Pareto-Optimal Front. The two axes of the objectives form the objective space, which is two dimensional for the problems described in this paper. The positions of solutions on the Pareto-optimal front are shown in blue, connected by dashed pink. Each Pareto-optimal solution dominates all solutions to its lower left; for example, point A dominates points B-E, as bounded by the dashed black lines. Non-dominated solutions are shown as turquoise circles; dominated solutions are shown as olive circles. Strength is shown adjacent to each solution. Raw fitness is shown in parentheses. Because no raw fitnesses are identical, other than those on the front, density would have no impact on survival for this population as long as the archive size was at least five. Diamonds indicate positions in objective space of solutions not yet discovered by the algorithm, but that would be non-dominated once discovered.
© Copyright Policy
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC183839&req=5

Figure 3: Pareto-Optimal Front. The two axes of the objectives form the objective space, which is two dimensional for the problems described in this paper. The positions of solutions on the Pareto-optimal front are shown in blue, connected by dashed pink. Each Pareto-optimal solution dominates all solutions to its lower left; for example, point A dominates points B-E, as bounded by the dashed black lines. Non-dominated solutions are shown as turquoise circles; dominated solutions are shown as olive circles. Strength is shown adjacent to each solution. Raw fitness is shown in parentheses. Because no raw fitnesses are identical, other than those on the front, density would have no impact on survival for this population as long as the archive size was at least five. Diamonds indicate positions in objective space of solutions not yet discovered by the algorithm, but that would be non-dominated once discovered.
Mentions: Solutions to multiobjective problems are seldom unique. For example, if the two objective functions relate respectively to cost and performance, then the set of solutions will contain solutions with increasing performance and increasing cost. No solution, however, will have less performance for more cost. This set of alternative, optimal solutions, also known as the Pareto-optimal set, is thus a trade-off front (Figure 3). Typically, the user of the algorithm will examine the trade-off front produced by the algorithm and subjectively choose a single solution. This solution could be the best solution below a fixed cost, but often is chosen at a point of diminishing returns. The ability to identify points of diminishing returns is a major advantage to multiobjective optimization. Combining multiple objectives into a single objective in order to simplify a problem forces a decision to be made about the relative values of the different objectives before the trade-off front may be known.

Bottom Line: The choice of subset is influenced by many factors, including estimated or known reliability of the SNP, biochemical factors, intellectual property, cost, and effectiveness of the subset for mapping genes or identifying disease loci.They provide flexibility with respect to the problem formulation if a problem description evolves or changes.Results are produced as a trade-off front, allowing the user to make informed decisions when prioritizing factors.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institute for Systems Biology, Seattle, WA, USA. rhubley@systemsbiology.org

ABSTRACT

Background: Large databases of single nucleotide polymorphisms (SNPs) are available for use in genomics studies. Typically, investigators must choose a subset of SNPs from these databases to employ in their studies. The choice of subset is influenced by many factors, including estimated or known reliability of the SNP, biochemical factors, intellectual property, cost, and effectiveness of the subset for mapping genes or identifying disease loci. We present an evolutionary algorithm for multiobjective SNP selection.

Results: We implemented a modified version of the Strength-Pareto Evolutionary Algorithm (SPEA2) in Java. Our implementation, Multiobjective Analyzer for Genetic Marker Acquisition (MAGMA), approximates the set of optimal trade-off solutions for large problems in minutes. This set is very useful for the design of large studies, including those oriented towards disease identification, genetic mapping, population studies, and haplotype-block elucidation.

Conclusion: Evolutionary algorithms are particularly suited for optimization problems that involve multiple objectives and a complex search space on which exact methods such as exhaustive enumeration cannot be applied. They provide flexibility with respect to the problem formulation if a problem description evolves or changes. Results are produced as a trade-off front, allowing the user to make informed decisions when prioritizing factors. MAGMA is open source and available at http://snp-magma.sourceforge.net. Evolutionary algorithms are well suited for many other applications in genomics.

Show MeSH