Limits...
Computational RNA secondary structure design: empirical complexity and improved methods.

Aguirre-Hernández R, Hoos HH, Condon A - BMC Bioinformatics (2007)

Bottom Line: Such understanding provides the basis for improving the performance of one of the best algorithms for this problem, RNA-SSD, and for characterising its limitations.We also found that the algorithms are in general faster when constraints are placed only on paired bases in the structure.Our analysis helps to better understand the strengths and limitations of both the RNA-SSD and RNAinverse algorithms, and suggests ways in which the performance of these algorithms can be further improved.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institute of Applied Mathematics, University of British Columbia, Vancouver, BC, Canada. rosalia@cs.ubc.ca <rosalia@cs.ubc.ca>

ABSTRACT

Background: We investigate the empirical complexity of the RNA secondary structure design problem, that is, the scaling of the typical difficulty of the design task for various classes of RNA structures as the size of the target structure is increased. The purpose of this work is to understand better the factors that make RNA structures hard to design for existing, high-performance algorithms. Such understanding provides the basis for improving the performance of one of the best algorithms for this problem, RNA-SSD, and for characterising its limitations.

Results: To gain insights into the practical complexity of the problem, we present a scaling analysis on random and biologically motivated structures using an improved version of the RNA-SSD algorithm, and also the RNAinverse algorithm from the Vienna package. Since primary structure constraints are relevant for designing RNA structures, we also investigate the correlation between the number and the location of the primary structure constraints when designing structures and the performance of the RNA-SSD algorithm. The scaling analysis on random and biologically motivated structures supports the hypothesis that the running time of both algorithms scales polynomially with the size of the structure. We also found that the algorithms are in general faster when constraints are placed only on paired bases in the structure. Furthermore, we prove that, according to the standard thermodynamic model, for some structures that the RNA-SSD algorithm was unable to design, there exists no sequence whose minimum free energy structure is the target structure.

Conclusion: Our analysis helps to better understand the strengths and limitations of both the RNA-SSD and RNAinverse algorithms, and suggests ways in which the performance of these algorithms can be further improved.

Show MeSH
Biologically motivated structure Bio-150-n14. Biologically motivated structure with ten stems. When constraining the bases in stems 7 and 8, this structure is hard to design. The structure motif formed by these stems, which are short and separated by a bulge, is unstable.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC1808480&req=5

Figure 10: Biologically motivated structure Bio-150-n14. Biologically motivated structure with ten stems. When constraining the bases in stems 7 and 8, this structure is hard to design. The structure motif formed by these stems, which are short and separated by a bulge, is unstable.

Mentions: Our study also sheds light on the hardness of designing structures with primary structure constraints. In particular, our detailed analysis of primary structure (that is, base sequence) constraints on the performance of RNA-SSD suggests that it is generally easier to design a structure when the stems are constrained. This is intuitively plausible, given that generally, stems represent the most stable parts of RNA secondary structures. However, there are exceptions: structure Bio-150-nl4 (shown in Figure 10) was found to be difficult when more than seven stems of its ten stems were constrained. When we further analyzed the correlation between the constrained stems and the expected run time required for designing this structure, we found that constraining two particular stems, labelled 7 and 8 in Figure 10, made the design problem significantly more difficult. These short stems are separated by a bulge of size one, and they are not stable enough to compensate for the penalty incurred by the bulge. Figure 10 shows a difficult problem instance where these two stems are constrained. The structure is easy to design if the base pairs A•U in stems 7 and 8 are replaced by C•G, which is a more stable interaction.


Computational RNA secondary structure design: empirical complexity and improved methods.

Aguirre-Hernández R, Hoos HH, Condon A - BMC Bioinformatics (2007)

Biologically motivated structure Bio-150-n14. Biologically motivated structure with ten stems. When constraining the bases in stems 7 and 8, this structure is hard to design. The structure motif formed by these stems, which are short and separated by a bulge, is unstable.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC1808480&req=5

Figure 10: Biologically motivated structure Bio-150-n14. Biologically motivated structure with ten stems. When constraining the bases in stems 7 and 8, this structure is hard to design. The structure motif formed by these stems, which are short and separated by a bulge, is unstable.
Mentions: Our study also sheds light on the hardness of designing structures with primary structure constraints. In particular, our detailed analysis of primary structure (that is, base sequence) constraints on the performance of RNA-SSD suggests that it is generally easier to design a structure when the stems are constrained. This is intuitively plausible, given that generally, stems represent the most stable parts of RNA secondary structures. However, there are exceptions: structure Bio-150-nl4 (shown in Figure 10) was found to be difficult when more than seven stems of its ten stems were constrained. When we further analyzed the correlation between the constrained stems and the expected run time required for designing this structure, we found that constraining two particular stems, labelled 7 and 8 in Figure 10, made the design problem significantly more difficult. These short stems are separated by a bulge of size one, and they are not stable enough to compensate for the penalty incurred by the bulge. Figure 10 shows a difficult problem instance where these two stems are constrained. The structure is easy to design if the base pairs A•U in stems 7 and 8 are replaced by C•G, which is a more stable interaction.

Bottom Line: Such understanding provides the basis for improving the performance of one of the best algorithms for this problem, RNA-SSD, and for characterising its limitations.We also found that the algorithms are in general faster when constraints are placed only on paired bases in the structure.Our analysis helps to better understand the strengths and limitations of both the RNA-SSD and RNAinverse algorithms, and suggests ways in which the performance of these algorithms can be further improved.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institute of Applied Mathematics, University of British Columbia, Vancouver, BC, Canada. rosalia@cs.ubc.ca <rosalia@cs.ubc.ca>

ABSTRACT

Background: We investigate the empirical complexity of the RNA secondary structure design problem, that is, the scaling of the typical difficulty of the design task for various classes of RNA structures as the size of the target structure is increased. The purpose of this work is to understand better the factors that make RNA structures hard to design for existing, high-performance algorithms. Such understanding provides the basis for improving the performance of one of the best algorithms for this problem, RNA-SSD, and for characterising its limitations.

Results: To gain insights into the practical complexity of the problem, we present a scaling analysis on random and biologically motivated structures using an improved version of the RNA-SSD algorithm, and also the RNAinverse algorithm from the Vienna package. Since primary structure constraints are relevant for designing RNA structures, we also investigate the correlation between the number and the location of the primary structure constraints when designing structures and the performance of the RNA-SSD algorithm. The scaling analysis on random and biologically motivated structures supports the hypothesis that the running time of both algorithms scales polynomially with the size of the structure. We also found that the algorithms are in general faster when constraints are placed only on paired bases in the structure. Furthermore, we prove that, according to the standard thermodynamic model, for some structures that the RNA-SSD algorithm was unable to design, there exists no sequence whose minimum free energy structure is the target structure.

Conclusion: Our analysis helps to better understand the strengths and limitations of both the RNA-SSD and RNAinverse algorithms, and suggests ways in which the performance of these algorithms can be further improved.

Show MeSH