COFOLD: an RNA secondary structure prediction method that takes co-transcriptional folding into account.

Proctor JR, Meyer IM - Nucleic Acids Res. (2013)

Bottom Line: Existing thermodynamic methods aim to predict the most stable RNA structure. There exists by now ample experimental and theoretical evidence that the process of structure formation matters and that sequences in vivo fold while they are being transcribed. Here, we present a conceptually new method for predicting RNA secondary structure, called CoFold, that takes effects of co-transcriptional folding explicitly into account.


Affiliation: Centre for High-Throughput Biology, University of British Columbia, 2125 East Mall, Vancouver, BC, V6T 1Z4, Canada.

ABSTRACT
Existing state-of-the-art methods that take a single RNA sequence and predict the corresponding RNA secondary structure are thermodynamic methods. These aim to predict the most stable RNA structure. There exists by now ample experimental and theoretical evidence that the process of structure formation matters and that sequences in vivo fold while they are being transcribed. None of the thermodynamic methods, however, considers the process of structure formation. Here, we present a conceptually new method for predicting RNA secondary structure, called CoFold, that takes effects of co-transcriptional folding explicitly into account. Our method significantly improves the state of the art in terms of prediction accuracy, especially for long sequences of >1000 nt in length.


Figure 1. Training of parameters in CoFold: linear fit and robustness. Left: heat map of the average MCC differences with respect to RNAfold as a function of the two parameter values (one on the x-axis, one on the y-axis). Colours indicate the average MCC difference from high (bright yellow) to low (dark red); see Supplementary Figure S3 for details. The solid line is the linear regression line, and the two dotted lines delineate its 95% confidence region. The asterisk marks the parameter pair with the highest average MCC, which is the parameter combination used in CoFold and CoFold-A. Right: the same heat map as on the left, but showing the number of trials, out of 20 trials of 5-fold cross-validation, in which the corresponding pair of parameter values has the highest average MCC for the set of training sequences.
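The robustness count in the right panel can be illustrated with a minimal Python sketch. It assumes a hypothetical array `scores` holding one MCC value per training sequence and parameter pair; the function name, the array layout, and the random seed are illustrative and do not come from the paper. The caption does not say whether the winner is recorded once per trial or once per fold, so this sketch counts every training split.

```python
import numpy as np


def count_best_pairs(scores, n_trials=20, n_folds=5, seed=0):
    """Count how often each parameter pair wins on a training split.

    scores: array of shape (n_sequences, n_rows, n_cols); scores[s, i, j] is
            the (hypothetical) MCC of sequence s for the parameter pair at
            grid position (i, j).
    Returns an integer array of shape (n_rows, n_cols).
    """
    rng = np.random.default_rng(seed)
    n_sequences = scores.shape[0]
    counts = np.zeros(scores.shape[1:], dtype=int)

    for _ in range(n_trials):
        # Shuffle the sequences and split them into n_folds held-out folds.
        folds = np.array_split(rng.permutation(n_sequences), n_folds)
        for held_out in range(n_folds):
            # Training set = all folds except the held-out one.
            train = np.concatenate(
                [fold for k, fold in enumerate(folds) if k != held_out]
            )
            # Average MCC over the training sequences for every pair.
            mean_scores = scores[train].mean(axis=0)
            # Grid position of the best-performing parameter pair.
            best = np.unravel_index(np.argmax(mean_scores), mean_scores.shape)
            counts[best] += 1

    return counts
```

With the defaults, the returned counts sum to `n_trials * n_folds`. A grid in which a single cell collects most of the counts would indicate that the chosen parameter pair is robust to the choice of training sequences.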
© Copyright Policy - creative-commons


Mentions: Performance metrics were found to be highly correlated in the two parameters [Figure 1 (right) and Supplementary Figure S3]. To demonstrate this, linear regression was performed on the performance matrix [Figure 1 (left)]. We first compiled a set of triples, each consisting of a parameter pair and its performance value, for which the performance value is in the 97th quantile of the performance matrix. Weighted linear regression was then performed with the two parameters as dimensions and the performance value as the weight. The regression line fits the data with an R² value of 98.4%, indicating that variability in one parameter largely accounts for the variability in the other. The regression line (solid) and its 95% confidence region (dotted) are plotted in Figure 1 (left).
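The regression procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the argument names, and the reading of "in the 97th quantile" as "at or above the 0.97 quantile of all matrix entries" are assumptions, and the weights (performance values of the selected entries) are assumed to be positive.

```python
import numpy as np


def weighted_line_fit(perf, x_values, y_values, quantile=0.97):
    """Fit y = slope * x + intercept to the top entries of a performance matrix.

    perf:     2-D array; perf[i, j] is the performance for the parameter
              pair (x_values[j], y_values[i]).
    Returns (slope, intercept, weighted R^2).
    """
    # Keep only the entries at or above the chosen quantile (assumption).
    threshold = np.quantile(perf, quantile)
    rows, cols = np.nonzero(perf >= threshold)

    # Triples (x, y, w): two parameter values plus performance as weight.
    x = x_values[cols]
    y = y_values[rows]
    w = perf[rows, cols]

    # Closed-form weighted least squares for a straight line.
    x_mean = np.average(x, weights=w)
    y_mean = np.average(y, weights=w)
    sxx = np.sum(w * (x - x_mean) ** 2)
    sxy = np.sum(w * (x - x_mean) * (y - y_mean))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Weighted coefficient of determination.
    residual = y - (slope * x + intercept)
    r_squared = 1.0 - np.sum(w * residual ** 2) / np.sum(w * (y - y_mean) ** 2)
    return slope, intercept, r_squared
```

An R² close to 1 from such a fit would mean that the best-performing parameter pairs lie almost on a straight line, which is what the text reports for the real performance matrix.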

