A conditional random fields method for RNA sequence-structure relationship modeling and conformation sampling.

Wang Z, Xu J - Bioinformatics (2011)

Bottom Line: Neither FARNA nor BARNACLE makes use of sequence information in sampling conformations. Experimental results show that our CRF method can model the RNA sequence-structure relationship well and that sequence information is important for conformation sampling. Our method, named TreeFolder, generates a much higher percentage of native-like decoys than FARNA and BARNACLE, although we use the same simple energy function as BARNACLE.


Affiliation: Toyota Technological Institute at Chicago, IL, USA. zywang@ttic.edu

ABSTRACT

Accurate tertiary structures are very important for the functional study of non-coding RNA molecules. However, predicting RNA tertiary structures is extremely challenging because of the large conformation space to be explored and the lack of an accurate scoring function that differentiates the native structure from decoys. Fragment-based conformation sampling methods (e.g. FARNA) have the shortcoming that the limited size of a fragment library makes it infeasible to represent all possible conformations well. A recent dynamic Bayesian network method, BARNACLE, overcomes this issue of fragment assembly. However, neither of these methods makes use of sequence information in sampling conformations. Here, we present a new probabilistic graphical model, conditional random fields (CRFs), to model the RNA sequence-structure relationship, which enables us to accurately estimate the probability of an RNA conformation from its sequence. Coupled with a novel tree-guided sampling scheme, our CRF model is then applied to RNA conformation sampling. Experimental results show that our CRF method models the RNA sequence-structure relationship well and that sequence information is important for conformation sampling. Our method, named TreeFolder, generates a much higher percentage of native-like decoys than FARNA and BARNACLE, although we use the same simple energy function as BARNACLE.

Contact: zywang@ttic.edu; j3xu@ttic.edu

Supplementary information: Supplementary data are available at Bioinformatics online.


Figure 6: The 5% quantiles of the RMSD distributions for decoys sampled from CRF models with different numbers of conformation states. The y-axis is the RMSD value.

Mentions: We also investigated the sampling performance of the CRF model with respect to the number of conformation states. We tested CRF models with 20, 30, 50, 80 and 100 conformation states. For each CRF model, we generated 3000 decoys for each of five RNAs: 2a43, 28sp, 2f88, 1zih and 1xjr. Figure 6 shows the 5% quantiles of the RMSD distributions for decoys generated by four different CRF models. As shown in Figure 6, the model with 50 states generates better decoys than the others.
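
The summary statistic described above, the 5% quantile of a decoy RMSD distribution, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `quantile` and `five_percent_quantiles` are hypothetical, the RMSD values are synthetic random numbers rather than results from the paper, and linear interpolation between order statistics is an assumed convention (the source does not say which quantile definition was used).

```python
import math
import random


def quantile(values, q):
    """Return the q-quantile (0 <= q <= 1) of values.

    Uses linear interpolation between neighbouring order statistics,
    which is an assumed convention, not one stated in the source.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    xs = sorted(values)
    pos = q * (len(xs) - 1)      # fractional index into the sorted list
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])


def five_percent_quantiles(rmsd_by_model):
    """Map each model (keyed by its number of states) to its 5% RMSD quantile."""
    return {n: quantile(rmsds, 0.05) for n, rmsds in rmsd_by_model.items()}


# Synthetic placeholder data: 3000 made-up RMSD values per model.
# These numbers are NOT from the paper; they only exercise the code.
rng = random.Random(0)
state_counts = [20, 30, 50, 80, 100]
rmsd_by_model = {
    n: [abs(rng.gauss(12.0, 3.0)) for _ in range(3000)] for n in state_counts
}

summary = five_percent_quantiles(rmsd_by_model)
for n in state_counts:
    print(f"{n:>3} states: 5% quantile RMSD = {summary[n]:.2f}")
```

A lower 5% quantile means that the best 5% of a model's decoys lie closer to the native structure, which is how the comparison across numbers of conformation states is read.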

