A conditional random fields method for RNA sequence-structure relationship modeling and conformation sampling.

Wang Z, Xu J - Bioinformatics (2011)

Bottom Line: In addition, neither of these methods makes use of sequence information in sampling conformations. Experimental results show that our CRF method can model RNA sequence-structure relationship well and sequence information is important for conformation sampling. Our method, named TreeFolder, generates a much higher percentage of native-like decoys than FARNA and BARNACLE, although we use the same simple energy function as BARNACLE.


Affiliation: Toyota Technological Institute at Chicago, IL, USA. zywang@ttic.edu

ABSTRACT

Accurate tertiary structures are very important for the functional study of non-coding RNA molecules. However, predicting RNA tertiary structures is extremely challenging, because the conformation space to be explored is large and no accurate scoring function is available to differentiate the native structure from decoys. Fragment-based conformation sampling methods (e.g. FARNA) have the shortcoming that a fragment library of limited size cannot represent all possible conformations well. A recent dynamic Bayesian network method, BARNACLE, overcomes this issue of fragment assembly. In addition, neither of these methods makes use of sequence information in sampling conformations. Here, we present a new probabilistic graphical model, conditional random fields (CRFs), to model the RNA sequence-structure relationship, which enables us to accurately estimate the probability of an RNA conformation from sequence. Coupled with a novel tree-guided sampling scheme, our CRF model is then applied to RNA conformation sampling. Experimental results show that our CRF method models the RNA sequence-structure relationship well and that sequence information is important for conformation sampling. Our method, named TreeFolder, generates a much higher percentage of native-like decoys than FARNA and BARNACLE, although we use the same simple energy function as BARNACLE.

Contact: zywang@ttic.edu; j3xu@ttic.edu

Supplementary information: Supplementary data are available at Bioinformatics online.


Figure 9: Correlation between the local RMSD at each position and the global RMSD. The X-axis is the start position of a segment.

Mentions: Sampling real-valued angles generates better decoys: to show the detailed difference between TreeFolder and FARNA, we look into the decoys of 1esy. We choose it because FARNA and TreeFolder differ most on this RNA among all 11 tested RNA molecules. As shown in Figure 8, TreeFolder generates a much larger percentage of decoys with RMSD < 4 Å than FARNA. We also compute the local RMSD at each position in the decoys, defined as the RMSD of the segment of four consecutive nucleotides starting at this position, as compared to the native structure. We calculate the correlation between the local RMSD at each position and the global RMSD, as shown in Figure 9. Among the decoys generated by both FARNA and TreeFolder, the local RMSD at position 13 has the highest correlation with the global RMSD. We also calculate the angle error at each position by Error = ‖v − v0‖_2, where v is the angle vector of a decoy at one position and v0 is the native angle vector at the same position.
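The quantities described above (local RMSD over four-nucleotide segments, its correlation with the global RMSD across decoys, and the per-position angle error) can be sketched in Python with NumPy. This is a minimal illustration, not the authors' code. The following are assumptions that the text does not state: each nucleotide is represented by a single 3D point, RMSD is computed after optimal rigid superposition (Kabsch algorithm), the correlation is Pearson's, and the angle error is a plain Euclidean norm with no wrap-around for periodic angles. All function names are hypothetical.

```python
import numpy as np

SEGMENT_LENGTH = 4  # four consecutive nucleotides, as stated in the text


def kabsch_rmsd(p, q):
    """RMSD between two (n, 3) point sets after optimal rigid superposition.

    Assumption: the text does not say whether structures are superimposed
    before computing RMSD; superposition is the usual convention.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p - p.mean(axis=0)  # remove translation
    q = q - q.mean(axis=0)
    u, _, vt = np.linalg.svd(p.T @ q)
    # Force a proper rotation (determinant +1), not a reflection.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = p @ rot.T - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))


def local_rmsd_profile(decoy, native, seg=SEGMENT_LENGTH):
    """Local RMSD for every segment start position (0-based index)."""
    decoy = np.asarray(decoy, dtype=float)
    native = np.asarray(native, dtype=float)
    n_starts = len(native) - seg + 1
    return np.array([
        kabsch_rmsd(decoy[i:i + seg], native[i:i + seg])
        for i in range(n_starts)
    ])


def correlation_with_global(decoys, native, seg=SEGMENT_LENGTH):
    """Pearson correlation, across decoys, of local RMSD with global RMSD.

    Returns one coefficient per segment start position.
    """
    global_rmsd = np.array([kabsch_rmsd(d, native) for d in decoys])
    local = np.array([local_rmsd_profile(d, native, seg) for d in decoys])
    return np.array([
        np.corrcoef(local[:, i], global_rmsd)[0, 1]
        for i in range(local.shape[1])
    ])


def angle_error(v, v0):
    """Error = ||v - v0||_2 for the angle vectors at one position.

    Assumption: plain Euclidean norm; angle periodicity is ignored.
    """
    return float(np.linalg.norm(np.asarray(v, dtype=float)
                                - np.asarray(v0, dtype=float)))
```

Given a list of decoy coordinate arrays and the native coordinates, `correlation_with_global(decoys, native)` returns the kind of per-position profile plotted in Figure 9. The indices are 0-based, so "position 13" in the text corresponds to index 12 if the paper counts from 1.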

