A conditional random fields method for RNA sequence-structure relationship modeling and conformation sampling.

Wang Z, Xu J - Bioinformatics (2011)

Bottom Line: Neither FARNA nor BARNACLE makes use of sequence information in sampling conformations. Experimental results show that our CRF method can model the RNA sequence-structure relationship well and that sequence information is important for conformation sampling. Our method, named TreeFolder, generates a much higher percentage of native-like decoys than FARNA and BARNACLE, although we use the same simple energy function as BARNACLE.


Affiliation: Toyota Technological Institute at Chicago, IL, USA. zywang@ttic.edu

ABSTRACT

Accurate tertiary structures are very important for the functional study of non-coding RNA molecules. However, predicting RNA tertiary structures is extremely challenging because the conformation space to be explored is large and no accurate scoring function is available to differentiate the native structure from decoys. Fragment-based conformation sampling methods (e.g. FARNA) suffer from the limited size of the fragment library, which makes it infeasible to represent all possible conformations well. A recent dynamic Bayesian network method, BARNACLE, overcomes this limitation of fragment assembly. However, neither of these methods makes use of sequence information in sampling conformations. Here, we present a new probabilistic graphical model, based on conditional random fields (CRFs), to model the RNA sequence-structure relationship, which enables us to accurately estimate the probability of an RNA conformation from its sequence. Coupled with a novel tree-guided sampling scheme, our CRF model is then applied to RNA conformation sampling. Experimental results show that our CRF method models the RNA sequence-structure relationship well and that sequence information is important for conformation sampling. Our method, named TreeFolder, generates a much higher percentage of native-like decoys than FARNA and BARNACLE, although we use the same simple energy function as BARNACLE.

Contact: zywang@ttic.edu; j3xu@ttic.edu

Supplementary information: Supplementary data are available at Bioinformatics online.


Figure 3: (A) Empirical distribution density of the torsion (τ1) on the pseudo-bond C4′−P and α. (B) Distribution density of the torsion (τ2) on the pseudo-bond P−C4′ and α. The empirical distributions are built from all representative RNA structures (see Section 2.4).
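The caption does not say how the empirical densities were estimated. As a rough illustration only, one simple way to build such a density from observed angle pairs is a normalized 2D histogram on a periodic grid. The function name `angle_pair_density`, the bin count, and the sample angles below are all assumptions made for this sketch and are not taken from the paper.

```python
def angle_pair_density(pairs, bins=36):
    """Histogram estimate of the joint density of two angles given in degrees.

    Returns a bins x bins grid whose entries are densities per square degree,
    so that the entries times the cell area sum to 1.
    """
    if not pairs:
        raise ValueError("need at least one angle pair")
    width = 360.0 / bins
    counts = [[0] * bins for _ in range(bins)]
    for first, second in pairs:
        # Wrap each angle into [0, 360) so that -170 and 190 share a bin.
        i = int((first % 360.0) // width) % bins
        j = int((second % 360.0) // width) % bins
        counts[i][j] += 1
    cell_area = width * width
    total = len(pairs)
    return [[c / (total * cell_area) for c in row] for row in counts]


# Hypothetical (torsion, alpha) pairs in degrees, for illustration only.
samples = [(10.0, 20.0), (15.0, 25.0), (-170.0, 190.0)]
density = angle_pair_density(samples, bins=36)
```

Angles are wrapped before binning because torsions are periodic; without the wrap, values near ±180° would be split into artificial separate groups.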

Mentions: We use a simplified representation to reduce the number of torsion angles needed to describe the local conformation of a nucleotide (Cao and Chen, 2005; Duarte and Pyle, 1998; Hershkovitz et al., 2006; Zhang et al., 2008). In particular, we use the torsions τ1 and τ2 on the pseudo-bonds P−C4′ and C4′−P (see pink lines in Figure 1). However, to determine the coordinates of the six backbone atoms of a nucleotide, we also need two planar angles θ and ψ and another torsion α on the bond P−O5′. Overall, we use a five-tuple (τ1, τ2, θ, ψ, α) to represent the local conformation of a nucleotide. The torsion angles fall into several separate groups in the angle space, as shown in Figure 3. Although there are many different ways to represent an RNA conformation, this simplified representation enables us to rapidly rebuild backbone atoms from angles. Similar representations have also been widely adopted in previous work (Cao and Chen, 2005; Duarte and Pyle, 1998; Hershkovitz et al., 2006; Zhang et al., 2008).
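To make the link between angles and coordinates concrete, the sketch below shows two generic geometric operations that a representation like this relies on: measuring a signed torsion from four points, and placing a new point from a bond length, a planar angle and a torsion. This is not the authors' code. The function names, the coordinates and the numeric values are hypothetical, and the exact atoms that define τ1, τ2, θ, ψ and α in the paper are not spelled out in this excerpt.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def axpy(s, a, b):
    """Return s*a + b for 3-vectors."""
    return (s * a[0] + b[0], s * a[1] + b[1], s * a[2] + b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def torsion(p0, p1, p2, p3):
    """Signed torsion in radians, in [-pi, pi], about the p1 -> p2 axis."""
    axis = unit(sub(p2, p1))
    b0 = sub(p0, p1)
    b2 = sub(p3, p2)
    # Project both outer bonds onto the plane perpendicular to the axis.
    v = axpy(-dot(b0, axis), axis, b0)
    w = axpy(-dot(b2, axis), axis, b2)
    return math.atan2(dot(cross(axis, v), w), dot(v, w))


def place_atom(a, b, c, bond, planar, tors):
    """Place d with |d - c| = bond, angle(b, c, d) = planar,
    and torsion(a, b, c, d) = tors (angles in radians)."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))  # normal to the a-b-c plane
    m = cross(n, bc)                # in-plane direction perpendicular to bc
    d = axpy(-bond * math.cos(planar), bc, c)
    d = axpy(bond * math.sin(planar) * math.cos(tors), m, d)
    d = axpy(bond * math.sin(planar) * math.sin(tors), n, d)
    return d


# Hypothetical coordinates and angles, not taken from any RNA structure.
a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.5, 0.0)
d = place_atom(a, b, c, bond=1.6,
               planar=math.radians(110.0), tors=math.radians(-65.0))
recovered = math.degrees(torsion(a, b, c, d))
```

Applying `place_atom` repeatedly along a chain rebuilds coordinates from a list of angles in time linear in the number of atoms, which is the kind of fast rebuilding the passage refers to. Measuring `torsion` on the rebuilt points gives back the input torsion, which is a convenient sanity check.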

