Limits...
Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework.

Katoh K, Toh H - BMC Bioinformatics (2008)

Bottom Line: In contrast, for the cases with more than two sequences, the practical merit of structural alignment remains unclear as compared to traditional sequence-based methods, although the importance of multiple structural alignment is widely recognized.The X-INS-i framework can be used for building a multiple RNA alignment from any combination of algorithms for pairwise RNA alignment and base-pairing probability.The source code is available at the webpage found in the Availability and requirements section.

View Article: PubMed Central - HTML - PubMed

Affiliation: Digital Medicine Initiative, Kyushu University, Fukuoka, 812-8582, Japan. katoh@bioreg.kyushu-u.ac.jp

ABSTRACT

Background: Structural alignment of RNAs is becoming important, since the discovery of functional non-coding RNAs (ncRNAs). Recent studies, mainly based on various approximations of the Sankoff algorithm, have resulted in considerable improvement in the accuracy of pairwise structural alignment. In contrast, for the cases with more than two sequences, the practical merit of structural alignment remains unclear as compared to traditional sequence-based methods, although the importance of multiple structural alignment is widely recognized.

Results: We took a different approach from a straightforward extension of the Sankoff algorithm to the multiple alignments from the viewpoints of accuracy and time complexity. As a new option of the MAFFT alignment program, we developed a multiple RNA alignment framework, X-INS-i, which builds a multiple alignment with an iterative method incorporating structural information through two components: (1) pairwise structural alignments by an external pairwise alignment method such as SCARNA or LaRA and (2) a new objective function, Four-way Consistency, derived from the base-pairing probability of every sub-aligned group at every multiple alignment stage.

Conclusion: The BRAliBASE benchmark showed that X-INS-i outperforms other methods currently available in the sum-of-pairs score (SPS) criterion. As a basis for predicting common secondary structure, the accuracy of the present method is comparable to or rather higher than those of the current leading methods such as RNA Sampler. The X-INS-i framework can be used for building a multiple RNA alignment from any combination of algorithms for pairwise RNA alignment and base-pairing probability. The source code is available at the webpage found in the Availability and requirements section.

Show MeSH

Related in: MedlinePlus

A flowchart of benchmarks using the KKA dataset.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2387179&req=5

Figure 2: A flowchart of benchmarks using the KKA dataset.

Mentions: The KKA dataset was used for assessing the accuracies of both alignment and consensus structure prediction. The KKA dataset is composed of the 17 RNA families listed in Table 2. A flowchart of benchmarks using the KKA dataset is shown in Figure 2. According to Kiryu et al. [32], each alignment was taken from Rfam seed alignments with the consensus structures published in the literatures. Note that we did not use sequence-based re-alignments, which were used by Kiryu et al. [32] in a different context. In addition to the alignment accuracy measured with the SPS score, the accuracy of secondary structure prediction was assessed as follows; after subjecting an alignment to three different prediction methods, Pfold [33], McCaskill-MEA [32] and RNAalifold [31], the predicted structures were compared to the Rfam structure. The internal predictions by RNA Sampler and MASTR were also compared. The difference between a predicted structure and the corresponding Rfam structure was evaluated with the MCC criterion [50],


Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework.

Katoh K, Toh H - BMC Bioinformatics (2008)

A flowchart of benchmarks using the KKA dataset.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2387179&req=5

Figure 2: A flowchart of benchmarks using the KKA dataset.
Mentions: The KKA dataset was used for assessing the accuracies of both alignment and consensus structure prediction. The KKA dataset is composed of the 17 RNA families listed in Table 2. A flowchart of benchmarks using the KKA dataset is shown in Figure 2. According to Kiryu et al. [32], each alignment was taken from Rfam seed alignments with the consensus structures published in the literatures. Note that we did not use sequence-based re-alignments, which were used by Kiryu et al. [32] in a different context. In addition to the alignment accuracy measured with the SPS score, the accuracy of secondary structure prediction was assessed as follows; after subjecting an alignment to three different prediction methods, Pfold [33], McCaskill-MEA [32] and RNAalifold [31], the predicted structures were compared to the Rfam structure. The internal predictions by RNA Sampler and MASTR were also compared. The difference between a predicted structure and the corresponding Rfam structure was evaluated with the MCC criterion [50],

Bottom Line: In contrast, for the cases with more than two sequences, the practical merit of structural alignment remains unclear as compared to traditional sequence-based methods, although the importance of multiple structural alignment is widely recognized.The X-INS-i framework can be used for building a multiple RNA alignment from any combination of algorithms for pairwise RNA alignment and base-pairing probability.The source code is available at the webpage found in the Availability and requirements section.

View Article: PubMed Central - HTML - PubMed

Affiliation: Digital Medicine Initiative, Kyushu University, Fukuoka, 812-8582, Japan. katoh@bioreg.kyushu-u.ac.jp

ABSTRACT

Background: Structural alignment of RNAs is becoming important, since the discovery of functional non-coding RNAs (ncRNAs). Recent studies, mainly based on various approximations of the Sankoff algorithm, have resulted in considerable improvement in the accuracy of pairwise structural alignment. In contrast, for the cases with more than two sequences, the practical merit of structural alignment remains unclear as compared to traditional sequence-based methods, although the importance of multiple structural alignment is widely recognized.

Results: We took a different approach from a straightforward extension of the Sankoff algorithm to the multiple alignments from the viewpoints of accuracy and time complexity. As a new option of the MAFFT alignment program, we developed a multiple RNA alignment framework, X-INS-i, which builds a multiple alignment with an iterative method incorporating structural information through two components: (1) pairwise structural alignments by an external pairwise alignment method such as SCARNA or LaRA and (2) a new objective function, Four-way Consistency, derived from the base-pairing probability of every sub-aligned group at every multiple alignment stage.

Conclusion: The BRAliBASE benchmark showed that X-INS-i outperforms other methods currently available in the sum-of-pairs score (SPS) criterion. As a basis for predicting common secondary structure, the accuracy of the present method is comparable to or rather higher than those of the current leading methods such as RNA Sampler. The X-INS-i framework can be used for building a multiple RNA alignment from any combination of algorithms for pairwise RNA alignment and base-pairing probability. The source code is available at the webpage found in the Availability and requirements section.

Show MeSH
Related in: MedlinePlus