Fastphylo: fast tools for phylogenetics.

Khan MA, Elias I, Sjölund E, Nylander K, Guimera RV, Schobesberger R, Schmitzberger P, Lagergren J, Arvestad L - BMC Bioinformatics (2013)

Bottom Line: We present fastphylo, a software package containing implementations of efficient algorithms for two common problems in phylogenetics: estimating DNA/protein sequence distances and reconstructing a phylogeny from a distance matrix. Fastphylo is a fast, memory-efficient, and easy-to-use software suite. Due to its modular architecture, fastphylo is a flexible tool for many phylogenetic studies.


Affiliation: KTH Royal Institute of Technology, Science for Life Laboratory, School of Computer Science and Communication, Department of Computational Biology, Solna, Sweden. malagori@kth.se.

ABSTRACT

Background: Distance methods are ubiquitous tools in phylogenetics. Their primary purpose may be to reconstruct evolutionary history, but they are also used as components in bioinformatic pipelines. However, poor computational efficiency has been a constraint on the applicability of distance methods on very large problem instances.

Results: We present fastphylo, a software package containing implementations of efficient algorithms for two common problems in phylogenetics: estimating DNA/protein sequence distances and reconstructing a phylogeny from a distance matrix. We compare fastphylo with other neighbor-joining-based methods and report the results in terms of speed and memory efficiency.

Conclusions: Fastphylo is a fast, memory-efficient, and easy-to-use software suite. Due to its modular architecture, fastphylo is a flexible tool for many phylogenetic studies.


Figure 8: Time and memory comparison between fnj and RapidNJ. This analysis was performed on dataset-2. fnj and RapidNJ performed almost identically in the time analysis. The memory consumption panel shows that fnj uses slightly more memory in certain cases, but the overall difference is not large.

Mentions: To further investigate the delay in the fastdist-fnj pipe, we split the experiment into two phases: 1) compute the distance matrix separately; and 2) compute the phylogenetic tree using the distance matrix as input to the neighbour joining tools considered in this study. The results of these investigations are presented in Figures 7 and 8, respectively. Figure 7 shows the time and memory comparison of the NJ tools for computing the distance matrices. It is evident that RapidNJ outperforms all the other tools; it is ∼2 times faster than fastdist (see Figure 7a). However, RapidNJ’s memory consumption increases quadratically with the number of sequences, while fastdist’s memory utilization increases linearly (see Figure 7c). In Figure 7c, we report the results of RapidNJ only up to 85,000 taxa because of the memory limit for this experiment, i.e. 24 GB RAM. RapidNJ computed distance matrices for 17 gene families ranging in size from 5,000 to 85,000 sequences, while fastdist computed distance matrices for all 20 gene families, ranging in size from 5,000 to 100,000 sequences, within the allocated memory. We can therefore attribute the delay in the fastdist-fnj pipe, when compared to RapidNJ, in Figures 3 and 5 to the slow computation of distance matrices by the fastdist program.
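
The contrast between quadratic and linear memory growth in phase 1 can be illustrated with a small sketch. It is not the fastdist or RapidNJ implementation: the Jukes-Cantor model, the row-at-a-time generator, and the names `jc69_distance`, `distance_rows`, and `full_matrix_entries` are assumptions made purely for illustration. The idea is that a tool holding the whole matrix needs storage for n × n entries, whereas a tool that emits one row at a time only needs the n sequences plus a single row of n entries.

```python
import math
from typing import Iterator, List


def jc69_distance(a: str, b: str) -> float:
    """Jukes-Cantor (JC69) distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Use only columns where both sequences carry an unambiguous base.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / len(pairs)  # fraction of differing sites
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf  # saturated: the JC69 correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_rows(seqs: List[str]) -> Iterator[List[float]]:
    """Yield the distance matrix one row at a time (hypothetical streaming layout).

    Only one row of len(seqs) entries is alive at any moment, so the extra
    memory grows linearly with the number of sequences. For simplicity each
    symmetric pair is computed twice.
    """
    for a in seqs:
        yield [jc69_distance(a, b) for b in seqs]


def full_matrix_entries(n: int) -> int:
    """Number of entries in a dense n x n matrix held entirely in memory."""
    return n * n


if __name__ == "__main__":
    seqs = ["ACGT", "ACGA", "ACGT"]
    for row in distance_rows(seqs):
        print(" ".join(f"{d:.4f}" for d in row))
    # Growth when going from 85,000 to 100,000 sequences.
    print(full_matrix_entries(100_000) / full_matrix_entries(85_000))
    print(100_000 / 85_000)
```

Under these assumptions, going from 85,000 to 100,000 sequences multiplies the number of matrix entries by (100,000 / 85,000)², roughly 1.38, while the number of sequences grows only by a factor of about 1.18. The sketch makes no claim about how many bytes either tool actually uses per entry.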

