EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A.

Ndhlovu A, Durand PM, Hazelhurst S - Database (Oxford) (2015)

Bottom Line: To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a resource for evolutionary and phylogenetic studies and presents a tier of information untapped by current databases.


Affiliation: Evolutionary Medicine Laboratory, Department of Molecular Medicine and Haematology, Faculty of Health Sciences, Sydney Brenner Institute of Molecular Bioscience, The Mount, 9 Jubilee Road, Parktown 2193, Johannesburg, South Africa, andrew.ndhlovu@students.wits.ac.za.


Figure 2. The EvoDB web interface allows for easy query and download of data. The database can be queried using PFAM-A domain identifiers and accession identifiers. The results shown here are for the tumor suppressor p53 domain. The CODEML ‘mlc’ and ‘rst’ analysis results for the M1a and M2a models are provided, along with a summary of results for viewing. Graphical plots of evolutionary rate profiles can also be viewed or downloaded in various image file formats. EvoDB provides an interface for downloading the corresponding nucleotide sequences of PFAM protein domain families.

Mentions: EvoDB is a flat-file database of evolutionary rate profiles, associated gapped nucleotide alignments, phylogenetic trees and corresponding PFAM alignments for the PFAM-A seed alignments database. The database statistics are provided in Table 1. EvoDB contains a total of 501,375 nucleotide sequences; 176,757 sequences (26%) could not be retrieved, mostly due to annotation errors, an increasing challenge that has not been addressed since the work of (8). Additionally, the corresponding phylogenetic trees, PFAM-A alignments and accession identifier data for all sequences, including those that could not be retrieved, are provided in the database. Evolutionary rate profiles were determined for 97.1% of PFAM-A entries under the M2a model. In addition to these profiles, CODEML analysis results for the M1a and M2a models are provided for comparison and hypothesis testing. Future versions of EvoDB will provide data for the M0, M7 and M8 models. The efficacy of the model used to determine an evolutionary profile can be assessed using the log-likelihood values or the Likelihood Ratio Test (LRT) (3), both available from the CODEML ‘mlc’ and ‘rst’ files provided. While we provide the evolutionary rate profiles (under M2a) for all the domains in EvoDB, the caveat is that calculation of dN/dS may be inappropriate for sequences that have become highly diverged (say, over millions of years) or for closely related sequences. We suggest a criterion of total branch dS in the range 0.1 to 0.9, as found in the CODEML ‘mlc’ file; domains not meeting this criterion may not be appropriate for dN/dS calculation. Users of the web interface are cautioned if a domain has a sequence length of less than 100 nucleotides or a total dS value outside this range. However, we provide this as a guideline and suggest caution and further interrogation when using dN/dS profiles from domains that do not meet this criterion.
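The guideline above can be sketched as a small check. This is only an illustration, not EvoDB's own code: the function names are hypothetical, the range bounds are treated as inclusive by assumption, and the parser assumes the total branch dS appears in the ‘mlc’ file on a line of the form "tree length for dS: <number>".

```python
import re

# Thresholds taken from the text; treating the bounds as inclusive is an assumption.
DS_MIN, DS_MAX = 0.1, 0.9
MIN_LENGTH_NT = 100


def total_ds_from_mlc(mlc_text):
    """Return the total tree dS found in mlc text, or None if absent.

    Assumes a line such as 'tree length for dS:   0.4500' is present.
    """
    match = re.search(r"tree length for dS:\s*([0-9.]+)", mlc_text)
    return float(match.group(1)) if match else None


def cautions(total_ds, length_nt):
    """List the reasons a domain's dN/dS profile should be treated with care."""
    notes = []
    if length_nt < MIN_LENGTH_NT:
        notes.append("sequence length below 100 nucleotides")
    if total_ds is None:
        notes.append("total dS not found")
    elif not (DS_MIN <= total_ds <= DS_MAX):
        notes.append("total dS outside the 0.1 to 0.9 range")
    return notes


# Hypothetical mlc fragment, for illustration only.
sample = "tree length for dN:       0.2100\ntree length for dS:       0.4500\n"
print(cautions(total_ds_from_mlc(sample), length_nt=300))
```

An empty list means neither caution applies; as in the text, a non-empty list is a prompt for further interrogation rather than a reason to discard the domain.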
On the other hand, the sequence data and trees are provided; therefore, different models can be run and assessed using the log-likelihood values or the LRT (3). The web interface for EvoDB was developed with PHP and JavaScript and can be queried by PFAM accession numbers or identifiers. Query results provide links to all the EvoDB data for the corresponding domain (Figure 2). The EvoDB database and release notes are available for download at http://www.bioinf.wits.ac.za/software/fire/evodb.
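As a minimal sketch of the LRT mentioned above, the nested M1a and M2a models can be compared from their log-likelihoods. The function name and the log-likelihood values below are hypothetical, not taken from EvoDB. The sketch assumes the usual comparison with 2 degrees of freedom, for which the chi-square tail probability has the closed form exp(-x/2).

```python
import math


def lrt_m1a_vs_m2a(lnl_m1a, lnl_m2a):
    """Likelihood ratio test of M2a (alternative) against M1a (null).

    Returns the test statistic 2 * (lnL_M2a - lnL_M1a) and its p-value
    under a chi-square distribution with 2 degrees of freedom.
    """
    # Clamp at zero: small negative values can arise from optimisation noise.
    statistic = max(0.0, 2.0 * (lnl_m2a - lnl_m1a))
    # Survival function of chi-square with df = 2 is exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods, as might be read from two 'mlc' files.
statistic, p_value = lrt_m1a_vs_m2a(lnl_m1a=-2345.67, lnl_m2a=-2340.12)
print(f"2*deltaLnL = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value favours M2a, the model that allows a class of positively selected sites, over M1a.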

