FUNYBASE: a FUNgal phYlogenomic dataBASE.

Marthey S, Aguileta G, Rodolphe F, Gendrault A, Giraud T, Fournier E, Lopez-Villavicencio M, Gautier A, Lebrun MH, Chiapello H - BMC Bioinformatics (2008)

Bottom Line: To assess the informative value of each ortholog cluster, each was compared to a reference species tree constructed using a concatenation of roughly half of the 246 sequences that are best approximated by the WAG evolutionary model. The orthologs were classified according to a topological score, which measures their ability to recover the same topology as the reference species tree. Examples of fruitful utilization of FUNYBASE for investigation of fungal phylogenetics are also presented.


Affiliation: UR MIG, INRA, Bâtiment 233 Domaine de Vilvert 78350, Cedex, France. sylvain.marthey@jouy.inra.fr

ABSTRACT

Background: The increasing availability of fungal genome sequences provides large numbers of proteins for evolutionary and phylogenetic analyses. However, the heterogeneity of the data, including the variable quality of genome annotation and the difficulty of retrieving true orthologs, makes such investigations challenging. The aim of this study was to provide a reliable and integrated resource of orthologous gene families for comparative and phylogenetic analyses in fungi.

Description: FUNYBASE is a database dedicated to the analysis of fungal single-copy genes extracted from available fungal genome sequences, their classification into reliable clusters of orthologs, and the assessment of their informative value for phylogenetic reconstruction based on amino acid sequences. The current release of FUNYBASE contains two types of protein data: (i) a complete set of protein sequences extracted from 30 public fungal genomes and classified into clusters of orthologs using a robust automated procedure, and (ii) a subset of 246 reliable ortholog clusters present as single-copy genes in 21 fungal genomes. For each of these 246 ortholog clusters, phylogenetic trees were reconstructed based on their amino acid sequences. To assess the informative value of each ortholog cluster, each was compared to a reference species tree constructed using a concatenation of roughly half of the 246 sequences that are best approximated by the WAG evolutionary model. The orthologs were classified according to a topological score, which measures their ability to recover the same topology as the reference species tree. The full results of these analyses are available online through a user-friendly interface that supports searches by species name, ortholog cluster, or keyword, as well as BLAST searches. Examples of fruitful utilization of FUNYBASE for investigation of fungal phylogenetics are also presented.
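The abstract does not give the formula for the topological score, so the following is only an illustrative sketch, not the authors' method. It assumes the score is the percentage of non-trivial bipartitions (splits) of the reference species tree that are also found in a gene tree, which is one common way to quantify topological agreement. Trees are written as nested tuples of species labels; the function names and the toy five-species trees are hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

# A tree is either a leaf label or a tuple of subtrees (assumed toy encoding).
Tree = Union[str, Tuple["Tree", ...]]


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Collect all leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the non-trivial splits of the tree, viewed as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    anchor leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    splits: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - leaves if anchor in leaves else leaves
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return leaves

    visit(tree)
    return splits


def topological_score(gene_tree: Tree, reference_tree: Tree) -> float:
    """Percentage (0 to 100) of reference splits recovered by the gene tree.

    ASSUMPTION: this definition is illustrative, not taken from FUNYBASE.
    """
    if leaf_set(gene_tree) != leaf_set(reference_tree):
        raise ValueError("trees must contain the same set of species")
    reference_splits = bipartitions(reference_tree)
    if not reference_splits:
        return 100.0
    shared = reference_splits & bipartitions(gene_tree)
    return 100.0 * len(shared) / len(reference_splits)


# Hypothetical toy trees over five placeholder species A to E.
reference = ((("A", "B"), "C"), ("D", "E"))
gene = ((("A", "C"), "B"), ("D", "E"))
print(topological_score(reference, reference))
print(topological_score(gene, reference))
```

Under this assumed definition, a gene tree identical to the reference scores 100, and each reference split that the gene tree fails to recover lowers the score proportionally.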

Conclusion: FUNYBASE constitutes a novel and useful resource for two types of analyses: (i) comparative studies can be greatly facilitated by reliable clusters of orthologs across sets of user-defined fungal genomes, and (ii) phylogenetic reconstruction can be improved by identifying genes with the highest informative value at the desired taxonomic level.


Figure 1: FUNYBASE Pipeline. Scheme showing the main steps in the construction of the ortholog clusters and their subsequent phylogenetic analysis (for more details see [10]).

Mentions: - the subset of 246 families of single-copy orthologs obtained from 21 genomes, on which further phylogenetic analyses were performed (Fig. 1) [11]. This subset of 21 genomes was chosen as a set of fungal genome sequences with reliable gene prediction (see Ref. [11] for more details). For each of these 246 ortholog clusters, FUNYBASE provides the amino-acid substitution model that best fits the data, the available annotation for the family, the mean identity percentage of the sequences in the family, the number of variable sites, the aligned proteins, the corresponding phylogenetic tree, and its similarity to the tree resulting from the concatenated dataset (i.e., its topological score, an index ranging from 0 to 100; see Ref. [11] for more details).
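As a rough illustration of how the per-cluster information listed above could be used to pick informative genes, the sketch below models one cluster as a small record and filters by topological score and substitution model. The class, field names, function, and all example values are hypothetical placeholders; they are not the FUNYBASE schema and not real FUNYBASE data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OrthologCluster:
    """Hypothetical record mirroring the per-cluster fields described above."""
    cluster_id: str
    best_fit_model: str        # best-fitting amino-acid substitution model
    annotation: str            # available annotation for the family
    mean_identity_pct: float   # mean identity percentage of the sequences
    variable_sites: int        # number of variable sites in the alignment
    topological_score: float   # similarity to the concatenated tree, 0 to 100


def most_informative(
    clusters: List[OrthologCluster],
    min_score: float = 80.0,
    model: Optional[str] = None,
) -> List[OrthologCluster]:
    """Keep clusters at or above min_score, optionally restricted to one
    substitution model, sorted from highest to lowest topological score."""
    selected = [
        c for c in clusters
        if c.topological_score >= min_score
        and (model is None or c.best_fit_model == model)
    ]
    return sorted(selected, key=lambda c: c.topological_score, reverse=True)


# Made-up placeholder records, for illustration only.
example_clusters = [
    OrthologCluster("cluster_a", "WAG", "placeholder annotation", 62.5, 410, 95.0),
    OrthologCluster("cluster_b", "WAG", "placeholder annotation", 48.0, 530, 70.0),
    OrthologCluster("cluster_c", "other", "placeholder annotation", 55.1, 380, 88.0),
]

for cluster in most_informative(example_clusters, min_score=80.0):
    print(cluster.cluster_id, cluster.topological_score)
```

The 80-point threshold is arbitrary; a suitable cutoff would depend on the taxonomic level under study.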

