Reduced set of virulence genes allows high accuracy prediction of bacterial pathogenicity in humans.

Iraola G, Vazquez G, Spangenberg L, Naya H - PLoS ONE (2012)

Bottom Line: An accuracy of 95% using a cross-fold validation scheme with in-fold feature selection is obtained when classifying human pathogens and non-pathogens. A reduced subset of highly informative genes (120) is presented and applied to an external validation set. Also, we analyze which functional categories of virulence genes were more distinctive for pathogenicity in each taxonomic group, which seems to be a completely new kind of information and could lead to important evolutionary conclusions.


Affiliation: Unidad de Bioinformática, Institut Pasteur Montevideo, Montevideo, Uruguay.

ABSTRACT
Although there have been great advances in understanding bacterial pathogenesis, there is still a lack of integrative information about what makes a bacterium a human pathogen. The advent of high-throughput sequencing technologies has dramatically increased the number of completed bacterial genomes, for both known human pathogenic and non-pathogenic strains; this information is now available to investigate genetic features that determine pathogenic phenotypes in bacteria. In this work we determined presence/absence patterns of 814 different virulence-related genes among more than 600 finished bacterial genomes from both human pathogenic and non-pathogenic strains, belonging to different taxonomic groups (e.g., Actinobacteria, Gammaproteobacteria, Firmicutes). An accuracy of 95% using a cross-fold validation scheme with in-fold feature selection is obtained when classifying human pathogens and non-pathogens. A reduced subset of highly informative genes (120) is presented and applied to an external validation set. The statistical model was implemented in the BacFier v1.0 software (freely available at http://bacfier.googlecode.com/files/Bacfier_v1_0.zip), which displays not only the prediction (pathogen/non-pathogen) and an associated probability of pathogenicity, but also the presence/absence vector for the analyzed genes, so that it is possible to identify the subset of virulence genes responsible for the classification of the analyzed genome. Furthermore, we discuss the biological relevance for bacterial pathogenesis of the core set of genes, corresponding to eight functional categories, all with evident and documented association with the phenotypes of interest. Also, we analyze which functional categories of virulence genes were more distinctive for pathogenicity in each taxonomic group, which seems to be a completely new kind of information and could lead to important evolutionary conclusions.
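The phrase "cross-fold validation scheme with in-fold feature selection" means that the informative genes are re-selected inside every fold, using the training genomes only, so the held-out genomes never influence which genes are kept. The following is a minimal sketch of that idea on toy data. It is not the authors' statistical model or the BacFier implementation: the ranking rule (difference in presence frequency between classes), the nearest-centroid classifier, the function names, and the toy matrix are all assumptions made for illustration.

```python
import random

def select_genes(X, y, k):
    """Return the k gene (column) indices whose presence frequency differs
    most between pathogens (label 1) and non-pathogens (label 0).
    Assumed ranking rule, chosen only for illustration."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]

    def freq(rows, j):
        return sum(r[j] for r in rows) / len(rows) if rows else 0.0

    n_genes = len(X[0])
    scores = [abs(freq(pos, j) - freq(neg, j)) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:k]

def fit_centroids(X, y, genes):
    """Mean presence of each selected gene, computed per class."""
    centroids = {}
    for label in (0, 1):
        rows = [r for r, l in zip(X, y) if l == label]
        n = max(len(rows), 1)
        centroids[label] = [sum(r[j] for r in rows) / n for j in genes]
    return centroids

def predict(row, genes, centroids):
    """Assign the class whose centroid is closest (squared distance)."""
    def dist(label):
        return sum((row[j] - c) ** 2 for j, c in zip(genes, centroids[label]))
    return 1 if dist(1) < dist(0) else 0

def cross_validate(X, y, n_folds=5, k=2, seed=0):
    """Cross-validation in which gene selection is repeated inside each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Selection and fitting see the training genomes only.
        genes = select_genes(X_tr, y_tr, k)
        centroids = fit_centroids(X_tr, y_tr, genes)
        correct += sum(predict(X[i], genes, centroids) == y[i] for i in test_idx)
    return correct / len(X)

# Toy presence/absence matrix (made-up data, not from the paper):
# genes 0 and 1 track the label, genes 2-5 are uninformative.
y = [i % 2 for i in range(40)]
X = [[y[i], y[i], (i // 2) % 2, (i // 3) % 2, 1, 0] for i in range(40)]
accuracy = cross_validate(X, y, n_folds=5, k=2)
```

Selecting genes once on the full data set and then cross-validating would let the held-out genomes leak into the feature choice and inflate the accuracy estimate, which is what the in-fold scheme is meant to avoid.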


Figure 5 (pone-0042144-g005): Phylogenetic distribution of virulence genes. Each functional category of virulence-related genes is represented as a vertical bar. Positive values denote association of a particular functional category with pathogenic species of a certain taxonomic group, while negative values denote association with non-pathogenic species. Taxa are grouped according to phylogenetic relationships. In graph legend: ABC: ABC transporters, TCS&CH: two-component systems and chemotaxis, MOT&FLA: motility and flagellar assembly, TOX: toxins, SS: secretion systems, LPS: LPS biosynthesis.

Mentions: Figure 5 shows normalized frequency values for genes belonging to each functional category, taking into account the phylogenetic relationships between the studied taxonomic groups. Some expected patterns arise from these results; for example, toxins are overrepresented exclusively in pathogenic species. This is expected given the biological purpose of toxins: it would be highly improbable for pathogenicity in a certain species to be determined by the absence of a toxin that is present in the non-pathogenic species of the group. ABC transporters seem to be the most variable functional category along the phylogeny: the category is positive (associated with pathogenic organisms) in Gammaproteobacteria, Betaproteobacteria and Firmicutes, and negative (associated with non-pathogenic organisms) in Alphaproteobacteria and Actinobacteria. This is consistent with the wide range of functions that ABC transporters can perform. For example, the presence of amino acid importers can be essential for pathogenesis in species that have lost biosynthetic genes, yet this does not preclude the presence of this kind of transporter in non-pathogenic species.
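The sign convention described for the figure (positive for pathogen-associated categories, negative for non-pathogen-associated ones) can be illustrated with a small sketch. The text does not specify how the frequency values were normalized, so the score used here is an assumption: for each taxonomic group and functional category, the mean over genes of (presence frequency in pathogens minus presence frequency in non-pathogens). The gene names `toxA` and `abc1`, the function names, and the toy genomes are hypothetical and are not data from the paper.

```python
from collections import defaultdict

def presence_freq(gene, gene_sets):
    """Fraction of genomes (each a set of gene names) that carry `gene`."""
    if not gene_sets:
        return 0.0
    return sum(gene in s for s in gene_sets) / len(gene_sets)

def category_association(genomes, gene_category):
    """Signed association per taxon and functional category.

    genomes: list of dicts with keys 'taxon', 'pathogen' (bool), 'genes' (set).
    gene_category: dict mapping gene name -> functional category label.
    Assumed score: mean over genes of freq(pathogens) - freq(non-pathogens).
    """
    by_taxon = defaultdict(lambda: {True: [], False: []})
    for g in genomes:
        by_taxon[g["taxon"]][g["pathogen"]].append(g["genes"])

    result = {}
    for taxon, by_label in by_taxon.items():
        diffs = defaultdict(list)
        for gene, cat in gene_category.items():
            diff = (presence_freq(gene, by_label[True])
                    - presence_freq(gene, by_label[False]))
            diffs[cat].append(diff)
        result[taxon] = {cat: sum(v) / len(v) for cat, v in diffs.items()}
    return result

# Hypothetical genes and toy genomes, for illustration only.
gene_category = {"toxA": "TOX", "abc1": "ABC"}
genomes = [
    {"taxon": "Firmicutes", "pathogen": True, "genes": {"toxA", "abc1"}},
    {"taxon": "Firmicutes", "pathogen": False, "genes": set()},
    {"taxon": "Actinobacteria", "pathogen": True, "genes": {"toxA"}},
    {"taxon": "Actinobacteria", "pathogen": False, "genes": {"abc1"}},
]
scores = category_association(genomes, gene_category)
```

In this toy input the ABC score is positive for Firmicutes and negative for Actinobacteria, while the TOX score is positive in both groups, mirroring the qualitative pattern described above without reproducing the paper's values.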

