Novel algorithms reveal streptococcal transcriptomes and clues about undefined genes.

Ryan PA, Kirk BW, Euler CW, Schuch R, Fischetti VA - PLoS Comput. Biol. (2007)

Bottom Line: We applied this method to our own data and to those of others, and we show that it identified a greater number of differentially expressed genes, facilitating the reconstruction of more multimeric proteins and complete metabolic pathways than would have been possible without its application. We assessed the biological significance of two identified genes by assaying deletion mutants for adherence in vitro and show that neighbor clustering indeed provides biologically relevant data. Neighbor clustering provides a more comprehensive view of the molecular responses of streptococci during pharyngeal cell adherence.


Affiliation: Department of Bacterial Pathogenesis and Immunology, Rockefeller University, New York, New York, USA. ryanp@mail.rockfeller.edu

ABSTRACT
Bacteria-host interactions are dynamic processes, and understanding transcriptional responses that directly or indirectly regulate the expression of genes involved in initial infection stages would illuminate the molecular events that result in host colonization. We used oligonucleotide microarrays to monitor (in vitro) differential gene expression in group A streptococci during pharyngeal cell adherence, the first overt infection stage. We present neighbor clustering, a new computational method for further analyzing bacterial microarray data that combines two informative characteristics of bacterial genes that share common function or regulation: (1) similar gene expression profiles (i.e., co-expression); and (2) physical proximity of genes on the chromosome. This method identifies statistically significant clusters of co-expressed gene neighbors that potentially share common function or regulation by coupling statistically analyzed gene expression profiles with the chromosomal position of genes. We applied this method to our own data and to those of others, and we show that it identified a greater number of differentially expressed genes, facilitating the reconstruction of more multimeric proteins and complete metabolic pathways than would have been possible without its application. We assessed the biological significance of two identified genes by assaying deletion mutants for adherence in vitro and show that neighbor clustering indeed provides biologically relevant data. Neighbor clustering provides a more comprehensive view of the molecular responses of streptococci during pharyngeal cell adherence.


pcbi-0030132-g002: Statistically Significant Neighbor Clusters in the SF370 Genome
Clusters that satisfy the neighbor cluster definition are plotted by GenomeSpyer. Yellow boxes denote boundaries of significant neighbor clusters (PK value < 0.05). Genes, located on the x-axis, are identified by their Spy numbers from the annotated SF370 genome (deleted gene numbers result during genome updates); log2-fold change in expression values (adherence versus associated streptococci) is indicated on the y-axis. Genes designated by green lines have statistically significant PF values (log2-fold change P values < 0.05) and PE values (expression P values < 0.05). Genes designated by blue lines do not have statistically significant PF values, but as a result of membership in a designated neighbor cluster have statistically significant PE values (PE values < 0.05). Genes designated by gray lines do not have statistically significant PF or PE values.
(A) Whole-genome view of 47 statistically significant neighbor clusters identified in the SF370 genome during adherence to pharyngeal cells.
(B) Enlarged view of representative Type I cluster encoding folate biosynthesis genes (spy1096–1100). Type I clusters contain only genes of known or defined function. See text for further descriptions of all clusters. Spy numbers are indicated above the bars corresponding to each gene.
(C) Enlarged view of representative Type II cluster containing spy0127–0130. Type II clusters contain a combination of both functionally defined and unknown gene members.
(D) Enlarged view of representative Type III cluster containing phage-encoded genes of unknown function (spy0961–0965). Type III clusters contain only genes of unknown or undefined function.
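The color coding described in the caption amounts to a small decision rule on each gene's PF and PE values. The sketch below is only an illustration of that rule, not code from GenomeSpyer: the function name `line_color` and its parameters are hypothetical, and the case of a significant PF value with a non-significant PE value is not covered by the caption, so the sketch returns `None` for it.

```python
from typing import Optional

ALPHA = 0.05  # significance threshold quoted in the caption


def line_color(p_f: float, p_e: float, alpha: float = ALPHA) -> Optional[str]:
    """Map a gene's PF and PE values to the line color named in the caption.

    p_f: P value for the log2-fold change (PF).
    p_e: expression P value (PE).
    Hypothetical helper; the names are assumptions made for illustration.
    """
    f_significant = p_f < alpha
    e_significant = p_e < alpha

    if f_significant and e_significant:
        return "green"  # significant PF and PE
    if not f_significant and e_significant:
        return "blue"   # PE significant through cluster membership only
    if not f_significant and not e_significant:
        return "gray"   # neither value significant
    return None         # PF significant, PE not: not described in the caption
```

Under this reading, a blue gene is one that would have been missed by the fold-change test alone and was recovered only through its neighbor cluster.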

Mentions: We visually inspected the resulting clusters and disqualified those that violated our neighbor cluster definition (see Methods for details). All output prior to cluster disqualifications is included for comparison (see Table S4). Of the 309 qualifying clusters (Table S5), 197 (63.8%) were composed entirely of known, functionally defined genes; however, 26 (13%) of these were incorrectly assembled, as they contained known genes that are functionally unrelated. Because we did not incorporate functional annotations of genes into the algorithms (i.e., to keep the analysis “blind”), we anticipated the possibility that some groupings could be assembled incorrectly despite the statistical framework for assigning clusters. Of the remaining 283 (91.6%) groupings, a number of differently sized clusters contained the same gene (Table S5). We report such clusters first by highest significance (lowest PK value), then by largest number of genes. Thus, if clusters containing a particular gene were of equal significance, we report the cluster with the most gene members. This method identified 47 significant clusters containing 173 differentially expressed genes (listed in Table 3 and visualized in Figures 2 and S2–S4), a considerably larger group than could have been compiled using only the initial 79 significant genes. A total of 56 of the original 79 significant genes became components of significant clusters, whereas 23 remained unclustered.
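The reporting rule in the passage above (lowest PK value first, then the largest number of genes) can be sketched as a sort followed by a greedy pass. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Cluster` type and `report_clusters` function are hypothetical, and treating "clusters containing the same gene" as "skip any cluster that shares a gene with one already reported" is an assumed reading of the text.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Cluster:
    genes: Tuple[str, ...]  # gene identifiers in chromosomal order
    p_k: float              # cluster P value (PK)


def report_clusters(candidates: List[Cluster]) -> List[Cluster]:
    """Pick one cluster per overlapping group of candidates.

    Candidates are ranked by ascending PK (highest significance first);
    ties are broken by the larger number of genes. A cluster is reported
    only if none of its genes already belongs to a reported cluster
    (assumed interpretation of the overlap rule).
    """
    ranked = sorted(candidates, key=lambda c: (c.p_k, -len(c.genes)))
    reported: List[Cluster] = []
    used: Set[str] = set()
    for cluster in ranked:
        if used.isdisjoint(cluster.genes):
            reported.append(cluster)
            used.update(cluster.genes)
    return reported
```

For example, given two candidates with the same PK value, one covering genes `g1`, `g2` and another covering `g1`, `g2`, `g3` (invented identifiers), only the three-gene cluster would be reported.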

