Histoplasma yeast and mycelial transcriptomes reveal pathogenic-phase and lineage-specific gene expression profiles.

Edwards JA, Chen C, Kemski MM, Hu J, Mitchell TK, Rappleye CA - BMC Genomics (2013)

Bottom Line: The close similarity in the genome sequences of these diverse strains suggests that phenotypic variations result from differences in gene expression rather than gene content. Comparison of the yeast and mycelial transcriptomes highlights genes encoding virulence factors as well as those involved in protein glycosylation, alternative metabolism, lipid remodeling, and cell wall glycanases that may contribute to Histoplasma pathogenesis. These studies lay an essential foundation for understanding how gene expression variations contribute to the strain- and phase-specific virulence differences of Histoplasma.


Affiliation: The Department of Microbiology, Ohio State University, 484 W. 12th Ave., Columbus, OH 43210, USA. rappleye.1@osu.edu.

ABSTRACT

Background: The dimorphic fungus Histoplasma capsulatum causes respiratory and systemic disease in mammalian hosts by expression of factors that enable survival within phagocytic cells of the immune system. Histoplasma's dimorphism is distinguished by growth either as avirulent mycelia or as pathogenic yeast. Geographically distinct strains of Histoplasma differ in their relative virulence in mammalian hosts and in production of and requirement for specific virulence factors. The close similarity in the genome sequences of these diverse strains suggests that phenotypic variations result from differences in gene expression rather than gene content. To provide insight into how the transcriptional program translates into morphological variation and the pathogenic lifestyle, we compared the transcriptional profile of the pathogenic yeast phase and the non-pathogenic mycelial phase of two clinical isolates of Histoplasma.

Results: To overcome inaccuracies in the ab initio annotation of the Histoplasma genome, we used RNA-seq methodology to generate gene structure models based on experimental evidence. Quantitative analyses of the sequencing reads revealed that 6% to 9% of genes are differentially regulated between the two phases. RNA-seq-based mRNA quantitation was strongly correlated with gene expression levels determined by quantitative RT-PCR. Comparison of the yeast-phase transcriptomes between strains showed that 7.6% of all genes have lineage-specific expression differences, including genes contributing, or potentially related, to pathogenesis. GFP-transcriptional fusions and their introduction into both strain backgrounds revealed that the difference in transcriptional activity of individual genes reflects variations in both cis- and trans-acting factors between Histoplasma strains.

Conclusions: Comparison of the yeast and mycelial transcriptomes highlights genes encoding virulence factors as well as those involved in protein glycosylation, alternative metabolism, lipid remodeling, and cell wall glycanases that may contribute to Histoplasma pathogenesis. These studies lay an essential foundation for understanding how gene expression variations contribute to the strain- and phase-specific virulence differences of Histoplasma.



Figure 3: Comparison of RNA-seq-derived gene models with Histoplasma ab initio gene predictions. The accuracy of the RNA-seq-derived and ab initio gene models for G186A was measured as the frequency of mRNA reads that match the modeled gene structures (A), the percentage of exon structures with mRNA experimental support (B), and direct sequencing of mRNAs (C-E). (A) Percentages indicate the number of cDNA library reads that match exons (blue), introns (red), or intergenic regions (green), or that span multiple regions (yellow), in the RNA-seq-derived or ab initio gene set models. (B) Accuracy of the exon definition is indicated by the percentage of exons with perfect support (blue; at least 99% of the exon length is covered by mRNA reads), fair support (red; 70% to 99% of the exon length is covered by mRNA reads), or poor support (green; less than 70% of the exon length is covered by mRNA reads). (C-E) Schematics of gene structures are shown as exons (horizontal boxes below the x-axis) for RNA-seq-derived models (red) and the ab initio predictions (blue). The horizontal axis represents the genome sequence in that interval. The vertical histogram (grey bars) depicts the frequency of mRNA reads that match that particular region of the genome sequence. Models are depicted for the MFS5 gene (C), which encodes an MFS-family transporter, and for the HYP12 gene (D) and the HYP13 gene (E), two genes encoding factors of unknown function.
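The exon support categories in panel (B) can be illustrated with a minimal sketch. It assumes half-open `(start, end)` genome coordinates and takes coverage to be the fraction of exon bases overlapped by at least one read; the function names and the interval representation are hypothetical and are not taken from the study's pipeline.

```python
def covered_fraction(exon, reads):
    """Fraction of the exon's length covered by at least one read.

    exon and reads are half-open (start, end) intervals (an assumption
    made for this sketch).
    """
    start, end = exon
    # Keep only reads that overlap the exon, clipped to the exon boundaries.
    clipped = sorted(
        (max(s, start), min(e, end)) for s, e in reads if s < end and e > start
    )
    covered = 0
    cursor = start  # rightmost position already counted as covered
    for s, e in clipped:
        if e > cursor:
            covered += e - max(s, cursor)
            cursor = e
    return covered / (end - start)


def support_class(fraction):
    """Map a coverage fraction to the caption's categories."""
    if fraction >= 0.99:
        return "perfect"
    if fraction >= 0.70:
        return "fair"
    return "poor"
```

For example, an exon at `(100, 200)` with a single read at `(100, 180)` has 80% of its length covered and would fall in the fair category.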

Mentions: To determine the improvement in accuracy of the gene definitions resulting from RNA-seq, we compared our G186A gene models with the current ab initio G186A gene predictions (http://www.broadinstitute.org/annotation/genome/histoplasma-_capsulatum/MultiHome.html). Transcriptome sequencing yielded 126 more genes than the ab initio predictions. The total length of exon regions from RNA-seq is 17.3 Mb (56.7% of the genome), compared with 13.8 Mb (45.2% of the genome) in the ab initio predictions. To further compare the sensitivity of the gene definitions from RNA-seq with the ab initio gene models, we analyzed where mRNA reads aligned in the respective gene models (RNA-seq-based or ab initio predictions). A read with > 95% of its length aligning to a region defined as an exon was considered strong experimental validation of the locus. By these strict criteria, 72% of the G186A mRNA reads matched the RNA-seq-derived gene structures (Figure 3A). In contrast, only 54% of the mRNA reads matched the ab initio gene predictions. A similarly small proportion of reads aligned to intron regions in both data sets (0.47% and 1% for RNA-seq and ab initio gene models, respectively). Reads aligning to introns or overlapping multiple region classifications are not unexpected, given partially processed RNAs in the transcriptome library and the possibility of alternative splicing events [36]. This indicates that the mRNA evidence supports the RNA-seq-derived gene set more strongly than the ab initio gene predictions. In addition, there are notable differences between the introns defined in the RNA-seq-based gene structures and those in the ab initio predictions. The RNA-seq data show that 90% of introns are between 54 and 237 bp in size (Table 1). The ab initio predictions are slightly broader, with the middle 90% ranging from 51 to 365 bp. Notably, the introns of the ab initio predicted gene set range overall from 11 to 1566 bp in size, including 1597 introns larger than 300 bp. These longer and shorter intron sizes in the ab initio predictions are not supported by the mRNA reads, suggesting prediction errors in the ab initio exon-intron definitions. These data indicate that the RNA-seq-based annotation greatly improves the accuracy of exon boundaries and overall gene definitions.
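The read classification described above can be sketched as follows. Only the > 95% exon criterion comes from the text; applying the same threshold to introns and intergenic sequence, the half-open `(start, end)` coordinates, the requirement that exon and intron intervals do not overlap one another, and all names are assumptions made for illustration.

```python
def overlap(a, b):
    """Number of bases shared by two half-open (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def classify_read(read, exons, introns, threshold=0.95):
    """Assign a read to 'exon', 'intron', 'intergenic', or 'multiple'.

    A read is assigned to a class when more than `threshold` of its length
    falls in regions of that class; otherwise it spans multiple classes.
    Exon and intron intervals are assumed not to overlap each other.
    """
    length = read[1] - read[0]
    exon_bp = sum(overlap(read, e) for e in exons)
    intron_bp = sum(overlap(read, i) for i in introns)
    intergenic_bp = length - exon_bp - intron_bp
    for label, bp in (
        ("exon", exon_bp),
        ("intron", intron_bp),
        ("intergenic", intergenic_bp),
    ):
        if bp / length > threshold:
            return label
    return "multiple"
```

Running this over all reads once per gene set (RNA-seq-derived and ab initio) and tallying the labels would give percentages of the kind reported in Figure 3A.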

