Identification of biomarkers for tuberculosis susceptibility via integrated analysis of gene expression and longitudinal clinical data.

Luo Q, Mehra S, Golden NA, Kaushal D, Lacey MR - Front Genet (2014)

Bottom Line: The clinical profiles associated with the animals following Mtb exposure revealed considerable variability, and we developed models for the disease trajectory for each subject using a Bayesian hierarchical B-spline approach. Disease severity estimates were derived from these fitted curves and included as covariates in linear models to identify genes significantly associated with disease progression. Our results demonstrate that the incorporation of clinical data increases the value of information extracted from the expression profiles and contributes to the identification of predictive biomarkers for TB susceptibility.


Affiliation: Mathematics Department, Tulane University, New Orleans, LA, USA.

ABSTRACT
Tuberculosis (TB) is an infectious disease caused by the bacterium Mycobacterium tuberculosis (Mtb) that affects millions of people worldwide. The majority of individuals who are exposed to Mtb develop latent infections, in which an immunological response to Mtb antigens is present but there is no clinical evidence of disease. Because currently available tests cannot differentiate latent individuals who are at low risk from those who are highly susceptible to developing active disease, there is considerable interest in the identification of diagnostic biomarkers that can predict reactivation of latent TB. We present results from our analysis of a controlled longitudinal experiment in which a group of rhesus macaques was exposed to a low dose of Mtb to study their progression to latent infection or active disease. Subsets of the animals were then euthanized at scheduled time points, and granulomas taken from their lungs were assayed for gene expression using microarrays. The clinical profiles associated with the animals following Mtb exposure revealed considerable variability, and we developed models for the disease trajectory for each subject using a Bayesian hierarchical B-spline approach. Disease severity estimates were derived from these fitted curves and included as covariates in linear models to identify genes significantly associated with disease progression. Our results demonstrate that the incorporation of clinical data increases the value of information extracted from the expression profiles and contributes to the identification of predictive biomarkers for TB susceptibility.


Figure 5: Enriched SP-PIR keywords for the set of unique gene IDs associated with post-exposure time T and/or final severity score S at the α = 0.05 significance level. For each keyword, the distribution of associated genes by model cluster is displayed using colored bars, with Cluster 1 on the left (light blue) and Cluster 6 on the right (yellow).

Mentions: The set of all unique gene IDs included in the six clusters was imported into the DAVID bioinformatics tool suite and analyzed for functional annotation. We found that this subset was significantly enriched for 181 GO Biological Process (BP) terms, 50 GO Cellular Component (CC) terms, 7 GO Molecular Function (MF) terms, 8 KEGG Pathways, 77 Swiss-Prot Protein Information Resource (SP-PIR) Keywords, and 5 UniProt Sequence Annotation (UP-SEQ) Features. To avoid redundancy, we present results for the 56 statistically significant SP-PIR keywords that were unambiguously defined in the Swiss-Prot controlled vocabulary of keywords (www.uniprot.org/docs/keywlist). Figure 5 displays these terms along with the relative proportion of gene IDs included within each expression profile cluster. Based on the observed numbers of gene IDs in each cluster, if the gene IDs represented by a given keyword were randomly associated with the set of six clusters, we would expect the percentage of gene IDs by cluster to be distributed as follows: 42.4% in Cluster 1, 2.2% in Cluster 2, 5.5% in Cluster 3, 39.7% in Cluster 4, 1.6% in Cluster 5, and 8.6% in Cluster 6. Chi-square tests for random association of enriched genes with cluster membership identified 24 terms that were consistent with the expected cluster distribution, while the remaining 32 terms deviated significantly from the expected proportions. For a few terms, the deviations reflected an imbalance of gene IDs associated with quadratic temporal increases or decreases (Clusters 2 and 4), which would be expected to be observed in nearly equal proportions. For example, for the keyword “Hormone”, 31 of 33 gene IDs were associated with Cluster 2 (p = 0.005), while 14 of the 15 gene IDs associated with the keyword “Ubiquinone” were contained in Cluster 4.
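The per-keyword comparison against the expected cluster distribution can be illustrated with a short sketch. It is not the authors' code: it assumes a plain Pearson chi-square goodness-of-fit test with 5 degrees of freedom (six clusters minus one), and the two count vectors and all function names are hypothetical, chosen only for illustration. The expected percentages are the ones quoted in the text.

```python
import math

# Expected share of gene IDs per cluster (percent), as quoted in the text.
EXPECTED_PERCENT = [42.4, 2.2, 5.5, 39.7, 1.6, 8.6]


def chi_square_statistic(observed, expected_percent=EXPECTED_PERCENT):
    """Pearson goodness-of-fit statistic for per-cluster gene-ID counts."""
    n = sum(observed)
    total = sum(expected_percent)
    # Expected count in each cluster if the keyword's genes were assigned at random.
    expected = [n * p / total for p in expected_percent]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value_df5(x):
    """Upper-tail probability of a chi-square variable with 5 degrees of freedom."""
    if x <= 0:
        return 1.0
    root = math.sqrt(x)
    # Closed form for odd degrees of freedom, specialised to df = 5.
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 / math.pi) * math.exp(
        -x / 2.0
    ) * (root + root ** 3 / 3.0)


# Hypothetical counts for two made-up keywords with 100 gene IDs each
# (assumptions for illustration, not data from the paper).
balanced = [42, 2, 6, 40, 2, 8]   # close to the expected proportions
skewed = [70, 2, 5, 15, 1, 7]     # over-represented in Cluster 1

if __name__ == "__main__":
    for name, counts in (("balanced", balanced), ("skewed", skewed)):
        stat = chi_square_statistic(counts)
        p = chi_square_p_value_df5(stat)
        verdict = "deviates" if p < 0.05 else "consistent"
        print(f"{name}: chi2 = {stat:.3f}, p = {p:.4g} ({verdict})")
```

Because Clusters 2 and 5 have expected shares of only 2.2% and 1.6%, the expected counts for keywords with few gene IDs fall well below the usual rule of thumb of 5, so the chi-square approximation may be unreliable there. The text does not say how this was handled; an exact or simulation-based test would be one reasonable alternative.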

