Functional assessment of time course microarray data.

Nueda MJ, Sebastián P, Tarazona S, García-García F, Dopazo J, Ferrer A, Conesa A - BMC Bioinformatics (2009)

Bottom Line: Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. Results were compared to alternative methodologies. The methods should not be considered as competitive but they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.


Affiliation: Department of Statistics and Operation Research, University of Alicante, Ctra, San Vicente del Raspeig, S/N 03690 Alicante, Spain. mj.nueda@ua.es

ABSTRACT

Motivation: Time-course microarray experiments study the progress of gene expression over time across one or several experimental conditions. Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. The assessment of the functional aspects of time-course transcriptomics data requires approaches that exploit the activation dynamics of the functional categories to which genes are annotated.

Methods: We present three novel methodologies for the functional assessment of time-course microarray data. i) maSigFun derives from the maSigPro method, a regression-based strategy to model time-dependent expression patterns and identify genes with differences across series. maSigFun fits a regression model for groups of genes labeled by a functional class and selects those categories which have a significant model. ii) PCA-maSigFun fits a PCA model of each functional class-defined expression matrix to extract orthogonal patterns of expression change, which are then assessed for their fit to a time-dependent regression model. iii) ASCA-functional uses the ASCA model to rank genes according to their correlation to principal time expression patterns and assesses functional enrichment in a GSA fashion. We used simulated and experimental datasets to study these novel approaches. Results were compared to alternative methodologies.

Results: Synthetic and experimental data showed that the different methods are able to capture different, biologically meaningful aspects of the relationship between genes, functions and co-expression. The methods should not be considered as competing with each other; rather, they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.
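
The category-level regression idea behind maSigFun can be illustrated with a small sketch. This is not the authors' implementation: it assumes that all genes annotated to a category are pooled into a single ordinary least-squares fit, and that the time model is a polynomial of degree 2. The function names (`fit_time_model`, `category_fit`), the toy gene names and the toy data are hypothetical. The sketch reports R² and an F statistic only; converting the F statistic into a p-value and correcting for multiple testing are left out.

```python
import numpy as np

def fit_time_model(times, values, degree=2):
    """Ordinary least-squares polynomial fit of values on time.

    Returns the coefficients, R^2 and the overall F statistic.
    The polynomial degree is an assumption made for illustration.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    design = np.vander(times, degree + 1, increasing=True)  # columns: 1, t, t^2
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((values - fitted) ** 2)
    ss_tot = np.sum((values - values.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    df_model = degree
    df_resid = len(values) - degree - 1
    f_stat = ((ss_tot - ss_res) / df_model) / (ss_res / df_resid)
    return coef, r2, f_stat

def category_fit(expr, times, gene_names, annotated, degree=2):
    """Pool every gene annotated to one category and fit a single model.

    expr has shape (n_samples, n_genes); annotated is a set of gene names.
    """
    idx = [i for i, g in enumerate(gene_names) if g in annotated]
    block = expr[:, idx]                       # (n_samples, n_annotated)
    pooled_times = np.repeat(times, len(idx))  # matches row-major flattening
    pooled_values = block.reshape(-1)
    return fit_time_model(pooled_times, pooled_values, degree)

# Toy data (hypothetical): one coherent category, one with opposing trends.
rng = np.random.default_rng(1)
times = np.arange(6, dtype=float)
genes = ["a1", "a2", "a3", "b1", "b2"]
trend = 0.5 * times
expr = np.column_stack([trend, trend, trend, trend, -trend])
expr = expr + rng.normal(scale=0.1, size=expr.shape)

_, r2_coherent, f_coherent = category_fit(expr, times, genes, {"a1", "a2", "a3"})
_, r2_mixed, f_mixed = category_fit(expr, times, genes, {"b1", "b2"})
print(f"coherent category: R^2 = {r2_coherent:.3f}")
print(f"mixed category:    R^2 = {r2_mixed:.3f}")
```

In this toy setting the category whose genes share one trend is fitted well, while the category with opposing trends is not, which is the kind of situation where extracting separate patterns per category first (as PCA-maSigFun does) may be more informative.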


Figure 7: Principal variation pattern of the acute-phase response GO category in the Toxicogenomics dataset analyzed by PCA-maSigFun. a) The scores plot reveals the profile of the GO-component. b) The loadings plot shows gene contributions. The threshold for significant contribution is indicated by a blue line. Names of positively correlated and negatively correlated significant contributing genes are indicated.

Mentions: Analysis by PCA-maSigFun provided a much richer repertoire of functional classes. GO-based PCA transformation of gene expression data compressed transcriptional information into function-associated transcriptome patterns ("synthetic genes", referred to here as "GO-components"). In most cases one or two GO-components were obtained per GO term, and only in very generic classes, such as translation or ribosome, were up to 3 patterns of correlated behavior extracted. maSigPro analysis on the matrix of these new functional variables identified 33 BP, 15 MF and 10 CC significant features (Table 2a and Additional file 2). Interpretation of these results is facilitated by plotting the PCA scores of each maSigPro-significant GO-component along with the PCA loadings of the annotated genes. In this way we can identify the gene expression pattern captured by the significant GO-component (Figure 6a) and locate the most strongly contributing genes (Figure 6b), i.e. genes that most closely follow the pattern indicated by the GO-component with either a positive (+, gene loading greater than 0) or a negative (-, gene loading smaller than 0) correlation. Horizontal lines indicate the threshold for significant contribution of a gene to the GO-component pattern. The PCA-maSigFun approach identified 3 different patterns of expression: i) classes that show a peak of expression at high BB and 24 hours, ii) classes that also respond at 24 hours at medium BB and iii) classes that show an early (6 hrs) regulation for both high and medium BB (Figure 6). The first pattern was found for different GO terms pointing to processes such as fatty acid metabolism and oxidation (-), cell adhesion (-), amino acid metabolism (-), translation (+,-), microtubule organization (+), endopeptidase inhibitor activity (-) and vesicular fraction (+).
Functions associated with the second pattern include translation (+), negative regulation of cell proliferation (+), acute inflammatory response (+,-), xenobiotic metabolic process (+,-), signal transduction (+,-), biopolymer methylation (-), maintenance of localization (+), response to toxic compound (+), iron ion binding (+,-), exopeptidase activity (+), kinase activity (+), epoxide hydrolase activity (+) and ribosome (+,-). Finally, in the third pattern we found cation homeostasis (+), nitric oxide mediated signal transduction (+), copper ion binding (+) and lysosome (+). It is important to mention that, in most cases, only a subset of the genes annotated to each GO term showed significant contributions to the GO-component, indicating the predominant role of these genes in determining the pattern. In a few cases, corresponding to very general categories such as translation or ribosome, none of the annotated genes reached the threshold of significant contribution, but a continuous signal was observed, which would indicate a small but coordinated gene activity within the class. Finally, in some cases, such as xenobiotic compound and acute-phase, genes were observed with either a positive or a negative significant contribution to the component, which implies that coordination is present but with positively and negatively acting elements. For example, in the case of acute-phase, the alpha-1-glycoprotein, a positive acute-phase protein, was found to have a significant contribution to the acute-phase GO-component pattern that represented gene expression activation with high BB at 24 h. Three other proteins, alpha-1-inhibitor, albumin and trypsin, known as negative acute-phase proteins [33], had significant but negative contributions to the GO pattern, which indicates an opposite pattern of expression (Figure 7).
Therefore, this GO-component collects the induction of positive acute-phase proteins and the repression of negative acute-phase genes, suggesting a general activation of this cellular process.
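
The relationship between scores, loadings and positively or negatively contributing genes can be sketched as follows. This is a minimal illustration under stated assumptions, not the PCA-maSigFun code: PCA is computed by SVD of the gene-centered expression matrix of one category, the cutoff `1 / sqrt(n_genes)` is an arbitrary illustrative threshold (the text above does not state how the threshold is derived), and all function names, gene names and data are hypothetical.

```python
import numpy as np

def go_component(expr, n_components=1):
    """PCA of one functional class via SVD.

    expr has shape (n_samples, n_genes) and holds only the genes
    annotated to the class. Returns scores (n_samples, k) and
    loadings (n_genes, k).
    """
    centered = expr - expr.mean(axis=0)          # center each gene
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T
    return scores, loadings

def orient(scores, loadings, reference):
    """Resolve the arbitrary PCA sign so the first score profile
    points the same way as a reference profile."""
    ref = reference - reference.mean()
    sign = 1.0 if np.dot(scores[:, 0], ref) >= 0 else -1.0
    return scores * sign, loadings * sign

def split_contributors(loadings, gene_names, threshold):
    """Genes whose first-component loading exceeds +/- threshold."""
    first = loadings[:, 0]
    positive = [g for g, w in zip(gene_names, first) if w > threshold]
    negative = [g for g, w in zip(gene_names, first) if w < -threshold]
    return positive, negative

# Toy category (hypothetical): activation in the last two samples,
# two induced genes, one repressed gene and one unresponsive gene.
rng = np.random.default_rng(0)
pattern = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 3.0])
genes = ["up1", "up2", "down1", "flat1"]
weights = np.array([1.0, 1.0, -1.0, 0.0])
expr = np.outer(pattern, weights) + rng.normal(scale=0.02, size=(6, 4))

scores, loadings = go_component(expr)
scores, loadings = orient(scores, loadings, pattern)
threshold = 1.0 / np.sqrt(len(genes))            # illustrative cutoff only
positive, negative = split_contributors(loadings, genes, threshold)
print("positive contributors:", positive)
print("negative contributors:", negative)
```

Here the score profile plays the role of the GO-component, and the sign of each loading separates genes that follow the profile from genes that mirror it, analogous to the positive and negative acute-phase proteins described above.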

