Functional assessment of time course microarray data.

Nueda MJ, Sebastián P, Tarazona S, García-García F, Dopazo J, Ferrer A, Conesa A - BMC Bioinformatics (2009)

Bottom Line: Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. Results were compared to alternative methodologies. The methods should not be considered as competitive but they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.

Affiliation: Department of Statistics and Operation Research, University of Alicante, Ctra, San Vicente del Raspeig, S/N 03690 Alicante, Spain. mj.nueda@ua.es

ABSTRACT

Motivation: Time-course microarray experiments study the progress of gene expression over time across one or several experimental conditions. Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. Assessing the functional aspects of time-course transcriptomics data requires approaches that exploit the activation dynamics of the functional categories to which genes are annotated.

Methods: We present three novel methodologies for the functional assessment of time-course microarray data. i) maSigFun derives from the maSigPro method, a regression-based strategy to model time-dependent expression patterns and identify genes with differences across series. maSigFun fits a regression model for groups of genes labeled by a functional class and selects those categories that have a significant model. ii) PCA-maSigFun fits a PCA model to each functional class-defined expression matrix to extract orthogonal patterns of expression change, which are then assessed for their fit to a time-dependent regression model. iii) ASCA-functional uses the ASCA model to rank genes according to their correlation with principal time expression patterns and assesses functional enrichment in a GSA fashion. We used simulated and experimental datasets to study these novel approaches. Results were compared to alternative methodologies.

Results: Synthetic and experimental data showed that the different methods are able to capture different, biologically meaningful aspects of the relationship between genes, functions and co-expression. The methods should not be considered as competing with one another; rather, they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.


Figure 1: Schematic representation of the proposed methods. a) maSigFun fits a regression model for each gene expression submatrix defined by the genes annotated to a given functional class (FC.1 to 4 in scheme). Significant classes are obtained by the maSigPro method (FC.3). b) PCA-maSigFun obtains a PCA model for the gene expression submatrix defined as in maSigFun and extracts a number of components that collect non-random variation. Generally 0 (FC.1) to 2 (FC.2) components are extracted for each functional class. A regression model is then fitted to the scores vector of extracted components to select function-defined patterns with a significant association to time (FC.2 and FC.3). c) ASCA-functional applies ASCA-genes to identify principal patterns of variation associated with time and time × treatment experimental factors (PC1 to 3 in scheme). Genes are ranked by loading value in each PC, and GSA analysis is applied to each loading value-ordered gene list to identify a functionally related block of genes associated with the principal patterns of variation (FC.2 and FC.3).
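Step b) of the caption can be illustrated with a small sketch. It is not the authors' implementation: it extracts only the leading principal component of one functional class by power iteration, centers each gene profile (an assumption, the preprocessing is not specified here), and summarizes the association of the scores with time by the R² of a straight-line fit rather than by the maSigPro regression model. The function names and the toy data are hypothetical.

```python
from math import sqrt


def center(profile):
    """Subtract the mean of one gene profile (gene-wise centering; assumed)."""
    m = sum(profile) / len(profile)
    return [x - m for x in profile]


def first_component(profiles, iterations=500):
    """Leading principal component of a genes x time-points submatrix.

    Returns (scores per time point, loadings per gene, fraction of
    variance explained). Power iteration stands in for a full PCA.
    """
    X = [center(p) for p in profiles]
    n_times = len(X[0])
    total = sum(x * x for row in X for x in row)
    if total == 0:
        return [0.0] * n_times, [0.0] * len(X), 0.0
    # Start from the most variable gene profile.
    scores = max(X, key=lambda row: sum(x * x for x in row))[:]
    for _ in range(iterations):
        loadings = [sum(r * s for r, s in zip(row, scores)) for row in X]
        norm = sqrt(sum(l * l for l in loadings))
        loadings = [l / norm for l in loadings]
        scores = [sum(l * row[t] for l, row in zip(loadings, X))
                  for t in range(n_times)]
    explained = sum(s * s for s in scores) / total
    return scores, loadings, explained


def linear_r_squared(times, values):
    """R^2 of a straight-line fit of values on time (simplified model)."""
    mx = sum(times) / len(times)
    my = sum(values) / len(values)
    sxx = sum((x - mx) ** 2 for x in times)
    syy = sum((y - my) ** 2 for y in values)
    sxy = sum((x - mx) * (y - my) for x, y in zip(times, values))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


# Toy functional class: two genes rise over time, one falls.
times = [0, 1, 2, 3]
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [8, 6, 4, 2]]
scores, loadings, explained = first_component(profiles)
r2_scores = linear_r_squared(times, scores)
```

In this toy class the rising and falling genes receive loadings of opposite sign on the same component, so a single time-associated pattern describes all three genes even though they are not co-expressed in the maSigFun sense.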

Mentions: The adaptation of maSigPro to consider functional information (maSigFun) is quite straightforward: the regression model is not fitted gene-wise as in maSigPro, but to the data matrix composed of the expression values of all genes belonging to the functional class, so that one regression model is fitted to each functional category. In this approach individual genes are considered as different observations of the expression profile of the class. Because genes belonging to the same class may show different basal expression levels, which may negatively influence the estimation of model parameters, expression data are standardized gene-wise to better capture the correlation structure within the functional group. After this transformation, statistical analysis proceeds as in regular maSigPro (Figure 1a). The expected result is that significant functional classes are those whose genes change their expression over time in the same manner, i.e. a high level of co-expression is present within the functional class.
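The idea of pooling standardized genes as repeated observations of one class profile can be sketched as follows. This is only an illustration under simplifying assumptions: a straight-line fit on time and its R² stand in for the maSigPro regression model and its significance tests, and the function names and toy data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(profile):
    """Gene-wise standardization: zero mean, unit variance per gene."""
    m = mean(profile)
    s = pstdev(profile)
    if s == 0:
        return [0.0] * len(profile)
    return [(x - m) / s for x in profile]


def class_r_squared(times, profiles):
    """Pool standardized gene profiles of one functional class and return
    the R^2 of a single straight-line fit on time (simplified model)."""
    xs, ys = [], []
    for profile in profiles:
        xs.extend(times)                 # each gene is one more observation
        ys.extend(standardize(profile))  # of the class expression profile
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


times = [0, 1, 2, 3]

# Same trend at very different basal levels: standardization aligns them.
coexpressed = [[1, 2, 3, 4], [10, 20, 30, 40], [5, 5.5, 6, 6.5]]
r2_coexpressed = class_r_squared(times, coexpressed)

# Opposite trends within one class cancel out in the pooled fit.
mixed = [[1, 2, 3, 4], [4, 3, 2, 1]]
r2_mixed = class_r_squared(times, mixed)
```

The two toy classes show the behavior described above: the class whose genes move together yields a pooled fit close to perfect despite different basal levels, while the class with opposing trends yields essentially no fit, so it would not be selected by a single class-level model.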

