Listen to genes: dealing with microarray data in the frequency domain.

Feng J, Yi D, Krishna R, Guo S, Buchanan-Wollaston V - PLoS ONE (2009)

Bottom Line: The approach is successfully applied to Arabidopsis leaf microarray data generated from 31,000 genes observed over 22 time points over 22 days. We show our method in a step-by-step manner with the help of toy models as well as a real biological dataset. We also analyse three distinct gene circuits of potential interest to Arabidopsis researchers.


Affiliation: Centre for Computational System Biology, Shanghai, Fudan University, Shanghai, People's Republic of China. Jianfeng.Feng@warwick.ac.uk

ABSTRACT

Background: We present a novel and systematic approach to analyze temporal microarray data. The approach includes normalization, clustering and network analysis of genes.

Methodology: Genes are normalized using an error-model-based uniform normalization method aimed at identifying and estimating the sources of variation. The model minimizes the correlation among error terms across replicates. The normalized gene expressions are then clustered in terms of their power spectrum density. The method of complex Granger causality is introduced to reveal interactions between sets of genes. Complex Granger causality, along with partial Granger causality, is applied in both the time and frequency domains, to selected genes as well as to all genes, to reveal networks of interactions. The approach is successfully applied to Arabidopsis leaf microarray data generated from 31,000 genes observed over 22 time points over 22 days. Three circuits are analyzed in detail: a circadian gene circuit, an ethylene circuit, and a new global circuit showing a hierarchical structure to determine the initiators of leaf senescence.
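As an illustration of the frequency-domain view described above, the following minimal sketch computes the Fourier magnitudes of a single 22-point expression trace and reports the strongest component. It is not the authors' implementation: the function names, the mean-centring step and the toy trace are all assumptions made for this example, and it does not perform the clustering step itself.

```python
import cmath
import math


def magnitudes(series):
    """Return [M_1, ..., M_(n//2)]: DFT magnitudes of a mean-centred series.

    M_k is the magnitude of the k-th Fourier component divided by n.
    Mean-centring (an assumption of this sketch) removes the constant offset.
    """
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        result.append(abs(coeff) / n)
    return result


def dominant_component(series):
    """Return the index k (1-based) of the largest magnitude M_k."""
    mags = magnitudes(series)
    return max(range(len(mags)), key=lambda i: mags[i]) + 1


# Hypothetical toy trace: 22 samples alternating high/low plus a slow drift.
toy_trace = [(1.0 if t % 2 == 0 else -1.0) + 0.05 * t for t in range(22)]
print(dominant_component(toy_trace))
```

For a 22-point series the sketch yields eleven magnitudes, and a trace that alternates between consecutive samples peaks at the last one (k = 11). Genes could then be grouped by which component dominates, or by the whole magnitude vector.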

Conclusions: We use a totally data-driven approach to form biological hypotheses. Clustering using the power-spectrum analysis helps us identify genes of potential interest. Their dynamics can be captured accurately in the time and frequency domains using the methods of complex and partial Granger causality. With the rise in availability of temporal microarray data, such methods can be useful tools in uncovering hidden biological interactions. We show our method in a step-by-step manner with the help of toy models as well as a real biological dataset. We also analyse three distinct gene circuits of potential interest to Arabidopsis researchers.
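To make the idea of Granger causality concrete, here is a minimal sketch of the plain pairwise, time-domain version with a single lag. It is deliberately simpler than the complex and partial Granger causality used in the paper, which are not reproduced here. The function names, the one-lag model and the simulated data are assumptions made for illustration.

```python
import math
import random


def _centre(values):
    """Subtract the mean so the regressions need no intercept term."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def granger_order1(x, y):
    """Pairwise Granger causality from x to y using one lag.

    Compares y_t = a*y_(t-1) + e (restricted) with
    y_t = b*y_(t-1) + c*x_(t-1) + e (full) and returns
    ln(RSS_restricted / RSS_full). Values near 0 mean x adds little.
    """
    x, y = _centre(x), _centre(y)
    target, y_lag, x_lag = y[1:], y[:-1], x[:-1]

    # Restricted model: ordinary least squares with one regressor.
    syy = sum(v * v for v in y_lag)
    ty = sum(t * v for t, v in zip(target, y_lag))
    a = ty / syy
    rss_r = sum((t - a * v) ** 2 for t, v in zip(target, y_lag))

    # Full model: solve the 2x2 normal equations directly.
    sxx = sum(v * v for v in x_lag)
    sxy = sum(u * v for u, v in zip(x_lag, y_lag))
    tx = sum(t * v for t, v in zip(target, x_lag))
    det = syy * sxx - sxy * sxy
    b = (ty * sxx - tx * sxy) / det
    c = (tx * syy - ty * sxy) / det
    rss_f = sum(
        (t - b * yl - c * xl) ** 2
        for t, yl, xl in zip(target, y_lag, x_lag)
    )
    return math.log(rss_r / rss_f)


def simulate_pair(n=200, seed=0):
    """Hypothetical toy data in which x drives y with a one-step delay."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0.0, 1.0) for t in range(1, n)]
    return x, y


x, y = simulate_pair()
print("x -> y:", granger_order1(x, y))
print("y -> x:", granger_order1(y, x))
```

On the simulated pair the x -> y value should be clearly larger than the y -> x value, because past x improves the prediction of y but not the reverse. With only 22 time points per gene, as in the dataset described here, estimates of this kind are noisy, and a real analysis would need a significance test.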


Figure 4. One gene circuit controlling circadian activity. A. Time traces of four genes: ELF4, TOC1, LHY and CCA1. ELF4 and TOC1 oscillate in phase with each other, as do LHY and CCA1, but the two pairs are out of phase with one another. B. Magnitude vs. frequency for the four genes. Each has its highest magnitude at the frequency corresponding to a one-day period. C. The gene circuit obtained in terms of partial Granger causality (PGC; see annotation in Text S2). D. Complex interactions between different groups of genes and GI. E. Gene interactions in the frequency domain. The y-axis represents the strength of causal interactions.

Mentions: In Figure 4A, the topmost gene, ELF4, shows a strong circadian rhythm; indeed, it has the largest M11 value. The importance of ELF4 in regulating circadian activity is also reported in the literature [19], [20]. From the gene annotation (also presented in Text S2), we found that ELF4 is related to two other genes, LHY and CCA1: ELF4 is necessary for light-induced expression of both CCA1 and LHY. Figure 4A plots the time traces of these genes. A circadian circuit related to LHY and CCA1 has been reported in the literature [21], [22]. The circuit comprises three loops: PRR9, PRR7 and LHY/CCA1 form one loop (the morning loop, or loop III); TOC1 and GI form another (the night loop, or loop II); and LHY/CCA1, TOC1 and an unknown gene form loop I.

