Identification of temporal association rules from time-series microarray data sets.

Nam H, Lee K, Lee D - BMC Bioinformatics (2009)

Bottom Line: The extracted temporal association rules identify associated genes that play the same role in a biological process within a short transcriptional time delay, as well as temporal dependencies between genes involved in specific biological processes. TARM showed a higher precision score than a dynamic Bayesian network and a Bayesian network. Its advantages are that it reports the size of the transcriptional time delay between associated genes, the activation and inhibition relationships between genes, and sets of co-regulators.


Affiliation: Department of Bio and Brain Engineering, KAIST, 373-1 Guseong-dong, Yuseong-gu, Daejeon, Korea. hjnam@kaist.ac.kr

ABSTRACT

Background: One of the most challenging problems in mining gene expression data is to identify how the expression of a particular gene affects the expression of other genes. To elucidate the relationships between genes, association rule mining (ARM) has been applied to microarray gene expression data. However, conventional ARM is limited in its ability to extract temporal dependencies between gene expressions, even though temporal information is indispensable for discovering the underlying regulation mechanisms in biological pathways. In this paper, we propose a novel method, referred to as temporal association rule mining (TARM), which can extract temporal dependencies among related genes. A temporal association rule has the form [gene A↑, gene B↓] → (7 min) [gene C↑], which states that a high expression level of gene A together with significant repression of gene B is followed by significant expression of gene C after 7 minutes. The proposed TARM method is tested on a Saccharomyces cerevisiae cell cycle time-series microarray gene expression data set.
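To make the rule form concrete, the following is a minimal Python sketch of how a single rule such as [gene A↑, gene B↓] → (7 min) [gene C↑] could be evaluated. It is an illustration under stated assumptions, not the authors' implementation: the function names `discretize` and `rule_stats`, the toy expression values, the reading of support as the number of time points at which the whole rule holds, and the choice of one time step per 7 minutes are all hypothetical.

```python
from typing import Dict, List, Tuple

def discretize(values: List[float], threshold: float = 0.8) -> List[int]:
    """Map each expression value to +1 (up), -1 (down) or 0 (no significant change)."""
    return [1 if v >= threshold else -1 if v <= -threshold else 0 for v in values]

def rule_stats(
    states: Dict[str, List[int]],
    antecedent: Dict[str, int],
    consequent: Tuple[str, int],
    delay_steps: int,
) -> Tuple[int, float]:
    """Return (support, confidence) for antecedent -> (delay) consequent.

    Assumption: support counts the time points t at which the antecedent holds
    at t and the consequent holds at t + delay_steps; confidence divides that
    count by the number of time points at which the antecedent holds.
    """
    n = len(next(iter(states.values())))
    cons_gene, cons_state = consequent
    antecedent_count = 0
    both_count = 0
    for t in range(n - delay_steps):
        if all(states[g][t] == s for g, s in antecedent.items()):
            antecedent_count += 1
            if states[cons_gene][t + delay_steps] == cons_state:
                both_count += 1
    confidence = both_count / antecedent_count if antecedent_count else 0.0
    return both_count, confidence

# Toy data (invented for illustration), one value per assumed 7-minute step.
expression = {
    "A": [1.0, 0.2, 0.9, 1.1, 0.0, 0.9],
    "B": [-0.9, 0.1, -1.2, -0.85, 0.3, 0.0],
    "C": [0.0, 0.9, 0.1, 1.0, 1.3, 0.0],
}
states = {gene: discretize(vals) for gene, vals in expression.items()}

# [A up, B down] -> (1 step, i.e. 7 min under the assumption above) [C up]
support, confidence = rule_stats(states, {"A": 1, "B": -1}, ("C", 1), delay_steps=1)
print(support, confidence)
```

In this toy series the antecedent holds at three time points and gene C is up one step later each time, so the rule would pass a filter of support ≥ 3 and confidence ≥ 90%.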

Results: In the parameter fitting phase of TARM, the parameter set [threshold = ± 0.8, support ≥ 3 transactions, confidence ≥ 90%], which gave the best precision score for the KEGG cell cycle pathway, was chosen for the rule mining phase. With the fitted parameter set, temporal association rules with five transcriptional time delays (0, 7, 14, 21, 28 minutes) were extracted from the gene expression data of 799 genes previously identified as cell cycle relevant. The extracted rules identify associated genes that play the same role in a biological process within a short transcriptional time delay, as well as temporal dependencies between genes involved in specific biological processes.
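As a rough illustration of the rule mining phase, the sketch below enumerates rules over several time delays and keeps those meeting the support and confidence cut-offs. It is deliberately simplified and should not be read as the TARM algorithm itself: it only considers rules with a single gene in the antecedent, it expects expression values already discretized into +1/-1/0 states, and the name `mine_pair_rules`, the delay unit (one step per 7 minutes) and the example data are assumptions made here.

```python
from itertools import product

def mine_pair_rules(states, delays=(0, 1, 2, 3, 4), min_support=3, min_confidence=0.9):
    """Enumerate single-gene rules a(state) -> (delay) b(state).

    states: dict mapping gene name to a list of discretized states (+1, -1, 0).
    Returns tuples (gene_a, state_a, delay, gene_b, state_b, support, confidence).
    """
    rules = []
    genes = sorted(states)
    n = len(states[genes[0]])
    for a, b, d in product(genes, genes, delays):
        if a == b:
            continue  # skip self-rules in this simplified sketch
        for sa, sb in product((1, -1), repeat=2):
            # Time points where the antecedent holds and the consequent can be observed.
            ante = sum(1 for t in range(n - d) if states[a][t] == sa)
            # Time points where the consequent also holds d steps later.
            both = sum(
                1 for t in range(n - d)
                if states[a][t] == sa and states[b][t + d] == sb
            )
            if ante > 0 and both >= min_support and both / ante >= min_confidence:
                rules.append((a, sa, d, b, sb, both, both / ante))
    return rules

# Invented example: gene Y is up exactly one step after gene X is up.
example_states = {
    "X": [1, 0, 1, 0, 1, 0, 1, 0],
    "Y": [0, 1, 0, 1, 0, 1, 0, 1],
}
rules = mine_pair_rules(example_states)
for rule in rules:
    print(rule)
```

Rules with multi-gene antecedents, such as the [gene A↑, gene B↓] form in the Background, would need candidate itemsets of more than one gene, which this sketch leaves out.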

Conclusion: In this work, we proposed TARM, an applied form of conventional ARM. TARM showed a higher precision score than a dynamic Bayesian network and a Bayesian network. The advantages of TARM are that it reports the size of the transcriptional time delay between associated genes, the activation and inhibition relationships between genes, and sets of co-regulators.


Figure 4: The number of temporal association rules extracted from the cell cycle data set and from the random data set. The graph shows the number of rules extracted at five transcriptional time delays (0, 7, 14, 21, 28 minutes) from the time-series gene expression of 799 cell cycle relevant genes and from the randomly shuffled cell cycle data set [threshold = ± 0.8, support ≥ 3 transactions, confidence ≥ 90%]. Black bars indicate the number of rules extracted from the real data set; gray bars indicate the average number of rules extracted over 100 random tests.

Mentions: Using the selected parameter set, we applied the TARM method to 799 genes previously identified as cell cycle relevant in [29] and extracted temporal association rules with various sizes of transcriptional time delay. To test the significance of the temporal association rules, TARM was also applied to randomly shuffled cell cycle expression data of the same 799 genes. Figure 4 compares the results for the real cell cycle data set and the shuffled cell cycle data set. As the figure shows, the numbers of rules extracted from the real data set and from the random data set differ noticeably. The results indicate that the temporal association rules extracted by our proposed method are more significant than random rules.
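The comparison against shuffled data can be sketched as follows. The text does not say how the data were shuffled, so this example assumes each gene's time points are permuted independently; the rule counter `count_pair_rules` is a hypothetical single-antecedent stand-in rather than the full TARM procedure, and the example states are invented.

```python
import random

def count_pair_rules(states, delay, min_support=3, min_confidence=0.9):
    """Count single-gene rules a(state) -> (delay) b(state) passing both cut-offs."""
    genes = sorted(states)
    n = len(states[genes[0]])
    count = 0
    for a in genes:
        for b in genes:
            if a == b:
                continue
            for sa in (1, -1):
                for sb in (1, -1):
                    ante = sum(1 for t in range(n - delay) if states[a][t] == sa)
                    both = sum(
                        1 for t in range(n - delay)
                        if states[a][t] == sa and states[b][t + delay] == sb
                    )
                    if ante > 0 and both >= min_support and both / ante >= min_confidence:
                        count += 1
    return count

def shuffled_average(states, delay, n_trials=100, seed=0):
    """Average rule count over n_trials data sets with time points shuffled per gene."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        # rng.sample with k == len(series) returns a shuffled copy of the series.
        shuffled = {g: rng.sample(series, len(series)) for g, series in states.items()}
        total += count_pair_rules(shuffled, delay)
    return total / n_trials

# Invented discretized states: Y is up exactly one step after X is up.
example_states = {
    "X": [1, 0, 1, 0, 1, 0, 1, 0],
    "Y": [0, 1, 0, 1, 0, 1, 0, 1],
}
real_count = count_pair_rules(example_states, delay=1)
random_average = shuffled_average(example_states, delay=1)
print(real_count, random_average)
```

Comparing `real_count` with `random_average` at each delay mirrors the black and gray bars described for Figure 4; a formal significance statement would additionally need the distribution of the shuffled counts, not only their mean.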

