Fuzzy association rules for biological data analysis: a case study on yeast.

Lopez FJ, Blanco A, Garcia F, Cano C, Marin A - BMC Bioinformatics (2008)

Bottom Line: A number of association rules have been found, many of them agreeing with previous research in the area. In addition, a comparison between crisp and fuzzy results proves the fuzzy associations to be more reliable than crisp ones. An integrative approach such as the one carried out in this work can unveil significant knowledge that is currently hidden and dispersed across the existing biological databases.


Affiliation: Department of Computer Science and AI, University of Granada, 18071, Granada, Spain. fjavier@decsai.ugr.es

ABSTRACT

Background: The mapping of diverse genomes in recent years has generated huge amounts of biological data, which are currently dispersed across many databases. Integrating the information available in these databases is required to unveil possible associations among already known data. Biological data are often imprecise and noisy. Fuzzy set theory is especially suitable for modeling imprecise data, while association rules are well suited to integrating heterogeneous data.

Results: In this work we propose a novel methodology for biological knowledge extraction based on a fuzzy association rule mining method. We apply this methodology to a yeast genome dataset containing heterogeneous information on structural and functional genome features. A number of association rules have been found, many of them agreeing with previous research in the area. In addition, a comparison between crisp and fuzzy results proves the fuzzy associations to be more reliable than crisp ones.

Conclusion: An integrative approach such as the one carried out in this work can unveil significant knowledge that is currently hidden and dispersed across the existing biological databases. It is shown that fuzzy association rules can model this knowledge in an intuitive way by using linguistic labels and a few easily understandable parameters.


Figure 4: Biclusters 5 & 6. This figure shows the gene expression pattern represented by biclusters 5 (A) and 6 (B).

Mentions: The last 5 rules involve gene expression patterns obtained from the dataset by Gasch et al. Bicluster 5 was obtained by the EDA biclustering algorithm and bicluster 6 by the Gene & Sample Shaving algorithm (see Figures 4A and 4B). This dataset comprises a broad variety of experiments, so the obtained biclusters contain columns from very different experiments. For example, bicluster 5 contains columns from 9 different experiment sets and depicts the gene expression profile of 74 genes under 15 experimental conditions. Genes belonging to this bicluster are large genes that tend to have MEDIUM responsiveness. The last three rules involve bicluster 6. This bicluster is especially interesting because it contains many columns (51) and presents a very clear expression profile. The associations found describe these genes as belonging to chromosome II and being annotated with the terms macromolecule biosynthesis and cytosol.
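
To make a rule such as "genes in bicluster 5 tend to be large" concrete, the following is a minimal sketch of how the support and confidence of a fuzzy association rule are commonly computed. It assumes the minimum t-norm for combining membership degrees, which is a standard textbook choice and not necessarily the one used by the authors. The function names and the membership values for the five genes are hypothetical and serve only as an illustration.

```python
def fuzzy_support(mu_a, mu_b):
    """Fuzzy support of A -> B: mean over all items of min(mu_A, mu_B).

    Assumes the minimum t-norm; other t-norms (e.g. product) are possible.
    """
    if len(mu_a) != len(mu_b) or not mu_a:
        raise ValueError("membership lists must be non-empty and equal in length")
    return sum(min(a, b) for a, b in zip(mu_a, mu_b)) / len(mu_a)


def fuzzy_confidence(mu_a, mu_b):
    """Fuzzy confidence of A -> B: sum of min(mu_A, mu_B) over sum of mu_A."""
    if len(mu_a) != len(mu_b):
        raise ValueError("membership lists must be equal in length")
    total_a = sum(mu_a)
    if total_a == 0:
        return 0.0  # antecedent never holds, so confidence is defined as 0 here
    return sum(min(a, b) for a, b in zip(mu_a, mu_b)) / total_a


# Hypothetical membership degrees for five genes (illustrative values only).
# in_bicluster is crisp (0 or 1); length_large is a fuzzy linguistic label.
in_bicluster = [1.0, 1.0, 0.0, 1.0, 0.0]
length_large = [0.9, 0.6, 0.2, 0.8, 0.1]

support = fuzzy_support(in_bicluster, length_large)
confidence = fuzzy_confidence(in_bicluster, length_large)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Because each gene contributes its degree of membership instead of a yes/no value, a gene whose length sits near a label boundary shifts the rule's support only slightly, whereas a crisp threshold would move it fully into or out of the count.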

