Paradigm of tunable clustering using Binarization of Consensus Partition Matrices (Bi-CoPaM) for gene discovery.

Abu-Jamous B, Fa R, Roberts DJ, Nandi AK - PLoS ONE (2013)

Bottom Line: Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies.


Affiliation: Department of Electrical Engineering and Electronics, The University of Liverpool, Brownlow Hill, Liverpool, United Kingdom.

ABSTRACT
Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, three outcomes are possible for a gene: it may be assigned exclusively to a single cluster, assigned to multiple clusters, or not assigned to any cluster. These possibilities are realised through the primary novelty of the paper, the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies.
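The aggregate-then-binarize idea in the abstract can be illustrated with a minimal sketch. It assumes the partitions from the individual clustering runs have already been aligned so that row k means the same cluster in every run, and it uses a single membership threshold as the tunable binarization rule. Both the alignment assumption and the threshold rule are illustrative choices, not necessarily the techniques used in the paper, and the function names and toy data are hypothetical.

```python
import numpy as np

def consensus_partition_matrix(partitions):
    """Average aligned binary partition matrices (clusters x genes) into a fuzzy CoPaM."""
    stacked = np.stack([np.asarray(p, dtype=float) for p in partitions])
    return stacked.mean(axis=0)

def binarize_by_threshold(copam, threshold):
    """Assign a gene to a cluster when its consensus membership reaches the threshold.

    Illustrative rule only: a high threshold leaves some genes unassigned,
    a low threshold lets genes belong to several clusters at once.
    """
    return (copam >= threshold).astype(int)

# Three hypothetical clustering runs over 2 clusters (rows) and 4 genes (columns).
runs = [
    [[1, 1, 0, 0], [0, 0, 1, 1]],
    [[1, 0, 0, 0], [0, 1, 1, 1]],
    [[1, 1, 0, 1], [0, 0, 1, 0]],
]

copam = consensus_partition_matrix(runs)
tight = binarize_by_threshold(copam, 1.0)  # only unanimous assignments survive
wide = binarize_by_threshold(copam, 0.3)   # any reasonably supported assignment is kept

print(copam)
print(tight)
print(wide)
```

In this toy example the second and fourth genes are left out of every tight cluster because the runs disagree on them, while the wide setting places them in both clusters. This mirrors the three outcomes described in the abstract.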


Figure 6: YAL040C / CLN3 gene expression profiles in the five microarray datasets. Gene expression profiles for the cyclin CLN3 from the five datasets cdc28, cdc15, alpha, alpha-30 and alpha-38 are plotted. Although the gene is known to be expressed periodically, its profiles clearly show different levels of periodicity across the datasets.

Mentions: Figure 6 shows the normalised expression profiles of this gene (CLN3) from the five yeast datasets considered in this study. The profiles in the cdc15, alpha and alpha-30 datasets are as expected: each shows two high-expression regions at the G1 stage, one from each of the two cell cycles. The cdc28 profile is acceptable in that it shows a very clear peak at the G1 stage of the second cycle. The alpha-38 profile is far from what is expected because the expression at the second time point (at 5 minutes) has, for some unexplained reason, a large positive impulse which flattens the normalised expression profile at the other time points.
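The flattening effect described for alpha-38 can be reproduced on synthetic data. The sketch below assumes the profiles are normalised to zero mean and unit standard deviation. The text above does not state which normalisation was used, so this is an assumption, and the profile values are invented for illustration rather than taken from any of the five datasets.

```python
import numpy as np

def zscore(profile):
    """Normalise a profile to zero mean and unit standard deviation (assumed scheme)."""
    profile = np.asarray(profile, dtype=float)
    return (profile - profile.mean()) / profile.std()

# Synthetic periodic profile: two full cycles sampled at 16 time points.
t = np.arange(16)
periodic = np.sin(2 * np.pi * t / 8)

# Same profile with a large positive impulse at the second time point.
spiked = periodic.copy()
spiked[1] += 10.0

# Compare the peak-to-trough range of the remaining time points after normalisation.
rest = t != 1
clean_norm = zscore(periodic)[rest]
spiked_norm = zscore(spiked)[rest]
spread_clean = clean_norm.max() - clean_norm.min()
spread_spiked = spiked_norm.max() - spiked_norm.min()

print(spread_clean, spread_spiked)
```

The impulse inflates the standard deviation, so the periodic variation at the other time points is compressed after normalisation. Under this assumed scheme, that is why a single outlier can make an otherwise periodic profile look flat.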

