Clustering-based approaches to SAGE data mining.

Wang H, Zheng H, Azuaje F - BioData Min (2008)

Bottom Line: Clustering techniques have become fundamental approaches in these applications. This paper reviews relevant clustering techniques specifically designed for this type of data. It places an emphasis on current limitations and opportunities in this area for supporting biologically-meaningful data mining and visualisation.


Affiliation: School of Computing and Mathematics, University of Ulster, Newtownabbey, BT37 0QB, Co. Antrim, Northern Ireland, UK. hy.wang@ulster.ac.uk

ABSTRACT
Serial analysis of gene expression (SAGE) is one of the most powerful tools for global gene expression profiling. It has led to several biological discoveries and biomedical applications, such as the prediction of new gene functions and the identification of biomarkers in human cancer research. Clustering techniques have become fundamental approaches in these applications. This paper reviews relevant clustering techniques specifically designed for this type of data. It places an emphasis on current limitations and opportunities in this area for supporting biologically-meaningful data mining and visualisation.



Figure 2: An illustration of two-way hierarchical clustering analysis of 1118 SAGE tags highly expressed in the mouse microdissected outer nuclear layer (ONL), published by Blackshaw et al. [39]. Each row represents a SAGE tag and each column corresponds to a SAGE library. A total of 14 murine libraries were considered, covering different tissues and developmental stages: mouse NIH-3T3 fibroblast cells, adult hypothalamus, developing retina at 2-day intervals from embryonic day (E) 12.5 to postnatal day (P) 6.5, P10.5 retinas from the paired-homeodomain gene crx knockout mouse (crx-/-) and from wild-type (crx+/+) littermates, adult retina, and microdissected ONL.

Mentions: The dendrogram is a graphical representation of hierarchical clustering, in which each step of the clustering process is illustrated by a tree joint, and each tree node represents a subset of the expression data. It provides an intuitive platform for biologists to visualize basic relationships between all the tags or libraries, as illustrated in Figure 2. This example shows a two-way hierarchical clustering of 1118 SAGE tags highly expressed in the mouse microdissected outer nuclear layer (ONL) published by Blackshaw et al. [39]. However, such a representation does not directly produce explicit partitions of the data. Given the sheer volume of data that a SAGE study may involve, it is usually not obvious how to define clusters from the tree. For example, it could be a complex task for users to determine the optimal number of clusters and obtain meaningful partitions based solely on the dendrogram shown in Figure 2.
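
The point that a dendrogram does not by itself yield a partition can be made concrete with a small sketch. The code below is an illustration only, not the method used by the authors or by Blackshaw et al.: it assumes average linkage with Euclidean distance, uses made-up tag counts, and the function names (build_dendrogram, cut_dendrogram) are hypothetical. It builds the full merge history once and then shows that the same tree gives different partitions depending on the number of clusters k that the user chooses.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of row indices."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def build_dendrogram(points):
    """Agglomerative clustering; returns the merge history (the tree).

    Each entry is (cluster_id_a, cluster_id_b, merge_height). Leaves have
    ids 0..n-1 and the cluster created at step s has id n + s.
    """
    clusters = {i: (i,) for i in range(len(points))}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest linkage distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        height = average_linkage(clusters[a], clusters[b], points)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, height))
        next_id += 1
    return merges


def cut_dendrogram(n_points, merges, k):
    """Replay the first n_points - k merges to obtain exactly k clusters."""
    clusters = {i: (i,) for i in range(n_points)}
    for step, (a, b, _) in enumerate(merges[: n_points - k]):
        clusters[n_points + step] = clusters.pop(a) + clusters.pop(b)
    labels = [0] * n_points
    # Number clusters by their smallest member so labels are deterministic.
    for label, members in enumerate(sorted(clusters.values(), key=min)):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical counts for five tags (rows) across three libraries (columns).
tag_counts = [
    (10, 12, 11),
    (11, 13, 10),
    (50, 2, 1),
    (48, 3, 2),
    (1, 20, 19),
]

merges = build_dendrogram(tag_counts)
three = cut_dendrogram(len(tag_counts), merges, k=3)
two = cut_dendrogram(len(tag_counts), merges, k=2)
print(three)  # [0, 0, 1, 1, 2]
print(two)    # [0, 0, 1, 1, 0]
```

The tree is identical in both calls; only the cut level differs, and nothing in the tree says which cut is the meaningful one. This pairwise search is far too slow for 1118 tags and is meant for reading only. In practice a library routine would be used instead, for example linkage and fcluster in scipy.cluster.hierarchy.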

