CMGSDB: integrating heterogeneous Caenorhabditis elegans data sources using compositional data mining.

Pati A, Jin Y, Klage K, Helm RF, Heath LS, Ramakrishnan N - Nucleic Acids Res. (2007)

Bottom Line: Besides gene, protein and functional annotations, CMGSDB currently unifies information about 531 RNAi phenotypes obtained from heterogeneous databases using a hierarchical scheme. A phenotype browser at the CMGSDB website serves this hierarchy and relates phenotypes to other biological entities. Chains can, for example, relate the knockdown of a set of genes during an RNAi experiment to the disruption of a pathway or specific gene expression through another set of genes not directly related to the former set.


Affiliation: Department of Computer Science and Department of Biochemistry, Virginia Tech, Blacksburg, VA 24061, USA.

ABSTRACT
CMGSDB (Database for Computational Modeling of Gene Silencing) is an integration of heterogeneous data sources about Caenorhabditis elegans with capabilities for compositional data mining (CDM) across diverse domains. Besides gene, protein and functional annotations, CMGSDB currently unifies information about 531 RNAi phenotypes obtained from heterogeneous databases using a hierarchical scheme. A phenotype browser at the CMGSDB website serves this hierarchy and relates phenotypes to other biological entities. The application of CDM to CMGSDB produces 'chains' of relationships in the data by finding two-way connections between sets of biological entities. Chains can, for example, relate the knock down of a set of genes during an RNAi experiment to the disruption of a pathway or specific gene expression through another set of genes not directly related to the former set. The web interface for CMGSDB is available at https://bioinformatics.cs.vt.edu/cmgs/CMGSDB/, and serves individual biological entity information as well as details of all chains computed by CDM.



Figure 1: Finding TFs whose knockdown induces improved desiccation tolerance in C. elegans. Two biclusters (shaded rectangles) joined at the gene interface using a redescription between their projections. Below that is the CDM schema, displaying the sequence of primitives.

Mentions: As illustrated in Figure 1, we mine biclusters between genes and the transcription factors (TFs) that regulate them, mine biclusters between genes and the phenotypes that result when they are knocked down, and relate one side of the first bicluster with one side of the second bicluster. Hence, the task of integrating diverse data sources is reduced to composing data-mining patterns computed over each source separately. The advantage of this formulation is that each data source can be mined individually using a biclustering algorithm suited to that kind of data. For instance, the xMotif (4), SAMBA (5) and ISA (6) algorithms are suited for mining numeric data (e.g. gene expression relationships), while the Apriori (7) and CHARM (8) algorithms are suited for mining Boolean data (e.g. graph adjacencies).
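The composition described above can be illustrated with a minimal Python sketch. It is not the CMGSDB implementation: the toy relations, the brute-force bicluster enumeration (standing in for an algorithm such as CHARM), the function names, and the use of Jaccard overlap with a fixed threshold to relate the gene sides of two biclusters are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy Boolean relations (not CMGSDB data).
# Each key is a TF or a phenotype; each value is the set of genes related to it.
tf_targets = {
    "TF_A": {"g1", "g2", "g3"},
    "TF_B": {"g1", "g2", "g3", "g4"},
    "TF_C": {"g5", "g6"},
}
phenotype_genes = {
    "pheno_X": {"g1", "g2", "g3"},
    "pheno_Y": {"g2", "g3", "g7"},
    "pheno_Z": {"g5", "g8"},
}


def biclusters(relation, min_genes=2):
    """Enumerate all-ones biclusters of a Boolean relation by brute force.

    Returns a dict mapping each shared gene set to every row that contains it.
    Exponential in the number of rows, so only suitable for toy inputs.
    """
    found = {}
    rows = sorted(relation)
    for size in range(1, len(rows) + 1):
        for group in combinations(rows, size):
            genes = frozenset.intersection(*(frozenset(relation[r]) for r in group))
            if len(genes) >= min_genes:
                # Close the row side: keep every row that contains the gene set.
                found[genes] = frozenset(r for r in rows if genes <= relation[r])
    return found


def jaccard(a, b):
    """Jaccard overlap of two sets (assumed similarity measure for the join)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def chains(left, right, threshold=0.6):
    """Join biclusters from two relations whose gene sides overlap enough."""
    joined = []
    for genes_l, rows_l in left.items():
        for genes_r, rows_r in right.items():
            score = jaccard(genes_l, genes_r)
            if score >= threshold:
                joined.append((rows_l, genes_l, genes_r, rows_r, score))
    return sorted(joined, key=lambda chain: -chain[4])


tf_bics = biclusters(tf_targets)
pheno_bics = biclusters(phenotype_genes)
for tfs, genes_l, genes_r, phenos, score in chains(tf_bics, pheno_bics):
    print(sorted(tfs), sorted(genes_l), "<->", sorted(genes_r), sorted(phenos), round(score, 2))
```

Each printed line is one chain: a set of TFs, the genes they share, the overlapping gene set from the second relation, and the phenotypes attached to it. Because the two relations are mined independently and only joined at the gene interface, either enumeration step could be swapped for a different biclustering algorithm without changing the join.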

