GEM-TREND: a web tool for gene expression data mining toward relevant network discovery.

Feng C, Araki M, Kunimoto R, Tamon A, Makiguchi H, Niijima S, Tsujimoto G, Okuno Y - BMC Genomics (2009)

Bottom Line: The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. GEM-TREND was developed to retrieve gene expression data by comparing the query gene-expression pattern with those of GEO gene expression data. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery.


Affiliation: Department of Systems Bioscience for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, 46-29 Yoshidashimoadachi-cho, Sakyo-ku, Kyoto 606-8501, Japan.

ABSTRACT

Background: DNA microarray technology provides a first step toward uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which is available in public databases such as the Gene Expression Omnibus (GEO). To date, most researchers have retrieved data from these databases manually through web browsers, using accession numbers (IDs) or keywords, so gene-expression patterns are not considered during retrieval. The Connectivity Map was recently introduced to compare gene expression data by means of gene-expression signatures (sets of genes labeled as up- or down-regulated according to their biological states); it is available as a web tool for detecting similar gene-expression signatures in a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). To help researchers use public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating co-expression networks from a publicly available database.

Results: GEM-TREND, a web tool for searching gene expression data, allows users to query GEO with either gene-expression signatures or gene expression ratio data, and retrieves gene expression data by comparing the gene-expression pattern of the query with those of the GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006), with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from GEO and using in-house microarray data. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. For further analysis, a network visualization interface is also provided, in which genes and gene annotations are dynamically linked to external data repositories.
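
To make the rank-based comparison more concrete, the following Python sketch shows a simplified Kolmogorov-Smirnov-style score of the kind used in Connectivity Map-like pattern matching: a signature's up- and down-regulated genes are located in a ranked expression profile, and the two enrichment values are combined into one similarity score. This is an illustration only, not GEM-TREND's implementation. The function names, the toy gene labels, and in particular the permutation-based p-value are assumptions, since this record does not describe how GEM-TREND computes significance.

```python
import random


def ks_score(tag_genes, ranked_genes):
    """Signed KS-like enrichment of tag_genes in ranked_genes.

    ranked_genes is ordered from most up-regulated to most down-regulated.
    """
    n = len(ranked_genes)
    position = {gene: i + 1 for i, gene in enumerate(ranked_genes)}
    # 1-based ranks of the tag genes found in the profile, in ascending order
    ranks = sorted(position[g] for g in tag_genes if g in position)
    t = len(ranks)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - ranks[j] / n for j in range(t))
    b = max(ranks[j] / n - j / t for j in range(t))
    return a if a > b else -b


def connectivity_score(up_genes, down_genes, ranked_genes):
    """Combine up and down enrichment; zero if both point the same way."""
    ks_up = ks_score(up_genes, ranked_genes)
    ks_down = ks_score(down_genes, ranked_genes)
    if ks_up * ks_down > 0:
        return 0.0
    return ks_up - ks_down


def permutation_p_value(up_genes, down_genes, ranked_genes, n_perm=1000, seed=0):
    """ASSUMED significance test: compare against shuffled gene orderings."""
    rng = random.Random(seed)
    observed = abs(connectivity_score(up_genes, down_genes, ranked_genes))
    shuffled = list(ranked_genes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(connectivity_score(up_genes, down_genes, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical profile of ten genes, best-ranked first
    profile = [f"g{i}" for i in range(1, 11)]
    print(connectivity_score({"g1", "g2"}, {"g9", "g10"}, profile))
    print(permutation_p_value({"g1", "g2"}, {"g9", "g10"}, profile, n_perm=200))
```

A positive score means that the query's up-regulated genes sit near the top of the profile and its down-regulated genes near the bottom; a negative score indicates the reverse pattern.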

Conclusion: GEM-TREND was developed to retrieve gene expression data by comparing the query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing their gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at http://cgs.pharm.kyoto-u.ac.jp/services/network.

Figure 3: Screenshot of GEM-TREND. a) Query input area. Gene-expression signatures, gene expression ratio data, and text are accepted. Network IDs can be used to retrieve previous networks. b) Results area. The search results are displayed as GEO series ID (GSE ID), GEO platform ID (GPL ID), series title, similarity score, and p-value. One record corresponds to one GEO series and links to GEO through its GSE ID and GPL ID. Previous results can be retrieved by JOB ID. c) Network visualization (Gene Cluster tab): c-1) Network graphical display area. Genes (nodes) on a red background are genes from the query, while genes on a yellow background are user-selected. The number shown at the top right of a gene gives the number of hidden linkages. These linkages can be expanded or hidden by right-clicking the gene of interest and choosing from the pop-up menu. Double-clicking a gene links to the UniGene database. c-2) Gene cluster area, in which gene clusters are shown. The number following a cluster gives the number of member genes in that cluster. Clicking the UniGene icon links a gene to the UniGene database. c-3) Gene search window. Matched genes are highlighted in the gene cluster area. d) Network visualization (GO tab): d-1) Network graphical display area. Genes on an orange background are those associated with the common GO term. d-2) Gene annotation. The top three significant shared GO terms of the genes in each ontology are shown for each cluster. The number following a term gives the number of genes associated with that term. Clicking the GO icon links a term to GO. d-3) Gene search window. e) Linkout to the GEO database. f) Linkout to the UniGene database. g) Linkout to the Gene Ontology database.

GEM-TREND is designed to be user-friendly: only a few simple steps are required to search GEO gene expression data and visualize the network. The main page of the GEO gene expression data search comprises a query input area (Fig. 3a) and a results area (Fig. 3b). Two kinds of search are available: gene-expression pattern-based searches, which take either gene-expression signatures or gene expression ratio data as input, and text-based searches, which accept keywords, platform IDs, or series IDs. Similarity scores and p-values are calculated only for pattern-based searches. To support further analysis of the retrieved data (e.g. network analysis), GEM-TREND returns GEO series, each of which links together a group of related samples, rather than individual reference gene expression profiles. The results area lists the GEO series ID (GSE ID), GEO platform ID (GPL ID), series title, similarity score, and p-value for each series. The similarity score of a GEO series is the maximum similarity score among the samples in that series. The full series title is shown as a tool-tip when the mouse hovers over the title, and clicking the GSE ID or GPL ID opens the corresponding GEO record (Fig. 3e). Series of interest can also be selected for further processing. Both the search results and the selected series can be downloaded in CSV format.
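
The series-level scoring rule described above (a series takes the maximum similarity score among its samples) and the CSV export can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the field names, the example IDs, and the choice to report the p-value of the best-scoring sample are hypothetical and are not taken from GEM-TREND itself.

```python
import csv
import io

# Hypothetical column names, mirroring the fields listed for the results area
FIELDS = ["gse_id", "gpl_id", "title", "score", "p_value"]


def summarize_series(sample_hits):
    """Keep, for each GEO series, the sample with the highest similarity score.

    sample_hits is an iterable of dicts with the keys in FIELDS (extra keys,
    such as a sample ID, are allowed). ASSUMPTION: the p-value reported for a
    series is the one belonging to its best-scoring sample.
    """
    best = {}
    for hit in sample_hits:
        current = best.get(hit["gse_id"])
        if current is None or hit["score"] > current["score"]:
            best[hit["gse_id"]] = hit
    # Most similar series first
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)


def to_csv(rows):
    """Serialize series-level rows as CSV text, ignoring non-result keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample-level hits; the IDs are placeholders, not real accessions
    hits = [
        {"gse_id": "GSE_EXAMPLE_1", "gpl_id": "GPL_EXAMPLE", "title": "Series one",
         "score": 0.42, "p_value": 0.03, "sample_id": "s1"},
        {"gse_id": "GSE_EXAMPLE_1", "gpl_id": "GPL_EXAMPLE", "title": "Series one",
         "score": 0.77, "p_value": 0.001, "sample_id": "s2"},
        {"gse_id": "GSE_EXAMPLE_2", "gpl_id": "GPL_EXAMPLE", "title": "Series two",
         "score": 0.91, "p_value": 0.0005, "sample_id": "s3"},
    ]
    print(to_csv(summarize_series(hits)))
```

Taking the maximum means that a single strongly matching sample is enough for its series to rank highly, which suits a retrieval step whose output is inspected further at the series level.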

