Reducing the complexity of complex gene coexpression networks by coupling multiweighted labeling with topological analysis.

Benso A, Cornale P, Di Carlo S, Politano G, Savino A - Biomed Res Int (2013)

Bottom Line: To infer relevant information, the network must be properly filtered and its complexity reduced. This paper proposes an efficient multivariate filtering method designed to analyze the topological properties of a coexpression network in order to identify potentially relevant genes for a given disease. Results have been validated against bibliographic data automatically mined with the ProteinQuest literature mining tool.


Affiliation: Department of Controls and Computer Engineering, Politecnico di Torino, 10129 Torino, Italy; Consorzio Interuniversitario Nazionale per l'Informatica, 11029 Verres, Italy.

ABSTRACT
Undirected gene coexpression networks obtained from experimental expression data, coupled with efficient computational procedures, are increasingly used to identify potentially relevant biological information (e.g., biomarkers) for a particular disease. However, coexpression networks built from experimental expression data are generally large, highly connected networks with a high number of false-positive interactions (nodes and edges). To infer relevant information, the network must be properly filtered and its complexity reduced. Given the complexity and multivariate nature of the information contained in the network, this requires efficient feature selection algorithms that exploit the topological characteristics of the network to identify relevant nodes and edges. This paper proposes an efficient multivariate filtering method designed to analyze the topological properties of a coexpression network in order to identify potentially relevant genes for a given disease. The algorithm has been tested on three datasets for three well-known and well-studied diseases: acute myeloid leukemia, breast cancer, and diffuse large B-cell lymphoma. Results have been validated against bibliographic data automatically mined with the ProteinQuest literature mining tool.



Algorithm 1: ProteinQuest query example to obtain citation data for the AML dataset.

Mentions: To perform a more robust statistical validation using bibliometric data, we ran a set of queries on ProteinQuest to check whether, for a given disease, the genes selected by our algorithm are highly cocited with that disease while showing a low citation count with the other diseases. As an example, Algorithm 1 shows the query used to assess the citation relevance of AML genes in AML-related publications. The query searches for papers in which at least one of the selected genes is cocited with AML and not cocited with either BC or DLCL. For each gene, the query returns the number of papers that satisfy this condition.
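
The counting logic described for the AML query can be illustrated with a minimal Python sketch. This is not the ProteinQuest query language or API, whose syntax is not given here; it only mimics the described filter on a toy in-memory corpus. The function name `cocitation_counts`, the gene placeholders (`GENE_A`, `GENE_B`, `GENE_C`), and the representation of each paper as a set of mentioned terms are all assumptions made for illustration, as is the reading that a paper is excluded whenever it mentions BC or DLCL at all.

```python
def cocitation_counts(papers, genes, target, excluded):
    """Count, per gene, the papers that mention the gene together with
    the target disease and mention none of the excluded diseases.

    papers   : iterable of sets of terms mentioned in each paper (toy model)
    genes    : gene identifiers to count (hypothetical placeholders here)
    target   : disease label that must be cocited with the gene
    excluded : disease labels that must not appear in the paper
    """
    counts = {gene: 0 for gene in genes}
    for terms in papers:
        # The paper must mention the target disease.
        if target not in terms:
            continue
        # The paper must not mention any of the other diseases.
        if any(disease in terms for disease in excluded):
            continue
        # Credit every selected gene mentioned in the remaining paper.
        for gene in genes:
            if gene in terms:
                counts[gene] += 1
    return counts


# Toy corpus: each set lists the terms one hypothetical paper mentions.
papers = [
    {"GENE_A", "AML"},
    {"GENE_A", "GENE_B", "AML"},
    {"GENE_B", "AML", "BC"},   # excluded: also mentions BC
    {"GENE_A", "DLCL"},        # excluded: no AML, mentions DLCL
    {"GENE_B", "AML"},
]

counts = cocitation_counts(
    papers,
    genes=["GENE_A", "GENE_B", "GENE_C"],
    target="AML",
    excluded=["BC", "DLCL"],
)
print(counts)
```

Under these assumptions, a gene whose count stays high for its own disease and low when the roles of the diseases are swapped would be the kind of disease-specific signal the validation looks for.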

