A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks.

Xiang Z, Qin T, Qin ZS, He Y - BMC Syst Biol (2013)

Bottom Line: Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks in a genome-wide scope.


ABSTRACT

Background: The large amount of literature in the post-genomics era enables the study of gene interactions and networks using all available articles published for a specific organism. MeSH is a controlled vocabulary of medical and scientific terms that is used by biomedical scientists to manually index articles in the PubMed literature database. We hypothesized that genome-wide gene-MeSH term associations from the PubMed literature database could be used to predict implicit gene-to-gene relationships and networks. While the gene-MeSH associations have been used to detect gene-gene interactions in some studies, different methods have not been well compared, and such a strategy has not been evaluated for a genome-wide literature analysis. Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level.

Results: The genome-wide GenoMesh literature mining algorithm was developed by sequentially generating a gene-article matrix, a normalized gene-MeSH term matrix, and a gene-gene matrix. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. An optimized dissimilarity score was identified from six well-studied functions based on a receiver operating characteristic (ROC) analysis. Based on the studies with well-studied Escherichia coli and less-studied Brucella spp., GenoMesh was found to accurately identify gene functions using weighted MeSH terms, predict gene-gene interactions not reported in the literature, and cluster all the genes studied from an organism using the MeSH-based gene-gene matrix. A web-based GenoMesh literature mining program is also available at: http://genomesh.hegroup.org. GenoMesh also predicts gene interactions and networks among genes associated with specific MeSH terms or user-selected gene lists.

Conclusions: The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks in a genome-wide scope.


Figure 1: The GenoMesh algorithm.

Mentions: The GenoMesh algorithm consists of five steps, described in Methods and presented in Figure 1. Using the titles, abstracts, and MeSH annotations of PubMed papers associated with one specific organism (e.g., E. coli), the algorithm calculates three matrices: a gene-article matrix (Step 2 in Figure 1), a gene-MeSH term matrix (Step 3), and a gene-to-gene dissimilarity matrix (Step 4). The gene-article matrix identifies the articles associated with any specific gene. The gene-MeSH term matrix, derived from the gene-article matrix, associates MeSH terms with genes. From the gene-MeSH term matrix, a dissimilarity score can be calculated for any pair of genes; the score measures how dissimilar the two genes are in their MeSH term profiles. Details on how the two organism examples (E. coli and Brucella) were implemented are given in the Methods section.
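The three-matrix chain described above can be illustrated with a minimal Python sketch. It is not the GenoMesh implementation. The toy corpus, the gene names, the function names, the row-sum normalization, and the use of cosine dissimilarity are all assumptions made for illustration; the paper compares six dissimilarity functions, and this excerpt does not say which one was selected or how the gene-MeSH matrix is weighted.

```python
import numpy as np

# Hypothetical toy corpus: for each article, the genes it mentions and its MeSH terms.
GENES = ["geneA", "geneB", "geneC"]
ARTICLES = {
    "pmid1": {"genes": {"geneA", "geneB"}, "mesh": {"Virulence", "Iron"}},
    "pmid2": {"genes": {"geneA"}, "mesh": {"Virulence"}},
    "pmid3": {"genes": {"geneC"}, "mesh": {"DNA Repair"}},
    "pmid4": {"genes": {"geneB"}, "mesh": {"Iron", "Virulence"}},
}


def gene_article_matrix(genes, articles):
    """Binary matrix: entry (i, j) is 1 if gene i is mentioned in article j."""
    article_ids = sorted(articles)
    m = np.zeros((len(genes), len(article_ids)))
    for j, pmid in enumerate(article_ids):
        for i, gene in enumerate(genes):
            if gene in articles[pmid]["genes"]:
                m[i, j] = 1.0
    return m, article_ids


def gene_mesh_matrix(ga, articles, article_ids):
    """Gene-by-MeSH counts derived from the gene-article matrix, rows scaled to sum to 1.

    Row-sum scaling is an assumed stand-in for the paper's normalization.
    """
    mesh_terms = sorted({t for a in articles.values() for t in a["mesh"]})
    am = np.zeros((len(article_ids), len(mesh_terms)))
    for i, pmid in enumerate(article_ids):
        for j, term in enumerate(mesh_terms):
            if term in articles[pmid]["mesh"]:
                am[i, j] = 1.0
    counts = ga @ am  # how often each MeSH term co-occurs with each gene
    row_sums = counts.sum(axis=1, keepdims=True)
    gm = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    return gm, mesh_terms


def gene_gene_dissimilarity(gm):
    """Pairwise cosine dissimilarity (1 - cosine similarity) between gene rows.

    Cosine is an assumed choice; genes with an all-zero row score 1 against everything.
    """
    norms = np.linalg.norm(gm, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for genes with no MeSH terms
    unit = gm / norms
    return 1.0 - unit @ unit.T


ga, article_ids = gene_article_matrix(GENES, ARTICLES)
gm, mesh_terms = gene_mesh_matrix(ga, ARTICLES, article_ids)
d = gene_gene_dissimilarity(gm)
print(np.round(d, 3))
```

In this toy corpus, geneA and geneB share the MeSH terms Virulence and Iron, so their dissimilarity is close to 0, while geneC shares no terms with either and scores 1 against both. Ranking gene pairs by ascending score is one way such a matrix could surface candidate interactions.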

