A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks.

Xiang Z, Qin T, Qin ZS, He Y - BMC Syst Biol (2013)

Bottom Line: Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks in a genome-wide scope.


ABSTRACT

Background: The large amount of literature in the post-genomics era enables the study of gene interactions and networks using all available articles published for a specific organism. MeSH is a controlled vocabulary of medical and scientific terms that is used by biomedical scientists to manually index articles in the PubMed literature database. We hypothesized that genome-wide gene-MeSH term associations from the PubMed literature database could be used to predict implicit gene-to-gene relationships and networks. While the gene-MeSH associations have been used to detect gene-gene interactions in some studies, different methods have not been well compared, and such a strategy has not been evaluated for a genome-wide literature analysis. Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level.

Results: The genome-wide GenoMesh literature mining algorithm was developed by sequentially generating a gene-article matrix, a normalized gene-MeSH term matrix, and a gene-gene matrix. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. An optimized dissimilarity score was identified from six well-studied functions based on a receiver operating characteristic (ROC) analysis. Based on the studies with well-studied Escherichia coli and less-studied Brucella spp., GenoMesh was found to accurately identify gene functions using weighted MeSH terms, predict gene-gene interactions not reported in the literature, and cluster all the genes studied from an organism using the MeSH-based gene-gene matrix. A web-based GenoMesh literature mining program is also available at: http://genomesh.hegroup.org. GenoMesh also predicts gene interactions and networks among genes associated with specific MeSH terms or user-selected gene lists.
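The three sequential steps described above (gene-article matrix, normalized gene-MeSH term matrix, gene-gene matrix) can be illustrated with a minimal sketch. It is not the GenoMesh implementation: the abstract does not state which of the six dissimilarity functions was selected or how the gene-MeSH matrix was normalized, so the sketch assumes L2 row normalization and cosine dissimilarity purely for illustration. The gene names, article IDs, and MeSH annotations are hypothetical toy data.

```python
import math
from collections import defaultdict

def gene_mesh_matrix(gene_articles, article_mesh):
    """Count MeSH terms over each gene's articles, then L2-normalize each gene row.

    Assumption: L2 normalization stands in for the paper's unspecified weighting.
    """
    matrix = {}
    for gene, articles in gene_articles.items():
        counts = defaultdict(float)
        for article_id in articles:
            for term in article_mesh.get(article_id, ()):
                counts[term] += 1.0
        norm = math.sqrt(sum(v * v for v in counts.values()))
        matrix[gene] = {t: v / norm for t, v in counts.items()} if norm else {}
    return matrix

def cosine_dissimilarity(u, v):
    """1 - cosine similarity of two unit-length sparse vectors (assumed metric)."""
    return 1.0 - sum(w * v[t] for t, w in u.items() if t in v)

def gene_gene_matrix(gene_mesh):
    """Pairwise dissimilarity for every unordered gene pair."""
    genes = sorted(gene_mesh)
    return {(a, b): cosine_dissimilarity(gene_mesh[a], gene_mesh[b])
            for i, a in enumerate(genes) for b in genes[i + 1:]}

# Hypothetical toy input: geneA and geneB never share an article,
# but both are indexed with the MeSH term "Iron".
gene_articles = {"geneA": [1, 2], "geneB": [3, 5], "geneC": [4]}
article_mesh = {
    1: ["Iron", "Biological Transport"],
    2: ["Iron"],
    3: ["Iron", "Siderophores"],
    4: ["Flagella"],
    5: ["Iron"],
}

dissimilarity = gene_gene_matrix(gene_mesh_matrix(gene_articles, article_mesh))
for pair, score in sorted(dissimilarity.items(), key=lambda kv: kv[1]):
    print(pair, round(score, 3))
```

In this toy input, geneA and geneB are never mentioned in the same article, yet they receive a low dissimilarity because their articles share a MeSH term. That is the sense in which a gene-MeSH matrix can surface implicit relationships.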

Conclusions: The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks in a genome-wide scope.



Figure 5: Histogram analyses of average dissimilarity scores of random networks. The peaks and shapes of the curves are affected by the number of genes included in the random networks.

Mentions: It was also found that the distribution of the gene-gene dissimilarities from randomly selected groups of E. coli genes approximates the normal distribution with the peak in the range of 0.96-0.98 (Figure 5). This normal distribution profile provides a rationale and confirmation of the useful application of the GenoMesh approach to analysis of biological networks.
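The random-network analysis behind Figure 5 can be sketched as follows: draw random gene groups of a fixed size, average the pairwise dissimilarities within each group, and bin the averages into a histogram. This is a sketch under stated assumptions, not the authors' procedure. The pairwise dissimilarities are synthetic placeholders drawn uniformly from 0.9 to 1.0, and the gene names, group sizes, and sample counts are hypothetical, so the printed numbers say nothing about the E. coli results.

```python
import itertools
import random
import statistics
from collections import Counter

def average_dissimilarity(genes, dissim):
    """Mean of all pairwise dissimilarities within one gene group."""
    pairs = list(itertools.combinations(sorted(genes), 2))
    return sum(dissim[p] for p in pairs) / len(pairs)

def random_network_scores(all_genes, dissim, group_size, n_samples, rng):
    """Average dissimilarity for n_samples randomly drawn gene groups."""
    return [average_dissimilarity(rng.sample(all_genes, group_size), dissim)
            for _ in range(n_samples)]

def histogram(scores, digits=2):
    """Count scores per bin, where a bin is the score rounded to `digits`."""
    return dict(sorted(Counter(round(s, digits) for s in scores).items()))

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic placeholder data: 60 hypothetical genes with made-up dissimilarities.
all_genes = [f"g{i:02d}" for i in range(60)]
dissim = {pair: rng.uniform(0.9, 1.0)
          for pair in itertools.combinations(all_genes, 2)}

small = random_network_scores(all_genes, dissim, 5, 500, rng)
large = random_network_scores(all_genes, dissim, 20, 500, rng)

for label, scores in (("5 genes", small), ("20 genes", large)):
    print(label, "spread:", round(statistics.pstdev(scores), 4))
    print(histogram(scores))
```

In this synthetic setup, larger groups average over more gene pairs, so their scores cluster more tightly. This is one simple way in which the number of genes in a random network can change the shape of the histogram.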

