dbPTB: a database for preterm birth.

Uzun A, Laliberte A, Parker J, Andrew C, Winterrowd E, Sharma S, Istrail S, Padbury JF - Database (Oxford) (2012)

Bottom Line: We developed a novel bioinformatics approach to identify the nominal genetic variants associated with complex diseases. Pathway analysis was used to impute genes from pathways identified in the curations. The Database for Preterm Birth exemplifies an approach that is generalizable to other disorders for which there is evidence of significant genetic contributions.


Affiliation: Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA.

ABSTRACT
Genome-wide association studies (GWAS) query the entire genome in a hypothesis-free, unbiased manner. Since they have the potential for identifying novel genetic variants, they have become a very popular approach to the investigation of complex diseases. Nonetheless, since the success of the GWAS approach varies widely, the identification of genetic variants for complex diseases remains a difficult problem. We developed a novel bioinformatics approach to identify the nominal genetic variants associated with complex diseases. To test the feasibility of our approach, we developed a web-based aggregation tool to organize the genes, genetic variations and pathways involved in preterm birth. We used semantic data mining to extract all published articles related to preterm birth. All articles were reviewed by a team of curators. Genes identified from public databases and archives of expression arrays were aggregated with genes curated from the literature. Pathway analysis was used to impute genes from pathways identified in the curations. The curated articles and collected genetic information form a unique resource for investigators interested in preterm birth. The Database for Preterm Birth exemplifies an approach that is generalizable to other disorders for which there is evidence of significant genetic contributions.
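The aggregation and imputation steps described in the abstract can be illustrated with a minimal sketch. It is not the dbPTB implementation: the source names, the placeholder gene and pathway identifiers, and the `min_hits` threshold are all assumptions made for illustration, since the abstract does not state how genes are merged or what rule is used to impute genes from a pathway.

```python
from collections import defaultdict


def aggregate_genes(sources):
    """Merge per-source gene sets into one map: gene symbol -> set of source names."""
    provenance = defaultdict(set)
    for source_name, genes in sources.items():
        for gene in genes:
            # Normalise case so the same symbol from two sources is merged.
            provenance[gene.upper()].add(source_name)
    return dict(provenance)


def impute_from_pathways(provenance, pathways, min_hits=2):
    """Return genes not yet known that share a pathway with known genes.

    A pathway contributes its remaining members only if at least `min_hits`
    of its members are already known (an assumed rule, not the paper's).
    """
    known = set(provenance)
    imputed = {}
    for pathway_name, members in pathways.items():
        members = {m.upper() for m in members}
        if len(members & known) >= min_hits:
            for gene in members - known:
                imputed.setdefault(gene, set()).add(pathway_name)
    return imputed


# Hypothetical inputs: placeholder symbols, not real curated data.
sources = {
    "literature": {"GENE_A", "GENE_B"},
    "public_databases": {"GENE_B", "GENE_C"},
    "expression_arrays": {"gene_a", "GENE_D"},
}
pathways = {
    "PATHWAY_1": {"GENE_A", "GENE_B", "GENE_E"},
    "PATHWAY_2": {"GENE_D", "GENE_F"},
}

provenance = aggregate_genes(sources)
imputed = impute_from_pathways(provenance, pathways)
```

Keeping the set of originating sources for each gene, rather than a flat merged list, makes it possible to report how many genes came from each source, as panel (B) of the figure does.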

Figure 1 (bar069-F1): (A) Workflow for retrieval of articles, curation and extraction of genes from literature, microarray data and gene interpolation for pathway analysis. (B) Total number of genes, their associated original sources and number of unique pathways represented.

Mentions: Inter-rater reliability was assessed after training by measuring κ scores (23, 24). It was maintained through formal weekly ‘curation meetings’, at which difficult publications, or any publication a curation team member considered useful for discussion and comparison, were reviewed jointly. We designed and built a separate database for the curation process, which provided remote, password-protected access to the full text of the articles via the Brown University Library eJournals collection. It allowed curators to annotate the articles and the putative genes, SNPs and variants contained in the extracted papers. Because curators could work remotely, the curation database significantly accelerated curation. Articles accepted as relevant to preterm birth immediately become accessible to dbPTB queries, along with all the relevant genetic data (Figure 1). A detailed algorithmic description of the curation process is given in the Supplementary files.
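As an illustration of the κ score mentioned above, the following sketch computes Cohen's kappa for two curators who each label the same articles. The text does not say which κ statistic was used or how many raters were compared, so the two-rater Cohen's form, the function name `cohen_kappa` and the example labels are assumptions.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical accept/reject decisions by two curators on eight articles.
curator_1 = ["accept", "accept", "reject", "reject", "accept", "reject", "accept", "reject"]
curator_2 = ["accept", "accept", "reject", "accept", "accept", "reject", "reject", "reject"]

kappa = cohen_kappa(curator_1, curator_2)
```

Here the curators agree on six of eight articles (observed agreement 0.75) while chance agreement is 0.5, so kappa is 0.5. Values near 1 indicate agreement well beyond chance, and values near 0 indicate agreement no better than chance.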

