NEIBank: genomics and bioinformatics resources for vision research.

Wistow G, Peterson K, Gao J, Buchoff P, Jaworski C, Bowes-Rickman C, Ebright JN, Hauser MA, Hoover D - Mol. Vis. (2008)

Bottom Line: NEIBank is an integrated resource for genomics and bioinformatics in vision research. All expression- and disease-related data are integrated in EyeBrowse, an eye-centric genome browser. NEIBank provides a comprehensive overview of current knowledge of the transcriptional repertoires of eye tissues and their relation to pathology.


Affiliation: Section on Molecular Structure and Functional Genomics, National Eye Institute, National Institutes of Health, Bethesda, MD 20892-0703, USA. graeme@helix.nih.gov

ABSTRACT
NEIBank is an integrated resource for genomics and bioinformatics in vision research. It includes expressed sequence tag (EST) data and sequence-verified cDNA clones for multiple eye tissues of several species, web-based access to human eye-specific SAGE data through EyeSAGE, and comprehensive, annotated databases of known human eye disease genes and candidate disease gene loci. All expression- and disease-related data are integrated in EyeBrowse, an eye-centric genome browser. NEIBank provides a comprehensive overview of current knowledge of the transcriptional repertoires of eye tissues and their relation to pathology.


Figure 1. Flowchart for GRouping and Identification of Sequence Tags (GRIST). Under our default conditions, high quality matches (HQM) require at least 97% identity over a minimum length of 50 bp for NCBI RefSeq/NR (non-redundant) database matches and at least 96% identity over a minimum length of 100 bp for NCBI dbEST database matches. BLAST matches against NR are filtered to ignore multigene clones (such as bacterial artificial chromosomes [BACs]) and known artifacts. NR matches are checked for GeneID and are grouped with RefSeq matches for the same GeneID. This takes account of short or incomplete RefSeqs. UniGene assignments are made independently by BLAST against dbEST. UniGene assignments for the top eight HQM dbEST matches for each clone are identified, and those that occur at frequencies of at least 15% for the whole group are reported. This can help identify UniGene problems, overlapping genes, and variant transcripts.
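The default thresholds and the UniGene reporting rule in the legend can be illustrated with a minimal Python sketch. It is not the GRIST implementation (which the article describes as running in Oracle 9i); the `Hit` fields, the function names, and the database labels are hypothetical. It also assumes that "frequency for the whole group" means the share of all pooled top-eight UniGene assignments across the clones in a group, which the legend does not spell out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Default thresholds quoted in the legend:
# (minimum percent identity, minimum aligned length in bp).
HQM_THRESHOLDS = {
    "refseq_nr": (97.0, 50),
    "dbest": (96.0, 100),
}


@dataclass(frozen=True)
class Hit:
    """One BLAST hit. Field names are illustrative, not GRIST's schema."""
    clone_id: str
    database: str            # "refseq_nr" or "dbest" (labels are assumptions)
    percent_identity: float
    align_length: int
    unigene_id: Optional[str] = None


def is_hqm(hit):
    """Return True if the hit meets the default high-quality-match thresholds."""
    min_identity, min_length = HQM_THRESHOLDS[hit.database]
    return hit.percent_identity >= min_identity and hit.align_length >= min_length


def unigene_report(hits_by_clone, top_n=8, min_fraction=0.15):
    """Pool UniGene IDs from each clone's top HQM dbEST hits.

    hits_by_clone maps clone_id -> list of hits, assumed sorted best first.
    Returns {unigene_id: fraction} for IDs at or above min_fraction of the
    pooled assignments (an assumed reading of the 15% rule).
    """
    counts = Counter()
    for hits in hits_by_clone.values():
        hqm = [h for h in hits if h.database == "dbest" and is_hqm(h)][:top_n]
        counts.update(h.unigene_id for h in hqm if h.unigene_id)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ug: n / total for ug, n in counts.items() if n / total >= min_fraction}
```

Reporting every UniGene ID above the cutoff, rather than only the most frequent one, is what lets a group with mixed assignments stand out as a possible overlapping gene or variant transcript.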

Mentions: EST data for NEIBank are analyzed and annotated using a procedure called GRIST: GRouping and Identification of Sequence Tags. GRIST uses a rule-based procedure that relies mainly on basic local alignment search tool (BLAST) [19,20] searches of National Center for Biotechnology Information (NCBI) DNA and protein databases [21], together with self-match searches of the clones in each library, to assemble clusters or groups of ESTs corresponding to specific genes and to identify them independently through matches to GenBank and UniGene. GRIST also annotates clusters of cDNAs with Gene Ontology (GO) [22] terms and with chromosomal and genomic location. Since its original description, GRIST has been implemented in Oracle 9i (Oracle Corporation, Redwood Shores, CA; a commercial structured query database implemented in Unix at NIH) and has been updated and modified to improve functionality and speed of processing. In particular, links to LocusLink (which has been discontinued) have been replaced with links to Entrez Gene; primary identification with full-length cDNA sequences now uses the NCBI human, mouse, and “other” reference sequence (RefSeq) databases (which speeds processing); and both high-level and detailed GO terms extracted from Entrez are included for functional annotation. The current GRIST pipeline is shown in Figure 1.
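As a rough illustration of the grouping step, the sketch below merges clones whose RefSeq or NR matches resolve to the same GeneID, so that a clone matching only an NR entry still joins the group defined by a short or incomplete RefSeq. The tuple layout, the function name, and the placeholder GeneIDs are assumptions made for the example; the article does not describe GRIST's data model at this level.

```python
from collections import defaultdict


def group_by_gene_id(matches):
    """Group clones by the GeneID of their high-quality matches.

    matches: iterable of (clone_id, source, gene_id) tuples, where source is
    "refseq" or "nr" and gene_id is None when no GeneID could be resolved.
    Returns (groups, unassigned): groups maps gene_id -> set of clone IDs,
    and unassigned holds clones that never resolved to any GeneID.
    """
    groups = defaultdict(set)
    without_gene = set()
    for clone_id, _source, gene_id in matches:
        if gene_id is None:
            without_gene.add(clone_id)
        else:
            # RefSeq and NR matches land in the same group when they share a GeneID.
            groups[gene_id].add(clone_id)
    assigned = set().union(*groups.values())
    return dict(groups), without_gene - assigned


# Placeholder GeneIDs (101, 202) are illustrative only.
example = [
    ("clone_a", "refseq", 101),
    ("clone_b", "nr", 101),     # NR match joins the RefSeq group for GeneID 101
    ("clone_b", "nr", None),    # an extra unresolved hit does not unassign clone_b
    ("clone_c", "nr", None),
    ("clone_d", "refseq", 202),
]
groups, unassigned = group_by_gene_id(example)
```

In this sketch, clones left in `unassigned` would be the ones that fall through to self-match clustering or to the independent UniGene identification.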

