T3SEdb: data warehousing of virulence effectors secreted by the bacterial Type III Secretion System.

Tay DM, Govindarajan KR, Khan AM, Ong TY, Samad HM, Soh WW, Tong M, Zhang F, Tan TW - BMC Bioinformatics (2010)

Bottom Line: We created a reliable effector prediction tool, integrated into the database, to demonstrate the application of the database for such endeavours. T3SEdb is the first specialised database reported for T3SS effectors, enriched with manual annotations that facilitated systematic construction of a reliable prediction model for identification of novel effectors. The T3SEdb represents a platform for inclusion of additional annotations of metadata for future developments of sophisticated effector prediction models for screening and selection of putative novel effectors from bacterial genomes/proteomes that can be validated by a small number of key experiments.


Affiliation: Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore.

ABSTRACT

Background: Effectors of the Type III Secretion System (T3SS) play a pivotal role in establishing and maintaining pathogenicity in the host, so identifying these effectors is important for understanding virulence. However, the effectors display a high level of sequence diversity, which makes their identification difficult. There is a need to collate and annotate existing effector sequences in public databases to enable systematic analyses of these sequences for development of models for screening and selection of putative novel effectors from bacterial genomes that can be validated by a smaller number of key experiments.

Results: Herein, we present T3SEdb (http://effectors.bic.nus.edu.sg/T3SEdb), a specialized database of annotated T3SS effector (T3SE) sequences containing 1089 records from 46 bacterial species compiled from the literature and public protein databases. Procedures have been defined for i) comprehensive annotation of the experimental status of effectors, ii) submission and curation review of records by users of the database, and iii) regular updating of existing and new T3SEdb records. Fielded keyword searches and sequence searches (BLAST, regular expression) are supported for both experimentally verified and hypothetical T3SEs. More than 171 clusters of T3SEs were detected based on sequence identity comparisons (intra-cluster difference up to ~60%). Given this high level of sequence diversity among T3SEs, the T3SEdb provides a large number of experimentally known effector sequences with wide species representation for creation of effector predictors. We created a reliable effector prediction tool, integrated into the database, to demonstrate the application of the database for such endeavours.

Conclusions: T3SEdb is the first specialised database reported for T3SS effectors, enriched with manual annotations that facilitated systematic construction of a reliable prediction model for identification of novel effectors. The T3SEdb represents a platform for inclusion of additional annotations of metadata for future developments of sophisticated effector prediction models for screening and selection of putative novel effectors from bacterial genomes/proteomes that can be validated by a small number of key experiments.
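
The regular-expression sequence search mentioned in the Results can be illustrated with a minimal sketch. This is not the T3SEdb implementation: the record layout, the `regex_search` function, the demo accessions and the sequences are all hypothetical, chosen only to show how a motif pattern might be matched against experimentally verified and hypothetical sequences.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EffectorRecord:
    accession: str  # hypothetical identifier, not a real T3SEdb accession
    status: str     # "experimental" or "hypothetical"
    sequence: str   # amino-acid sequence in one-letter codes (made up)


def regex_search(records, pattern, status=None):
    """Return (accession, start, matched_text) for the first hit in each record.

    If `status` is given, only records with that experimental status are scanned.
    """
    compiled = re.compile(pattern)
    hits = []
    for rec in records:
        if status is not None and rec.status != status:
            continue
        match = compiled.search(rec.sequence)
        if match:
            hits.append((rec.accession, match.start(), match.group()))
    return hits


# Made-up records for illustration only.
RECORDS = [
    EffectorRecord("DEMO0001", "experimental", "MKISSFLPTSAAQRLV"),
    EffectorRecord("DEMO0002", "hypothetical", "MNPLTSSGAVKKE"),
    EffectorRecord("DEMO0003", "experimental", "MGGHHLLAAEEK"),
]

# Motif: S, then S or T, then up to three residues, then P.
print(regex_search(RECORDS, r"S[ST].{0,3}P"))
```

Passing `status="hypothetical"` would restrict the scan to unverified sequences, mirroring the restriction by experimental status that the database describes.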


Figure 1: T3SEdb search function and sample output page. A) The database can be queried via the NCBI accession number, domain or general keyword search which can also be restricted to the experimental status of the sequences (experimentally validated or hypothetical) or to a specific field in the sequence record. B) Search results display database record with T3SEdb accession number, effector name, hyperlinked NCBI Entrez Protein database accession number, source organism of the effector, sequence length, experimental status, last sequence update, name and accession of the primary/source database that the effector was retrieved from, sequence data, literature references (hyperlinked PubMed IDs) and T3SEdb curation comments (if any).

Mentions: Users can browse the database and run string-match searches through AJAX/jQuery requests to the server. More specific queries are also supported: users can query the database by NCBI accession number, or perform a domain or general keyword search, which can be restricted to the experimental status of the sequences or to a specific field in the sequence record (Figure 1A). Search results are presented in tabular form, displaying the T3SEdb accession number, effector name, hyperlinked NCBI Entrez Protein database accession number, source organism of the effector, sequence length, experimental status, last sequence update, name and accession of the primary/source database from which the effector was retrieved, sequence data, literature references (hyperlinked PubMed IDs) and T3SEdb curation comments (if any) (Figure 1B). A BLAST sequence similarity search against the experimental and hypothetical sequences is also provided. Users can batch-retrieve sequence data of experimentally confirmed and hypothetical effectors. For curated input by users, a web interface for submission of new T3SEs is provided, together with a submission and curation review policy (http://effectors.bic.nus.edu.sg/T3SEdb/usercurationpolicy.php). A policy on regular updating of existing and new T3SEdb records is also defined (http://effectors.bic.nus.edu.sg/T3SEdb/updatepolicy.php). Statistics are updated dynamically to provide current general information on the records in the T3SEdb, such as the number of records, the rate of deposition of new effector records into the NCBI Entrez Protein database over the years (1990 to 2010), the list of source species for the effector sequences, and the number of experimentally verified and hypothetical sequences for each species.
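
The keyword search described above, optionally restricted to one field or to an experimental status, can be sketched as follows. This is only an illustration under stated assumptions: the `Record` fields are a simplified subset of the columns listed for the results table, and the function name, demo accessions, effector names and organisms are hypothetical rather than taken from T3SEdb.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Record:
    accession: str      # hypothetical identifier
    name: str           # effector name (made up)
    organism: str       # source organism (made up)
    status: str         # "experimental" or "hypothetical"
    comments: str = ""  # curation comments, if any


def keyword_search(records, keyword, field=None, status=None):
    """Case-insensitive substring search over one field or over all fields.

    `field` limits matching to a single attribute of Record;
    `status` keeps only records with that experimental status.
    """
    needle = keyword.lower()
    results = []
    for rec in records:
        if status is not None and rec.status != status:
            continue
        if field is not None:
            haystacks = [str(getattr(rec, field))]
        else:
            haystacks = [str(getattr(rec, f.name)) for f in fields(rec)]
        if any(needle in text.lower() for text in haystacks):
            results.append(rec)
    return results


# Made-up records for illustration only.
DEMO = [
    Record("DEMO0001", "EffA", "Genus alpha", "experimental", "secretion confirmed"),
    Record("DEMO0002", "EffB", "Genus alpha", "hypothetical"),
    Record("DEMO0003", "EffC", "Genus beta", "experimental"),
]

# General keyword search, then the same search restricted by status.
print([r.accession for r in keyword_search(DEMO, "alpha")])
print([r.accession for r in keyword_search(DEMO, "alpha", status="experimental")])
```

Restricting by `field` avoids false hits from free-text columns such as curation comments, which is one plausible reason for offering fielded search alongside a general keyword box.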

