CoVDB: a comprehensive database for comparative analysis of coronavirus genes and genomes.

Huang Y, Lau SK, Woo PC, Yuen KY - Nucleic Acids Res. (2007)

Bottom Line: Sequences can be directly downloaded from the website in FASTA format. For complete genomes, a single representative sequence for each species is available for comparative analysis such as phylogenetic studies. With the annotated sequences in CoVDB, more specific BLAST search results can be generated for efficient downstream analysis.


Affiliation: Department of Microbiology, Research Centre of Infection and Immunology and State Key Laboratory of Emerging Infectious Diseases, The University of Hong Kong, Hong Kong.

ABSTRACT
The recent SARS epidemic has boosted interest in the discovery of novel human and animal coronaviruses. By July 2007, more than 3000 coronavirus sequence records, including 264 complete genomes, were available in GenBank. The number of coronavirus species with complete genomes available increased from 9 in 2003 to 25 in 2007, of which six, including coronavirus HKU1, bat SARS coronavirus, group 1 bat coronavirus HKU2, and groups 2c and 2d coronaviruses, were sequenced by our laboratory. To overcome the problems we encountered in the existing databases during comparative sequence analysis, we built a comprehensive database, CoVDB (http://covdb.microbiology.hku.hk), of annotated coronavirus genes and genomes. CoVDB provides a convenient platform for rapid and accurate batch sequence retrieval, the cornerstone and bottleneck of comparative gene or genome analysis. Sequences can be directly downloaded from the website in FASTA format. CoVDB also provides detailed annotation of all coronavirus sequences using a standardized nomenclature system, and overcomes the problems of duplicated and identical sequences in other databases. For complete genomes, a single representative sequence for each species is available for comparative analysis such as phylogenetic studies. With the annotated sequences in CoVDB, more specific BLAST search results can be generated for efficient downstream analysis.



Figure 3: Screenshots of the pages for retrieval of all genes. (a) Gene sequences are grouped vertically by coronavirus group and subgroup, and horizontally by gene name. The number next to each checkbox indicates how many sequences of that gene are in CoVDB. The option ‘Exclude partial CDS’ can be used if only complete genes are required. (b) Example showing the 15 nsp13 sequences from group 3 coronaviruses. The first column is the CoVDB gene id. In the Uniq column, ‘Uniq’ is shown if no other sequence in CoVDB is identical; otherwise, the gene ids of the identical sequences are shown.
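The behaviour of the Uniq column can be illustrated with a short sketch. This is not CoVDB's actual implementation: the function name `uniq_column`, the gene ids and the sequences below are hypothetical, and the sketch assumes that "identical" means exact equality of the sequences after case normalization.

```python
from collections import defaultdict

def uniq_column(genes):
    """Map each gene id to 'Uniq' or to the sorted ids of identical sequences.

    `genes` is a dict of gene id -> nucleotide sequence (hypothetical input).
    """
    # Group gene ids by their case-normalized sequence.
    groups = defaultdict(list)
    for gene_id, seq in genes.items():
        groups[seq.upper()].append(gene_id)

    result = {}
    for ids in groups.values():
        for gene_id in ids:
            # Every other id in the same group carries an identical sequence.
            others = sorted(i for i in ids if i != gene_id)
            result[gene_id] = others if others else "Uniq"
    return result

# Made-up ids and sequences, for illustration only.
example = {"g1": "ATGGCT", "g2": "atggct", "g3": "ATGGCA"}
uniq = uniq_column(example)
```

Grouping by sequence first keeps the comparison to a single pass over the records, rather than comparing every pair of sequences.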

Mentions: From the page for retrieval of complete genomes and their genes, one can enter the second main page, for retrieval of all complete and/or incomplete genes of a coronavirus (Figure 3a), by clicking ‘From all groups of genes’. On this page, all the gene sequences are grouped vertically by coronavirus group and subgroup, and horizontally by gene name. The option ‘Exclude partial CDS’ can be used if only complete genes are required. An example of retrieving all the sequences of a particular gene for a group of coronaviruses is shown in Figure 3b. If the translated sequence of a selected gene has more than one stop codon, which is probably due to sequencing error, the number in the ‘Length’ column of this gene is marked in red.
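The stop-codon check that triggers the red ‘Length’ value can be sketched as below. This is a minimal illustration under stated assumptions, not the code used by CoVDB: the names `count_stop_codons` and `flag_length` and the example sequences are hypothetical, the standard genetic code is assumed, and the coding sequence is assumed to start in frame at its first base.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code

def count_stop_codons(cds):
    """Count in-frame stop codons in a coding sequence read from position 0."""
    seq = cds.upper().replace("U", "T")
    # Step through complete codons only; a trailing partial codon is ignored.
    return sum(1 for i in range(0, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS)

def flag_length(cds):
    """Return True if the sequence has more than one in-frame stop codon."""
    return count_stop_codons(cds) > 1

# Made-up sequences, for illustration only.
clean = "ATGGCTTAA"       # ATG GCT TAA: one terminal stop codon
suspect = "ATGTAGGCTTAA"  # ATG TAG GCT TAA: an internal stop plus the terminal one
```

A complete gene is expected to end with exactly one stop codon, so any count above one points to an internal stop and hence a possible sequencing error.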

