The world bacterial biogeography and biodiversity through databases: a case study of NCBI Nucleotide Database and GBIF Database.

Selama O, James P, Nateche F, Wellington EM, Hacène H - Biomed Res Int (2013)

Bottom Line: Taxonomy and geographic-origin metadata were obtained directly from GBIF through the online interface, while E-utilities and Python were used with programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This study describes a novel approach to generating global-scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.


Affiliation: Microbiology Group, Laboratory of Cellular and Molecular Biology, Faculty of Biological Sciences, USTHB, BP 32, EL ALIA, Bab Ezzouar, Algiers, Algeria.

ABSTRACT
Databases are an essential tool and resource within the field of bioinformatics. The primary aim of this study was to generate an overview of global bacterial biodiversity and biogeography using available data from the two largest public online databases, NCBI Nucleotide and GBIF. The secondary aim was to highlight the contribution each geographic area makes to each database. The basis for data analysis in this study was the metadata provided by both databases, mainly the taxonomy and the geographical origin of isolation of each microorganism (record). These metadata were obtained directly from GBIF through the online interface, while E-utilities and Python were used with programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This highlights the imbalance of exploration within these areas rather than any reduction in biodiversity. This study describes a novel approach to generating global-scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.
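
The abstract does not give the authors' scripts, so the following is only a minimal sketch of how a record count might be requested from the NCBI Nucleotide Database through the E-utilities ESearch endpoint in Python. The function names and the query term (a phylum in the Organism field combined with a country name in All Fields) are assumptions made for illustration, not the study's actual query.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public ESearch endpoint of the NCBI E-utilities.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(phylum, country):
    """Build an ESearch URL that asks only for the number of matching records.

    The search term below is an illustrative assumption; the study's real
    query strings are not given in the source.
    """
    term = f"{phylum}[Organism] AND {country}[All Fields]"
    params = {
        "db": "nucleotide",   # NCBI Nucleotide Database
        "term": term,
        "rettype": "count",   # return the hit count rather than ID lists
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_count(payload):
    """Extract the hit count from an ESearch JSON response body."""
    return int(json.loads(payload)["esearchresult"]["count"])


def fetch_count(phylum, country):
    """Send the request and return the record count (needs network access)."""
    with urlopen(build_count_url(phylum, country)) as response:
        return parse_count(response.read().decode("utf-8"))
```

Calling `fetch_count("Proteobacteria", "Algeria")` would return an integer count for that hypothetical query. Keeping URL construction and response parsing in separate functions lets both be checked without contacting NCBI.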


Figure 2: The world biogeography (a) by continent in (a1) GBIF Database and (a2) NCBI Nucleotide Database; (b) by country in (b1) GBIF Database and (b2) NCBI Nucleotide Database.

Mentions: Table 4 shows the occurrences of records by continent for both the NCBI Nucleotide and GBIF databases. The American continent has the largest number of records submitted, representing 39% of all registered records in the GBIF Database and more than 50% in the NCBI Nucleotide Database, yet only half (634,225) of these NCBI Nucleotide records are assigned to one of the 24 phyla. Europe (27%) and Australia-Oceania (16%) are second and third, respectively, in contributions to the GBIF Database, while Asia contributes a larger share to the NCBI Nucleotide Database (21%, ranking second) than to the GBIF Database (11%). Antarctica is less represented, accounting for 1% and 4% of the registered world bacterial biodiversity in the GBIF and NCBI Nucleotide databases, respectively. Finally, Africa accounts for nearly 3% of registered data in each database. The world maps of bacterial biogeography by continent are shown in Figures 2(a1) and 2(a2).
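
Continental shares like those quoted above are each region's record count divided by the database total. A minimal sketch of that arithmetic follows; the function name is an assumption, and the counts in the usage note are placeholders rather than values from Table 4.

```python
def percentage_shares(counts):
    """Convert per-region record counts into percentages of the total.

    `counts` maps a region name to its number of records.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no records to compute shares from")
    return {region: 100.0 * n / total for region, n in counts.items()}
```

For example, `percentage_shares({"RegionA": 1, "RegionB": 3})` yields 25% and 75% for the two hypothetical regions.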

