Towards barcode markers in Fungi: an intron map of Ascomycota mitochondria.

Santamaria M, Vicario S, Pappadà G, Scioscia G, Scazzocchio C, Saccone C - BMC Bioinformatics (2009)

Bottom Line: A standardized and cost-effective molecular identification system is now urgently needed for Fungi, owing to their wide involvement in human life quality. A new query approach has been developed to retrieve effectively the intron information included in public database entries. Within the resulting intron map, despite the pervasiveness of introns, it is possible to distinguish specific regions in several genes, including the full NADH dehydrogenase subunit 6 (ND6) gene, which could be considered as barcode candidates for Ascomycota because of their paucity of introns and their length, above 400 bp, which is comparable to the lower end of the length range of barcodes successfully used in animals.


Affiliation: CNR - Istituto di Tecnologie Biomediche, Sede di Bari, Via Amendola 122/D, Bari, 70126, Italy. monica.santamaria@ba.itb.cnr.it

ABSTRACT

Background: A standardized and cost-effective molecular identification system is now urgently needed for Fungi, owing to their wide involvement in human life quality. In particular, the potential use of mitochondrial DNA species markers has been taken into account. Unfortunately, the presence of mobile introns in almost all fungal mitochondrial genes poses a serious difficulty for PCR and bioinformatic surveys. The aim of this work is to verify the incidence of this phenomenon in Ascomycota while testing a new bioinformatic tool for extracting and managing sequence database annotations, in order to identify the mitochondrial gene regions where introns are missing and to propose them as species markers.

Methods: The general trend towards a high occurrence of introns in the mitochondrial genome of Fungi has been confirmed in Ascomycota by an extensive bioinformatic analysis of all entries available in public nucleotide sequence databases for 11 mitochondrial protein-coding genes and 2 mitochondrial ribosomal RNA (rRNA) genes belonging to this phylum. A new query approach has been developed to retrieve effectively the intron information included in these entries.

Results: The new query-based approach was compared with a Blast-based procedure, with the aim of designing a faithful Ascomycota mitochondrial intron map, and the query-based method was clearly the more accurate. Within this map, despite the pervasiveness of introns, it is possible to distinguish specific regions in several genes, including the full NADH dehydrogenase subunit 6 (ND6) gene, which could be considered as barcode candidates for Ascomycota because of their paucity of introns and their length, above 400 bp, which is comparable to the lower end of the length range of barcodes successfully used in animals.

Conclusion: The new query system described here would answer the pressing need to improve drastically the bioinformatics support to the DNA Barcode Initiative. The large-scale investigation of Ascomycota mitochondrial introns performed with this tool, which allows intron-rich sequences to be excluded from the search for barcode candidates, could be the first step towards a mitochondrial barcoding strategy for these organisms, similar to the standard approach employed in metazoans.
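
As an illustration of the region-selection idea in the Results, the following minimal Python sketch lists stretches of a coding sequence that carry no intron insertion site and are at least 400 bp long. It is not the authors' tool: the function name `intron_free_regions`, the coordinate convention, and all example numbers are assumptions made for illustration only.

```python
def intron_free_regions(cds_length, insertion_sites, min_length=400):
    """Return (start, end) stretches of a coding sequence with no intron insertion site.

    cds_length: length in bp of the coding sequence with introns removed.
    insertion_sites: positions p meaning "an intron is inserted between bases p and p+1".
    Coordinates are 1-based and inclusive (an assumed convention).
    """
    regions = []
    start = 1
    # Walk through the insertion sites in order; the end of the gene closes the last stretch.
    for site in sorted(set(insertion_sites)) + [cds_length]:
        if site - start + 1 >= min_length:
            regions.append((start, site))
        start = site + 1
    return regions


# Hypothetical intron-rich gene: insertion sites pooled over many database entries.
rich_gene = intron_free_regions(1600, [210, 390, 395, 900, 1100])

# Hypothetical gene with no annotated introns: the whole gene is one candidate region.
intron_less_gene = intron_free_regions(600, [])
```

With these made-up inputs, the intron-rich gene yields two candidate stretches, (396, 900) and (1101, 1600), while the gene without introns is returned whole as (1, 600). The 400 bp threshold follows the length quoted in the Results; pooling insertion sites across entries is a design choice of this sketch.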

Figure 5: Intron size distribution as estimated by the Blast-based protocol. The size distribution in each gene (gene acronyms are marked on the left side of the graph) is shown with a so-called "barcode graph", in which each small vertical bar represents the size of a given intron and each dot on top of a vertical bar represents a further intron of the same size. One record (AY955840 and its equivalent from the full genome collection, NC_007935) was removed from the distribution because its size value was too large (>50000 bp) to be easily drawn on the graph.

Mentions: Looking at the distribution of intron sizes (Figure 5), the values range between 500 and 3000 bp, with few exceptions; the median value is 1269 bp and the estimated mode is 1240.457 bp. The Blast-based approach relies on the recognition of sequence similarity (the similarity score between the probe and each sequence in the database depends on the fraction of identical residues and on the lengths of the matching regions). An intrinsic problem of this approach is that small exons may not be identified, so several introns can be joined together, producing erroneous results, as shown in Figure 5, where unrealistic introns of over 25000 bp emerge. For this reason, a query-based approach that calculates intron positions and sizes directly from database annotations appeared immediately desirable.
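
The merging problem described above can be made concrete with a small Python sketch. It computes intron sizes from exon coordinates, first with all annotated exons and then with a short exon left out, as a similarity search might do. The helper `intron_sizes`, the exon coordinates, and the 50 bp detection cutoff are all hypothetical; this is not the protocol used in the paper.

```python
from statistics import median


def intron_sizes(exons):
    """Return the intron lengths implied by a list of (start, end) exon coordinates.

    Coordinates are 1-based and inclusive, as in a GenBank-style join(...) location.
    """
    ordered = sorted(exons)
    return [next_start - prev_end - 1
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]


# Hypothetical gene with three exons; the middle exon is only 20 bp long.
annotated_exons = [(1, 300), (1501, 1520), (2821, 3200)]

# Annotation-based view: every exon is listed, so both introns are recovered.
from_annotation = intron_sizes(annotated_exons)

# Similarity-based view: assume exons shorter than 50 bp are not matched
# (an illustrative cutoff), so the flanking introns fuse into one long "intron".
detected_exons = [(s, e) for s, e in annotated_exons if e - s + 1 >= 50]
from_similarity = intron_sizes(detected_exons)

median_annotated = median(from_annotation)
```

Here the annotation gives two introns of 1200 and 1300 bp, whereas dropping the 20 bp exon yields a single apparent intron of 2520 bp (1200 + 20 + 1300), which is how a few missed exons can produce implausibly long introns.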

