Structure and variation of the mitochondrial genome of fishes


ABSTRACT

Background: The mitochondrial (mt) genome has been used as an effective tool for phylogenetic and population genetic analyses in vertebrates. However, the structure and variability of the vertebrate mt genome are not well understood. A potential strategy for improving our understanding is to conduct a comprehensive comparative study of large mt genome data. The aim of this study was to characterize the structure and variability of the fish mt genome through comparative analysis of large datasets.

Results: An analysis of the secondary structure of proteins for 250 fish species (248 ray-finned and 2 cartilaginous fishes) showed that cytochrome c oxidase subunits (COI, COII, and COIII) and a cytochrome bc1 complex subunit (Cyt b) had substantial amino acid conservation. Among the four proteins, COI was the most conserved, as more than half of all amino acid sites were invariable among the 250 species. Our models identified 43 and 58 stems within 12S rRNA and 16S rRNA, respectively, more than previously proposed for vertebrates. The models also identified 149 and 319 invariable sites in 12S rRNA and 16S rRNA, respectively, in all fishes. In particular, the present result verified that a region corresponding to the peptidyl transferase center in prokaryotic 23S rRNA, which is homologous to mt 16S rRNA, is also conserved in fish mt 16S rRNA. Concerning the gene order, we found 35 variations (in 32 families) that deviated from the common gene order in vertebrates. These gene rearrangements were mostly observed in the area spanning the ND5 gene to the control region as well as in two tRNA gene cluster regions (IQM and WANCY regions). Although many of these gene rearrangements were unique to a specific taxon, some were shared polyphyletically between distantly related species.
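To make the notion of an "invariable site" concrete, the following is a minimal sketch of how such sites could be counted in a multiple sequence alignment. It is not the authors' pipeline: the function name `invariable_sites`, the treatment of gap columns as variable, and the toy sequences are all assumptions made for illustration.

```python
def invariable_sites(aligned_seqs, gap_chars="-"):
    """Return 0-based column indices where every sequence has the same residue.

    Assumption (not from the article): columns containing a gap character
    are treated as variable.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    sites = []
    for i in range(length):
        # Collect the distinct residues observed in this alignment column.
        column = {seq[i].upper() for seq in aligned_seqs}
        if len(column) == 1 and column.isdisjoint(gap_chars):
            sites.append(i)
    return sites


# Toy alignment (hypothetical data, not taken from the study).
toy_alignment = ["ACGTA", "ACGTT", "ACCTA"]
print(invariable_sites(toy_alignment))
```

Applied to an alignment of all 250 species, the length of the returned list would correspond to counts such as the 149 and 319 invariable rRNA sites reported above, provided the same alignment and gap conventions were used.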

Conclusions: Through a large-scale comparative analysis of the mt genomes of 250 fish species, we elucidated various structural aspects of the fish mt genome and the encoded genes. The present results will be important for understanding the functions of the mt genome and for developing programs for nucleotide sequence analysis. This study demonstrated the significance of extensive comparisons for understanding the structure of the mt genome.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3054-y) contains supplementary material, which is available to authorized users.



Fig. 1: Linearized representation of the typical vertebrate gene order (circled t) and rearranged gene orders observed in fishes. All protein-coding genes are encoded on the H-strand with the exception of ND6 (underlined), which is encoded on the L-strand. Transfer RNA (tRNA) genes are designated by single-letter amino acid codes, and those encoded on the H- and L-strands are presented above and below the gene map, respectively. Capitalized A–G denote mt genome regions involved in major gene rearrangements. Coloured shadings (circled a–d) indicate fish groups that share the same unique gene order at least in part, and are related to Fig. 10. Numerals in front of the species name are the same as those in Additional file 1. 12S and 16S, 12S and 16S ribosomal RNA genes, respectively; ATPase 6 and 8, ATPase subunit 6 and 8 genes, respectively; COI–III, cytochrome c oxidase subunits I–III genes, respectively; CR, putative control region; Cyt b, cytochrome b gene; L1, L2, S1, and S2, tRNA Leu (UUR), tRNA Leu (CUN), tRNA Ser (UCN), and tRNA Ser (AGY) genes, respectively; NC, noncoding sequences of ≥50 bp; ND1–6 and 4L, NADH dehydrogenase subunit 1–6 and 4L genes, respectively.
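A deviation from the typical gene order can be described by listing which neighbouring gene pairs of the reference order are no longer adjacent. The sketch below illustrates that idea under stated assumptions: `TYPICAL_ORDER` is the commonly reported vertebrate arrangement written with the abbreviations of this caption (OL and strand information are omitted for simplicity), and `broken_adjacencies` is a hypothetical helper, not a method described in the article.

```python
# Commonly reported vertebrate mt gene order (assumption: OL and strand
# orientation are left out to keep the example small).
TYPICAL_ORDER = [
    "F", "12S", "V", "16S", "L1", "ND1", "I", "Q", "M", "ND2",
    "W", "A", "N", "C", "Y", "COI", "S1", "D", "COII", "K",
    "ATPase8", "ATPase6", "COIII", "G", "ND3", "R", "ND4L", "ND4",
    "H", "S2", "L2", "ND5", "ND6", "E", "Cyt b", "T", "P", "CR",
]


def adjacencies(order):
    """Ordered neighbour pairs of a circular gene map."""
    n = len(order)
    return [(order[i], order[(i + 1) % n]) for i in range(n)]


def broken_adjacencies(observed, reference=TYPICAL_ORDER):
    """Reference neighbour pairs that are missing from the observed order.

    Because the map is treated as circular, the starting gene of the
    observed list does not matter.
    """
    observed_pairs = set(adjacencies(observed))
    return [pair for pair in adjacencies(reference) if pair not in observed_pairs]


# Hypothetical rearrangement for illustration: I and Q swapped in the IQM cluster.
example = list(TYPICAL_ORDER)
i = example.index("I")
example[i], example[i + 1] = example[i + 1], example[i]
print(broken_adjacencies(example))
```

Comparing adjacencies rather than absolute positions means that a single local rearrangement is reported only where neighbours change, which matches the way the figure groups rearrangements by region.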

The 250 fish mt genomes compared in this study contained 37 genes (13 protein-coding, 22 tRNA, and 2 rRNA genes) and 2 noncoding regions (CR and OL), as typically found in other vertebrates, with the exception of Limnichthys fasciatus (barred sand burrower), in which the ND6 gene was not identified (Fig. 1: D: 215). The gene may be sandwiched between two CR-like regions, as found in the mt genome of notothenioid fishes, whose ND6 gene was first missed and then found in the sandwiched region [46]. Alternatively, the ND6 gene in Limnichthys fasciatus may have been missed during PCR or sequence assembly. This possibility should be examined by genomic hybridization analysis. In addition, as observed in other vertebrates, most genes were encoded on the H-strand; the exceptions were the ND6 gene and eight tRNA genes, which were encoded on the L-strand. In the following cases, sequences were not completely determined owing to the existence of a long homopolymer (e.g., TTTTTT…) in the mt genome that prevented sequencing reactions: the tRNA-Pro gene of Lampris guttatus (opah), the 12S rRNA gene of Brama japonica (Pacific pomfret), the ND1 gene of Synbranchus marmoratus (marbled swamp eel), and the CR of 68 fishes.
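Long homopolymer runs of the kind mentioned above can be located in a sequence with a simple regular expression. The following is a minimal sketch; the function name `find_homopolymers` and the default threshold of 10 bases are assumptions for illustration, since the text does not state the run length at which sequencing failed.

```python
import re


def find_homopolymers(seq, min_len=10):
    """Return (start, length, base) for each run of one base with length >= min_len.

    Assumption: min_len=10 is an arbitrary illustrative threshold and must be >= 1.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One base followed by at least (min_len - 1) repeats of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.end() - m.start(), m.group(1))
        for m in pattern.finditer(seq.upper())
    ]


# Hypothetical sequence fragment containing a run of 12 T bases.
fragment = "ACG" + "T" * 12 + "GCA"
print(find_homopolymers(fragment))
```

A scan like this could flag regions worth re-sequencing with a different chemistry or primer placement before concluding that a gene is incomplete.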

