Conservation analysis of the CydX protein yields insights into small protein identification and evolution.

Allen RJ, Brenner EP, VanOrsdel CE, Hobson JJ, Hearn DJ, Hemm MR - BMC Genomics (2014)

Bottom Line: Further investigation of cydAB operons identified two additional conserved hypothetical small proteins: CydY encoded in CydAQlong operons that lack cydX, and CydZ encoded in more than 150 CydAQshort operons. These results elucidate the prevalence of CydX throughout the Proteobacteria, provide insight into the selection pressure and sequence requirements for CydX function, and suggest a potential functional interaction between the small protein and the CydA Q-loop, an enigmatic domain of the cytochrome bd oxidase complex. Finally, these results identify other conserved small proteins encoded in cytochrome bd oxidase operons, suggesting that small protein subunits may be a more common component of these enzymes than previously thought.


Affiliation: Department of Biological Sciences, Towson University, Towson, MD 21252, USA. mhemm@towson.edu.

ABSTRACT

Background: The reliable identification of proteins containing 50 or fewer amino acids is difficult due to the limited information content in short sequences. The 37 amino acid CydX protein in Escherichia coli is a member of the cytochrome bd oxidase complex, an enzyme found throughout Eubacteria. To investigate the extent of CydX conservation and prevalence and evaluate different methods of small protein homologue identification, we surveyed 1095 Eubacteria species for the presence of the small protein.

Results: Over 300 homologues were identified, including 80 unannotated genes. The ability of both closely-related and divergent homologues to complement the E. coli ΔcydX mutant supports our identification techniques, and suggests that CydX homologues retain similar function among divergent species. However, sequence analysis of these proteins shows a great degree of variability, with only a few highly-conserved residues. An analysis of the co-variation between CydX homologues and their corresponding cydA and cydB genes shows a close synteny of the small protein with the CydA long Q-loop. Phylogenetic analysis suggests that the cydABX operon has undergone horizontal gene transfer, although the cydX gene likely evolved in a progenitor of the Alpha, Beta, and Gammaproteobacteria. Further investigation of cydAB operons identified two additional conserved hypothetical small proteins: CydY encoded in CydAQlong operons that lack cydX, and CydZ encoded in more than 150 CydAQshort operons.

Conclusions: This study provides a systematic analysis of bioinformatics techniques required for the unique challenges present in small protein identification and phylogenetic analyses. These results elucidate the prevalence of CydX throughout the Proteobacteria, provide insight into the selection pressure and sequence requirements for CydX function, and suggest a potential functional interaction between the small protein and the CydA Q-loop, an enigmatic domain of the cytochrome bd oxidase complex. Finally, these results identify other conserved small proteins encoded in cytochrome bd oxidase operons, suggesting that small protein subunits may be a more common component of these enzymes than previously thought.


Figure 6: Distribution of cydA, cydB, cydX and other cyd-related small proteins throughout bacteria. (A) Phylogenetic tree of 1095 species from major Eubacterial clades overlaid with the presence of the different cyd genes in each species. Gene identifications in a bacterial genome are labeled as follows: species adjacent to a red bar contain at least one cydA gene, those adjacent to a blue bar contain at least one cydB gene, those adjacent to a green bar contain at least one cydX gene, those adjacent to a yellow bar contain at least one cydZ gene, and those adjacent to a black bar contain at least one cydY gene. Major bacterial clades are labeled. The Alpha, Beta, Epsilon, Delta and Gamma labels identify the different classes in the Proteobacteria phylum. (B) Alignment of representative homologues identified from major bacterial clades. Gene names and sequences are shaded corresponding to the color used for that clade in the preceding phylogeny, while pISP1 and pRLG204 are not colored because they are not represented in the tree. Species are as follows: Shigella flexneri 2a str. 2457T (“Enterobacteriaceae”), Legionella pneumophila 2300/99 Alcoy (“Legionellaceae”), Hyphomonas neptunium ATCC15444 (“Hyphomonadaceae”), Asticcacaulis excentricus CB 48 (“Caulobacteraceae”), Laribacter hongkongensis HLHK9 (“Neisseriaceae”), Achromobacter xylosoxidans A8 (“Alcaligenaceae”), Mariprofundus ferrooxydans PV-1 1099921033905 (“Mariprofundaceae”), Sphingomonas sp. MM-1 plasmid pISP1 (“pISP1”), and Rhizobium leguminosarum bv. trifolii WSM2304 plasmid pRLG204 (“pRLG204”). Alignments were generated using the program MUSCLE [57]. ‘*’ indicates that the residues are identical in all sequences, and ‘:’ and ‘.’ indicate conserved and semi-conserved substitutions, respectively, as defined by MUSCLE.

Mentions: Of the 1095 species screened in our original analysis, all 259 CydX-containing species were found to be members of the Alpha, Beta, and Gamma classes of the Proteobacteria. In contrast to the cladistically-limited distribution of CydX, CydA and CydB homologues were identified in species that range through almost all phyla included in the analysis (Figure 6A and Additional file 4). The difference in distribution between CydX and CydAB is consistent with the idea that CydAB evolved earlier than the CydX small protein. Given the evolutionary model that Alpha, Beta, and Gamma classes diverged after the earlier branching of Delta and Epsilonproteobacteria [32], the distribution of CydX suggests that it may have evolved in association with the cydAB operon in a progenitor of the Alpha, Beta, and Gammaproteobacteria clades.
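
The distribution argument above rests on a per-clade presence/absence tally. The following is a minimal sketch of that kind of tally, not the authors' pipeline: the records are made-up toy entries (not data from the study), and `clades_with_gene` is a hypothetical helper name introduced only for illustration.

```python
# Toy records: (species label, clade, set of cyd genes detected).
# These entries are illustrative assumptions, not results from the paper.
RECORDS = [
    ("species_1", "Gammaproteobacteria", {"cydA", "cydB", "cydX"}),
    ("species_2", "Alphaproteobacteria", {"cydA", "cydB", "cydX"}),
    ("species_3", "Betaproteobacteria", {"cydA", "cydB"}),
    ("species_4", "Deltaproteobacteria", {"cydA", "cydB"}),
    ("species_5", "Firmicutes", {"cydA", "cydB"}),
]


def clades_with_gene(records, gene):
    """Return the sorted clades in which at least one species carries `gene`."""
    found = set()
    for _species, clade, genes in records:
        if gene in genes:
            found.add(clade)
    return sorted(found)


# A gene restricted to a few clades (like cydX in the toy data) yields a
# short list; a broadly distributed gene (like cydA) yields a long one.
print(clades_with_gene(RECORDS, "cydX"))
print(clades_with_gene(RECORDS, "cydA"))
```

In a real survey the records would come from homologue searches across the screened genomes; the comparison of the two lists is what supports the inference that the broadly distributed genes predate the restricted one.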


