Conservation analysis of the CydX protein yields insights into small protein identification and evolution.

Allen RJ, Brenner EP, VanOrsdel CE, Hobson JJ, Hearn DJ, Hemm MR - BMC Genomics (2014)

Bottom Line: Further investigation of cydAB operons identified two additional conserved hypothetical small proteins: CydY encoded in CydAQlong operons that lack cydX, and CydZ encoded in more than 150 CydAQshort operons. These results elucidate the prevalence of CydX throughout the Proteobacteria, provide insight into the selection pressure and sequence requirements for CydX function, and suggest a potential functional interaction between the small protein and the CydA Q-loop, an enigmatic domain of the cytochrome bd oxidase complex. Finally, these results identify other conserved small proteins encoded in cytochrome bd oxidase operons, suggesting that small protein subunits may be a more common component of these enzymes than previously thought.


Affiliation: Department of Biological Sciences, Towson University, Towson, MD 21252, USA. mhemm@towson.edu.

ABSTRACT

Background: The reliable identification of proteins containing 50 or fewer amino acids is difficult due to the limited information content in short sequences. The 37 amino acid CydX protein in Escherichia coli is a member of the cytochrome bd oxidase complex, an enzyme found throughout Eubacteria. To investigate the extent of CydX conservation and prevalence and evaluate different methods of small protein homologue identification, we surveyed 1095 Eubacteria species for the presence of the small protein.

Results: Over 300 homologues were identified, including 80 unannotated genes. The ability of both closely-related and divergent homologues to complement the E. coli ΔcydX mutant supports our identification techniques, and suggests that CydX homologues retain similar function among divergent species. However, sequence analysis of these proteins shows a great degree of variability, with only a few highly-conserved residues. An analysis of the co-variation between CydX homologues and their corresponding cydA and cydB genes shows a close synteny of the small protein with the CydA long Q-loop. Phylogenetic analysis suggests that the cydABX operon has undergone horizontal gene transfer, although the cydX gene likely evolved in a progenitor of the Alpha, Beta, and Gammaproteobacteria. Further investigation of cydAB operons identified two additional conserved hypothetical small proteins: CydY encoded in CydAQlong operons that lack cydX, and CydZ encoded in more than 150 CydAQshort operons.

Conclusions: This study provides a systematic analysis of bioinformatics techniques required for the unique challenges present in small protein identification and phylogenetic analyses. These results elucidate the prevalence of CydX throughout the Proteobacteria, provide insight into the selection pressure and sequence requirements for CydX function, and suggest a potential functional interaction between the small protein and the CydA Q-loop, an enigmatic domain of the cytochrome bd oxidase complex. Finally, these results identify other conserved small proteins encoded in cytochrome bd oxidase operons, suggesting that small protein subunits may be a more common component of these enzymes than previously thought.


Figure 7: Phylogenetic analysis of CydX. (A) Phylogenetic analysis was conducted using concatenated CydABX protein sequences, and clades of CydABX sequences with strong statistical support are labeled by color. (B) Species containing specific CydABX sequences are labeled on the phylogenetic tree using bars of the same color as their clade in the phylogenetic analysis of the CydABX sequences. Species containing CydX homologues that are not contained in a cydABX operon are labeled with a black bar. The Alpha, Beta, Epsilon, Delta and Gamma labels identify the different classes in the phylum Proteobacteria. (C) Alignment of protein sequences of CydX homologues grouped into the “yellow clade” in the phylogenetic analysis. (D) Alignment of select protein sequences of CydX homologues grouped into the “grey clade” in the phylogenetic analysis. Gene names and sequences are shaded corresponding to the color used for that clade in the preceding phylogeny. Species are as follows: Pseudoalteromonas haloplanktis TAC125 (“Pseudoalteromonas(1)”), Pseudoalteromonas sp. SM9913 (“Pseudoalteromonas(2)”), Glaciecola sp. 4H-3-7+YE-5 (“Glaciecola”), Pseudoalteromonas atlantica T6c (“Pseudoalteromonas(3)”), Allochromatium vinosum DSM 180 (“Allochromatium”), Colwellia psychrerythraea 34H (“Colwellia”), Rhodospirillum photometricum DSM 122 (“Rhodospirillum”), Thiomonas intermedia K12 (“Thiomonas”), Bordetella avium 197N (“Bordetella”), Frateuria aurantia DSM 6220 (“Frateuria”), Acidiphilium cryptum JF-5 (“Acidiphilium(1)”), Acidiphilium multivorum AIU301 (“Acidiphilium(2)”), Acidithiobacillus ferrooxidans ATCC 53993 (“Acidithiobacillus(1)”), Acidithiobacillus caldus SM-1 (“Acidithiobacillus(2)”), Acetobacter pasteurianus IFO 3283-01 (“Acetobacter(1)”), and Acetobacter pasteurianus IFO 3283-01-42C (“Acetobacter(2)”). Alignments were generated using the program MUSCLE [57]. ‘*’ indicates that the residues are identical in all sequences, and ‘:’ and ‘.’ indicate conserved and semi-conserved substitutions, respectively, as defined by MUSCLE.

Mentions: To investigate the prevalence of CydX horizontal gene transfer, we attempted to create a phylogenetic tree based on CydX and superimpose this tree on a reference phylogenetic tree of the 1095 taxa screened in the study. In this way we could identify instances where the phylogenetic relationships between CydX homologues were incongruent with the overall phylogeny, which may be an indication of horizontal gene transfer. The maximum likelihood bootstrap values for the trees based solely on the CydX amino acid or DNA sequence were low and provided insufficient confidence to infer relationships among gene copies, let alone horizontal transfer events (unpublished data). These low bootstrap values are likely due to the limited sequence available for comparison and the high sequence variability between CydX homologues. To overcome this problem we took into consideration that, outside of a few orphan genes, there were no observed gains or losses of cydX independent of cydAB. Thus, the cydABX operon might be considered as one evolutionary unit, and a phylogeny could be constructed based on the concatenated sequences of all three proteins. A phylogenetic tree of 280 concatenated protein sequences was constructed using this methodology and produced a tree with 11 main clades with higher than 80% bootstrap support (Figure 7A). The CydX sequences within each operon were then aligned separately to identify sequence homologies that were unique for each clade. Protein alignments and sequence logos of the CydX proteins within the clades show that each contains shared and derived sequence motifs (synapomorphies) (Additional file 6), supporting the idea that the operon phylogeny accurately reflects that of CydX.
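The concatenation step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each protein (CydA, CydB, CydX) has already been aligned separately, for example with MUSCLE, and that rows are keyed by a taxon label. The function name `concatenate_alignments`, the data layout, and the toy sequences are all hypothetical and chosen only for illustration. Operons missing any of the three proteins are dropped, mirroring the idea of treating cydABX as one evolutionary unit.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: taxon label -> aligned sequence (gaps written as '-')
Alignment = Dict[str, str]


def concatenate_alignments(
    alignments: Dict[str, Alignment], gene_order: List[str]
) -> Tuple[Alignment, List[Tuple[str, int, int]]]:
    """Join per-gene alignments into one concatenated row per taxon.

    Returns the concatenated alignment and the (gene, start, end) column
    range of each gene, using 0-based, end-exclusive coordinates.
    """
    # Keep only taxa that appear in every gene alignment
    shared = set.intersection(*(set(alignments[g]) for g in gene_order))

    partitions = []
    start = 0
    for gene in gene_order:
        # Every row of a single-gene alignment must have the same width
        lengths = {len(seq) for seq in alignments[gene].values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene} alignment has unequal row lengths")
        width = lengths.pop()
        partitions.append((gene, start, start + width))
        start += width

    concatenated = {
        taxon: "".join(alignments[g][taxon] for g in gene_order)
        for taxon in sorted(shared)
    }
    return concatenated, partitions


# Toy, made-up sequences (not real CydA/CydB/CydX data)
toy = {
    "CydA": {"taxon1": "MK-L", "taxon2": "MKAL", "taxon3": "MRAL"},
    "CydB": {"taxon1": "AV", "taxon2": "AI"},
    "CydX": {"taxon1": "MWY", "taxon2": "MFY"},
}
supermatrix, parts = concatenate_alignments(toy, ["CydA", "CydB", "CydX"])
```

In this toy input, `taxon3` lacks CydB and CydX and is therefore excluded. The returned column ranges could be passed to a tree-building program as partitions, so that the short CydX block can be inspected separately from the much longer CydA and CydB blocks.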

