Conservation analysis of the CydX protein yields insights into small protein identification and evolution.

Allen RJ, Brenner EP, VanOrsdel CE, Hobson JJ, Hearn DJ, Hemm MR - BMC Genomics (2014)

Bottom Line: Further investigation of cydAB operons identified two additional conserved hypothetical small proteins: CydY encoded in CydAQlong operons that lack cydX, and CydZ encoded in more than 150 CydAQshort operons. These results elucidate the prevalence of CydX throughout the Proteobacteria, provide insight into the selection pressure and sequence requirements for CydX function, and suggest a potential functional interaction between the small protein and the CydA Q-loop, an enigmatic domain of the cytochrome bd oxidase complex. Finally, these results identify other conserved small proteins encoded in cytochrome bd oxidase operons, suggesting that small protein subunits may be a more common component of these enzymes than previously thought.

Affiliation: Department of Biological Sciences, Towson University, Towson, MD 21252, USA. mhemm@towson.edu.

ABSTRACT

Background: The reliable identification of proteins containing 50 or fewer amino acids is difficult due to the limited information content in short sequences. The 37 amino acid CydX protein in Escherichia coli is a member of the cytochrome bd oxidase complex, an enzyme found throughout Eubacteria. To investigate the extent of CydX conservation and prevalence and evaluate different methods of small protein homologue identification, we surveyed 1095 Eubacteria species for the presence of the small protein.

Results: Over 300 homologues were identified, including 80 unannotated genes. The ability of both closely-related and divergent homologues to complement the E. coli ΔcydX mutant supports our identification techniques, and suggests that CydX homologues retain similar function among divergent species. However, sequence analysis of these proteins shows a great degree of variability, with only a few highly-conserved residues. An analysis of the co-variation between CydX homologues and their corresponding cydA and cydB genes shows a close synteny of the small protein with the CydA long Q-loop. Phylogenetic analysis suggests that the cydABX operon has undergone horizontal gene transfer, although the cydX gene likely evolved in a progenitor of the Alpha, Beta, and Gammaproteobacteria. Further investigation of cydAB operons identified two additional conserved hypothetical small proteins: CydY encoded in CydAQlong operons that lack cydX, and CydZ encoded in more than 150 CydAQshort operons.

Conclusions: This study provides a systematic analysis of bioinformatics techniques required for the unique challenges present in small protein identification and phylogenetic analyses. These results elucidate the prevalence of CydX throughout the Proteobacteria, provide insight into the selection pressure and sequence requirements for CydX function, and suggest a potential functional interaction between the small protein and the CydA Q-loop, an enigmatic domain of the cytochrome bd oxidase complex. Finally, these results identify other conserved small proteins encoded in cytochrome bd oxidase operons, suggesting that small protein subunits may be a more common component of these enzymes than previously thought.

Figure 10: New cyd-related small proteins identified in this study. (A) The CydY small protein found in Epsilon and Deltaproteobacteria species downstream of cydAB operons encoding CydA with a long Q-loop. (B) The CydZ small protein found in over 150 cydAB operons encoding CydA with a short Q-loop. Operon organization is shown at the top of each panel, with an example alignment below it and a consensus sequence logo at the bottom. Species are as follows: Desulfurispirillum indicum S5 (“Desulfurispirillum”), Campylobacter concisus 13826 (“Campylobacter”), Sulfuricurvum kujiense DSM 16994 (“Sulfuricurvum”), Arcobacter butzleri RM4018 (“Arcobacter”), Campylobacter jejuni subsp. doylei 269.97 (“Campylobacter”), Serratia sp. AS12 (“Serratia”), Vibrio parahaemolyticus RIMD 2210663 (“Vibrio”), Enterobacter aerogenes KCTC 2190 (“Enterobacter”), Pseudomonas aeruginosa LESB58 (“Pseudomonas”), Achromobacter xylosoxidans A8 (“Achromobacter”), Bordetella parapertussis 12822 (“Bordetella”), Zymomonas mobilis subsp. mobilis ZM4 (“Zymomonas”). Sequence logos were generated using the program WebLogo [57]. Alignments were generated using the program MUSCLE [54]. ‘*’ indicates that the residues are identical in all sequences, and ‘:’ and ‘.’ indicate conserved and semi-conserved substitutions, respectively, as defined by MUSCLE.
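
To make the ‘*’ annotation in the alignments concrete, the following is a minimal Python sketch that marks alignment columns in which every sequence carries the same residue. It is an illustration only, not the MUSCLE implementation: the function name `identity_line` and the toy sequences are hypothetical and are not taken from the paper, and the ‘:’ and ‘.’ marks are omitted because they depend on substitution groups that are not given here.

```python
def identity_line(aligned):
    """Return a string with '*' under each column whose residues are all identical.

    `aligned` is a list of equal-length, already-aligned sequences in which
    '-' denotes a gap. Columns containing a gap are never marked.
    """
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    marks = []
    for column in zip(*aligned):  # iterate over alignment columns
        first = column[0]
        identical = first != "-" and all(residue == first for residue in column)
        marks.append("*" if identical else " ")
    return "".join(marks)


# Toy alignment (hypothetical sequences, for illustration only).
toy = ["MWYF-", "MWAFK", "MWYFK"]
for seq in toy:
    print(seq)
print(identity_line(toy))
```

In this toy case the columns holding M, W and F are marked, while the mixed Y/A column and the gapped final column are left blank.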

Mentions: Since 11% of the cydAB operons that contain CydAQlong do not encode CydX, we hypothesized that these operons may encode one or more previously uncharacterized small proteins that could potentially serve similar functions to CydX. To test this possibility, we manually screened the 35 CydAQlong-containing operons to determine if there are other sORFs downstream of the cydB gene. In 15 species we identified a conserved sORF located downstream of cydB (Figure 6A, Additional files 1, 4 and 7) that could encode a small protein predicted to contain a transmembrane α-helix (Figure 10A and Additional file 7). Although the amino acid sequence of these small proteins is more divergent than CydX, all the proteins contain an absolutely conserved tryptophan located at the beginning of the conserved α-helix, similar to CydX (Figure 10A). All of the homologues we identified also contain strong ribosome binding sites (Additional file 7) and are encoded downstream of cydB, suggesting that they are transcribed with the operon and translated. In addition, when we examined the distribution of CydA proteins that contain this sORF, they grouped in a single clade adjacent to those containing CydX (Figure 8). Together, these data suggest that in these operons, a different small protein has evolved to function in the CydAB complex. We are referring to this protein as CydY.
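
The screen described above was performed manually. As a rough illustration of what a first-pass computational scan for candidate sORFs downstream of a gene might look like, here is a minimal Python sketch. It is not the authors' method. The function name `find_sorfs`, the 10 to 50 codon size window, the restriction to ATG start codons and to the forward strand, and the toy sequence are all assumptions made for this example; a real screen would also consider alternative start codons, the reverse strand, ribosome binding sites and conservation across species.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(dna, min_aa=10, max_aa=50):
    """Scan the forward strand for short ORFs (ATG ... in-frame stop).

    Returns a list of (start, end, n_aa) tuples, where start/end are 0-based
    half-open coordinates that include the stop codon, and n_aa counts the
    encoded residues (start codon included, stop codon excluded).
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] != "ATG":
                i += 3
                continue
            # Extend codon by codon until an in-frame stop or the sequence end.
            j = i + 3
            while j + 3 <= len(dna) and dna[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(dna):
                break  # no in-frame stop remains in this frame
            n_aa = (j - i) // 3
            if min_aa <= n_aa <= max_aa:
                hits.append((i, j + 3, n_aa))
            i = j + 3  # resume scanning after the stop codon
    return hits


# Toy sequence (hypothetical): a 12-codon ORF flanked by filler bases.
toy_dna = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
print(find_sorfs(toy_dna))
```

Candidates returned by such a scan would still need the kinds of supporting evidence cited in the text, such as a plausible ribosome binding site and conservation among related species, before being treated as likely genes.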

