Signature proteins for the major clades of Cyanobacteria.

Gupta RS, Mathews DW - BMC Evol. Biol. (2010)

Bottom Line: We also describe 3 conserved indels in flavoprotein, heme oxygenase and protochlorophyllide oxidoreductase proteins that are specific for either Clade C cyanobacteria or for various subclades of Prochlorococcus. These signature proteins and indels provide novel means for circumscription of various cyanobacterial clades in clear molecular terms. Their functional studies should lead to discovery of novel properties that are unique to these groups of cyanobacteria.


Affiliation: Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada. gupta@mcmaster.ca

ABSTRACT

Background: The phylogeny and taxonomy of cyanobacteria are currently poorly understood owing to a paucity of reliable markers for the identification and circumscription of the major clades.

Results: A combination of phylogenomic and protein signature based approaches was used to characterize the major clades of cyanobacteria. Phylogenetic trees were constructed for 44 cyanobacteria based on 44 conserved proteins. In parallel, Blastp searches were carried out on each ORF in the genomes of Synechococcus WH8102, Synechocystis PCC6803, Nostoc PCC7120, Synechococcus JA-3-3Ab, Prochlorococcus MIT9215 and Prochlorococcus marinus subsp. marinus CCMP1375 to identify proteins that are specific for the main clades of cyanobacteria. These studies have identified 39 proteins that are specific for all (or most) cyanobacteria and large numbers of proteins for other cyanobacterial clades. The identified signature proteins include: (i) 14 proteins for a deep branching clade (Clade A) of Gloeobacter violaceus and two diazotrophic Synechococcus strains (JA-3-3Ab and JA2-3-B'a); (ii) 5 proteins that are present in all other cyanobacteria except those from Clade A; (iii) 60 proteins that are specific for a clade (Clade C) consisting of various marine unicellular cyanobacteria (viz. Synechococcus and Prochlorococcus); (iv) 14 and 19 signature proteins that are specific for the Clade C Synechococcus and Prochlorococcus strains, respectively; (v) 67 proteins that are specific for the Low B/A ecotype Prochlorococcus strains, which contain a lower ratio of chl b/a2 and are adapted to growth at high light intensities; (vi) 65 and 8 proteins that are specific for the Nostocales and Chroococcales orders, respectively; and (vii) 22 and 9 proteins that are uniquely shared by the Nostocales and Oscillatoriales orders, or by these two orders and the Chroococcales, respectively. We also describe 3 conserved indels in flavoprotein, heme oxygenase and protochlorophyllide oxidoreductase proteins that are specific for either Clade C cyanobacteria or for various subclades of Prochlorococcus. Many other conserved indels for cyanobacterial clades have been described recently.

Conclusions: These signature proteins and indels provide novel means for circumscription of various cyanobacterial clades in clear molecular terms. Their functional studies should lead to discovery of novel properties that are unique to these groups of cyanobacteria.
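
The screening step summarized in the Results (Blastp searches on each ORF, followed by a check of which genomes contain significant hits) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy hit table and the `min_fraction` threshold are all assumptions made for illustration, and the significance criteria used in the paper are not stated in this excerpt. The sketch only shows the idea behind "specific for all (or most)" members of a clade: a query counts as a signature protein when none of its hits fall outside the clade and the hits cover most of the clade.

```python
from typing import Dict, List, Set


def is_clade_specific(hit_genomes: Set[str], clade: Set[str],
                      min_fraction: float = 0.8) -> bool:
    """True if every significant hit lies inside the clade and the hits
    cover at least min_fraction of the clade members (assumed threshold)."""
    if not hit_genomes:
        return False
    hits_outside = hit_genomes - clade
    coverage = len(hit_genomes & clade) / len(clade)
    return not hits_outside and coverage >= min_fraction


def signature_proteins(hits: Dict[str, Set[str]], clade: Set[str],
                       min_fraction: float = 0.8) -> List[str]:
    """Return the sorted query IDs whose hit distribution is clade-specific."""
    return sorted(query for query, genomes in hits.items()
                  if is_clade_specific(genomes, clade, min_fraction))


# Hypothetical input: query ORF -> genomes with a significant Blastp hit.
clade_a = {"G_violaceus", "Syn_JA-3-3Ab", "Syn_JA2-3-B'a"}
hits = {
    "orf1": {"G_violaceus", "Syn_JA-3-3Ab", "Syn_JA2-3-B'a"},  # whole clade, nothing else
    "orf2": {"G_violaceus", "Nostoc_PCC7120"},                 # hit outside the clade
    "orf3": {"G_violaceus", "Syn_JA-3-3Ab"},                   # only 2 of 3 members
}

print(signature_proteins(hits, clade_a))
```

With the toy table above, only `orf1` passes the default threshold; lowering `min_fraction` to 0.6 would also admit `orf3`, which is one way to express the "(or most)" qualifier.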


Figure 1: A maximum-likelihood distance tree for sequenced cyanobacteria based on concatenated sequences for 44 conserved proteins. The distance scale (bar = 0.1 substitutions per site) is shown in the top right-hand corner. The tree was rooted using B. subtilis and S. aureus sequences. The numbers at the nodes indicate the % of puzzling quartets supporting each node. The low B/A ecotype clade refers to the Prochlorococcus spp. that contain a lower ratio of chlorophyll b/a2 and are adapted to growth at high light intensities.

Mentions: A rooted maximum likelihood (ML) distance tree based on the combined sequences for these proteins is shown in Fig. 1 and a neighbour-joining (NJ) tree for the same dataset is provided as additional file 2. A number of distinct clades of cyanobacteria were observed in both these trees. Very similar branching patterns and groupings of cyanobacterial species into clades have been observed in earlier studies based on other large and independent datasets of protein sequences [4,11,12], giving confidence in the observed results. One of the observed clades, referred to here as Clade A, consists of Gloeobacter violaceus and Synechococcus sps. (JA-3-3Ab and JA2-3-B'a). The ML and NJ trees differ from each other in the branching position of this clade. In the ML tree, the Clade A species/strains formed the deepest branching lineage within cyanobacteria. In contrast, in the NJ tree, the cyanobacteria were divided into two main clades at the deepest level and Clade A formed the outermost branch of one of these clades, separated from all other species/strains by a long branch (additional file 2). However, the branching of Clade A in this position in the NJ tree is not reliable, as in our recent studies based on the same dataset of protein sequences but with smaller numbers of cyanobacteria, the Clade A species/strains branched in the same position as seen here in the ML tree [23]. The deep branching of Clade A species/strains has also been observed in a number of earlier studies based on different datasets of protein sequences [4,6,11,12,23,36-39]. Further strong and independent evidence that the Clade A species/strains constitute the earliest branching lineage within sequenced cyanobacteria is provided by our recent identification of several conserved indels in broadly distributed proteins (viz. an 18 aa insert in DNA polymerase I, a 4-5 aa insert in the tryptophan synthase beta chain, a 4 aa insert in tryptophanyl-tRNA synthetase and a 2 aa insert in DNA polymerase III) [23]. The indicated conserved inserts in these proteins are commonly shared by all other sequenced cyanobacteria, but they are lacking in Clade A as well as in all other phyla of bacteria [23]. The species distributions of these conserved indels strongly indicate that these synapomorphies were introduced in a common ancestor of the other cyanobacteria after the branching of Clade A. In a recent proposal for the classification of cyanobacteria, the thylakoid-lacking Gloeobacterales are placed into a separate subclass (Gloeobacterophycidae) [15]. It is unclear whether the Synechococcus sps. (JA-3-3Ab and JA2-3-B'a), which group with G. violaceus, also lack thylakoids.
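
The reasoning about conserved inserts in the passage above comes down to a presence/absence check across three groups of taxa. The sketch below is a simplified illustration of that logic, not the analysis performed in the paper. The function name, the taxon labels and the presence/absence table are hypothetical, and the table is filled in only to mirror the distribution described in the text (insert present in all other cyanobacteria, absent in Clade A and in the outgroups).

```python
from typing import Dict, Iterable


def insert_arose_after_clade_a(has_insert: Dict[str, bool],
                               clade_a: Iterable[str],
                               other_cyanobacteria: Iterable[str],
                               outgroups: Iterable[str]) -> bool:
    """True if the insert is found in every other cyanobacterium but in no
    Clade A member and no outgroup, the pattern expected when the insert was
    introduced in a common ancestor after Clade A had branched off."""
    present_in_others = all(has_insert[t] for t in other_cyanobacteria)
    absent_in_clade_a = not any(has_insert[t] for t in clade_a)
    absent_in_outgroups = not any(has_insert[t] for t in outgroups)
    return present_in_others and absent_in_clade_a and absent_in_outgroups


# Hypothetical groupings and illustrative presence/absence values.
clade_a = ["G_violaceus", "Syn_JA-3-3Ab"]
other_cyanobacteria = ["Synechocystis_PCC6803", "Nostoc_PCC7120", "Syn_WH8102"]
outgroups = ["B_subtilis", "S_aureus"]

has_insert = {
    "G_violaceus": False, "Syn_JA-3-3Ab": False,
    "Synechocystis_PCC6803": True, "Nostoc_PCC7120": True, "Syn_WH8102": True,
    "B_subtilis": False, "S_aureus": False,
}

print(insert_arose_after_clade_a(has_insert, clade_a,
                                 other_cyanobacteria, outgroups))
```

The outgroup check is what separates the two readings of the data: if the outgroups also carried the insert, the same distribution would instead point to a loss in Clade A rather than a gain in the remaining cyanobacteria.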

