New insights about enzyme evolution from large scale studies of sequence and structure relationships.

Brown SD, Babbitt PC - J. Biol. Chem. (2014)

Bottom Line: Here, we describe evolution of functionally diverse enzyme superfamilies, each representing a large set of sequences that evolved from a common ancestor and that retain conserved features of their structures and active sites. Using several examples, we describe the different structural strategies nature has used to evolve new reaction and substrate specificities in each unique superfamily. The results provide insight about enzyme evolution that is not easily obtained from studies of one or only a few enzymes.


Affiliation: From the Departments of Bioengineering and Therapeutic Sciences and.

Figure 1: Representative sequence similarity network for the isoprenoid synthase I superfamily that is available from the Structure-Function Linkage Database (SFLD) (62). Each node (circle) represents a group of 1–732 sequences, where each sequence in a node is at least 50% identical to a seed sequence that defines that node (computed using the CD-HIT program (63)). The 2,499 nodes in this network represent over 16,000 sequences. Each edge (line) between two nodes indicates that the sequences represented by the connected nodes have BLAST similarity with an average −log(E-value) of 30 or more. At this −log(E-value) cutoff, alignments have an average length of 273 amino acids and an average percent identity of 31%. Nodes are laid out in Cytoscape using the yFiles organic layout. A node is colored red if at least one constituent sequence represented by that node has a functional annotation in the Swiss-Prot database. A node is colored gray if no sequence in that representative node has a functional annotation in Swiss-Prot. Several clusters of nodes where no corresponding sequence has a functional annotation in Swiss-Prot are indicated with black ovals.

Mentions: Reflecting this relatively new approach, sequence similarity networks are used in some figures in this review (see Figs. 1 and 4) to enable exploration of structure-function relationships in enzyme superfamilies from a large scale perspective. In these networks, nodes represent one or more proteins, and edges between them represent a measure of sequence or structural similarity. Although not a substitute for phylogenetic trees, similarity networks provide several advantages over trees and multiple alignments for developing new hypotheses about the evolution of functional features in superfamilies. They are quick to construct, do not require an accurate multiple sequence alignment, and can summarize in one network relationships among thousands of sequences. The networks can also be visualized and interactively manipulated and explored using such software packages as Cytoscape (14). Although they are not based on an explicit evolutionary model, initial validation studies show that similarity networks correlate well with results from phylogenetic trees (15).
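
To make the construction described in the Figure 1 caption concrete, the following is a minimal Python sketch of how a representative similarity network could be assembled. It is not the SFLD pipeline. It assumes that sequences have already been grouped into representative nodes (for example by CD-HIT) and that pairwise BLAST E-values are already available. The function names, the input layout, the use of a base-10 logarithm, and the choice to average only over sequence pairs that have a reported hit are all illustrative assumptions.

```python
import math
from collections import defaultdict


def neg_log_evalue(evalue, floor=1e-300):
    """Return -log10(E-value); the floor avoids log(0) for E-values reported as 0.

    Assumption: base-10 logarithm; the caption does not state the base.
    """
    return -math.log10(max(evalue, floor))


def build_representative_network(clusters, pairwise_evalues, annotated, cutoff=30.0):
    """Sketch of a representative sequence similarity network.

    clusters         -- dict: node id -> list of member sequence ids
                        (hypothetical input, e.g. parsed from a clustering run)
    pairwise_evalues -- dict: (seq_a, seq_b) -> BLAST E-value for that pair
    annotated        -- set of sequence ids that carry a functional annotation
    cutoff           -- minimum mean -log10(E-value) for drawing an edge

    Returns (edges, colors): edges maps a sorted node pair to its mean score,
    colors maps each node id to "red" or "gray".
    """
    # Look up which representative node each sequence belongs to.
    seq_to_node = {
        seq: node for node, members in clusters.items() for seq in members
    }

    # Collect scores for every sequence pair that spans two different nodes.
    scores = defaultdict(list)
    for (seq_a, seq_b), evalue in pairwise_evalues.items():
        node_a, node_b = seq_to_node[seq_a], seq_to_node[seq_b]
        if node_a == node_b:
            continue  # pairs inside one node do not contribute to any edge
        key = tuple(sorted((node_a, node_b)))
        scores[key].append(neg_log_evalue(evalue))

    # Keep an edge only if the mean score between the two nodes meets the cutoff.
    # Assumption: the mean is taken over pairs with a reported hit only.
    edges = {}
    for key, values in scores.items():
        mean_score = sum(values) / len(values)
        if mean_score >= cutoff:
            edges[key] = mean_score

    # Red if any member sequence is annotated, gray otherwise.
    colors = {
        node: "red" if any(seq in annotated for seq in members) else "gray"
        for node, members in clusters.items()
    }
    return edges, colors
```

Calling `build_representative_network` with a cluster mapping, a table of pairwise E-values, and a set of annotated sequence ids returns the edge list and node colors, which could then be exported to a tool such as Cytoscape for layout. Raising `cutoff` gives a sparser network that separates more closely related groups; lowering it merges them.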
