Assessing the ability of sequence-based methods to provide functional insight within membrane integral proteins: a case study analyzing the neurotransmitter/Na+ symporter family.

Livesay DR, Kidd PD, Eskandari S, Roshan U - BMC Bioinformatics (2007)

Bottom Line: Interestingly, the various prediction schemes provide results that are predominantly orthogonal to each other. However, when the methods do provide overlapping results, specificity is shown to increase dramatically (e.g., sites predicted by any three methods have both accuracy and coverage greater than 50%). The results presented herein clearly establish the viability of sequence-based bioinformatic strategies to provide functional insight within the NSS family.


Affiliation: Department of Computer Science and Bioinformatics Research Center, University of North Carolina at Charlotte, Charlotte, NC 28262, USA. drlivesa@uncc.edu

ABSTRACT

Background: Efforts to predict functional sites from globular proteins are increasingly common; however, the most successful of these methods generally require structural insight. Unfortunately, despite several recent technological advances, structural coverage of membrane integral proteins continues to be sparse. Consequently, sequence-based methods represent an important alternative to illuminate functional roles. In this report, we critically examine the ability of several computational methods to provide functional insight within two specific areas. First, can phylogenomic methods accurately describe the functional diversity across a membrane integral protein family? And second, can sequence-based strategies accurately predict key functional sites? Due to the presence of a recently solved structure and a vast amount of experimental mutagenesis data, the neurotransmitter/Na+ symporter (NSS) family is an ideal model system to assess the quality of our predictions.

Results: The raw NSS sequence dataset contains 181 sequences, which have been aligned by various methods. The resultant phylogenetic trees always contain six major subfamilies that are consistent with the functional diversity across the family. Moreover, in well-represented subfamilies, phylogenetic clustering recapitulates several nuanced functional distinctions. Functional sites are predicted using six different methods (phylogenetic motifs, two methods that identify subfamily-specific positions, and three different conservation scores). A canonical set of 34 functional sites identified by Yamashita et al. within the recently solved LeuTAa structure is used to assess the quality of the predictions; most of these sites are predicted by the bioinformatic methods. Remarkably, the importance of these sites is largely confirmed by experimental mutagenesis. Furthermore, the collective set of functional site predictions qualitatively clusters along the proposed transport pathway, further demonstrating their utility. Interestingly, the various prediction schemes provide results that are predominantly orthogonal to each other. However, when the methods do provide overlapping results, specificity is shown to increase dramatically (e.g., sites predicted by any three methods have both accuracy and coverage greater than 50%).

Conclusion: The results presented herein clearly establish the viability of sequence-based bioinformatic strategies to provide functional insight within the NSS family. As such, we expect similar bioinformatic investigations will streamline functional investigations within membrane integral families in the absence of structure.
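The overlap analysis described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy method labels, and the alignment positions are all hypothetical, and the definitions used here (accuracy as the fraction of predicted sites found in the reference set, coverage as the fraction of reference sites that are predicted) are assumptions that may differ from the definitions in the paper.

```python
from collections import Counter


def consensus_sites(predictions, min_methods):
    """Return positions predicted by at least `min_methods` methods.

    `predictions` maps a method name to the set of alignment positions
    that method flags as functional (hypothetical input format).
    """
    counts = Counter(pos for sites in predictions.values() for pos in set(sites))
    return {pos for pos, n in counts.items() if n >= min_methods}


def accuracy_and_coverage(predicted, reference):
    """Assumed definitions: accuracy = hits / predicted, coverage = hits / reference."""
    hits = predicted & reference
    accuracy = len(hits) / len(predicted) if predicted else 0.0
    coverage = len(hits) / len(reference) if reference else 0.0
    return accuracy, coverage


# Toy, made-up data: three methods and a four-site reference set.
predictions = {
    "method_a": {1, 2, 3, 4},
    "method_b": {2, 3, 5},
    "method_c": {3, 4, 6},
}
reference = {2, 3, 4, 7}

for k in (1, 2, 3):
    sites = consensus_sites(predictions, k)
    acc, cov = accuracy_and_coverage(sites, reference)
    print(f"min_methods={k}: sites={sorted(sites)} accuracy={acc:.2f} coverage={cov:.2f}")
```

In this toy example, raising the agreement threshold from one method to two removes the false positives without losing any true sites, which mirrors the qualitative trend reported above; the numbers themselves carry no meaning for the NSS dataset.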



Figure 2: Chemical diversity of the osmolytes and biogenic amines. (a) All of the osmolytes are of similar size and are zwitterionic. Moreover, the separation of charge in each is nearly equal. (b) The biogenic amines (which are common neurotransmitters within the brain) are all aromatic amines. As one would expect based on the chemical diversity within the biogenic amines, there is an evolutionary split between the serotonin and catecholamine transporters.

Mentions: While there are significant chemical differences among the substrates across the entire NSS family, differences within subfamilies are greatly diminished. For example, all four known substrates within the osmolyte subfamily are of similar size and are zwitterionic (Figure 2a). Moreover, the spatial separation of charge within each is also fairly conserved. In order to extend the annotations beyond the known (experimental) descriptions, we employ a phylogenomics approach [30,31], where appropriate, to assign functional specificity to sequences without annotation. ORFans within otherwise obviously annotated out-groups are associated with the consensus annotation. For example, in the osmolyte subfamily, seven sequences are annotated as ORFans (arrows in Figure 3b). Five of those sequences occur within well-established out-groups. Consequently, functional annotations are assigned here based on the other sequences within their respective out-group. The remaining two ORFans occur together, but do not fall into any obvious substrate distinction. As such, these two sequences remain unannotated (question marks in Figure 3a). Application of this approach to the entire NSS family increases the number of functional annotations by 12, which is significant, but not necessarily remarkable. However, the annotation improvement becomes significantly more impressive when investigating only the osmolyte and biogenic amine transporter subfamilies, both of which are better characterized experimentally [20,23,26,32-35]. In these two examples alone, ten of twelve ORFans can be functionally classified using phylogenetics. (Note: a complete list of all experimentally characterized sequences, ORFans, and the twelve newly annotated sequences is provided in Additional file 2.)
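The consensus-annotation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is reduced to a hypothetical mapping from group name to member sequences, the sequence and substrate labels are invented, and "consensus" is taken to mean that all annotated members of a group share a single substrate. The paper does not specify its procedure at this level of detail.

```python
def annotate_orfans(groups, known):
    """Assign a substrate to unannotated sequences (ORFans) by group consensus.

    groups: dict mapping a group name to a list of sequence ids
            (hypothetical stand-in for clades read off a phylogenetic tree).
    known:  dict mapping a sequence id to its experimentally known substrate.
    Returns a dict of newly inferred annotations only.
    """
    inferred = {}
    for members in groups.values():
        labels = {known[seq] for seq in members if seq in known}
        # Transfer an annotation only when the group is unambiguous.
        if len(labels) == 1:
            label = next(iter(labels))
            for seq in members:
                if seq not in known:
                    inferred[seq] = label
    return inferred


# Toy, made-up data.
known = {"seq_a": "substrate_1", "seq_b": "substrate_1",
         "seq_c": "substrate_2", "seq_d": "substrate_3"}
groups = {
    "group_1": ["seq_a", "seq_b", "orfan_1"],  # unambiguous: annotation transferred
    "group_2": ["orfan_2", "orfan_3"],         # no annotated neighbours: left as-is
    "group_3": ["seq_c", "seq_d", "orfan_4"],  # conflicting substrates: left as-is
}

print(annotate_orfans(groups, known))
```

The second toy group corresponds to the situation in the text where two ORFans occur together without any obvious substrate distinction and therefore remain unannotated.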

