Transferring functional annotations of membrane transporters on the basis of sequence similarity and sequence motifs.

Barghash A, Helms V - BMC Bioinformatics (2013)

Bottom Line: At similar identity thresholds, the nature of the transported substrates was more divergent (F-measure 40-75% at the same thresholds) than the TC family membership. Researchers who wish to apply these thresholds in their studies should multiply these thresholds by the size of the database they search against. Our findings should be useful to those who wish to transfer transporter functional annotations across species.


Affiliation: Center for Bioinformatics, Saarland University, Postfach 15 11 50, 66041 Saarbrücken, Germany. volkhard.helms@bioinformatik.uni-saarland.de.

ABSTRACT

Background: Membrane transporters catalyze the transport of small solute molecules across biological barriers such as lipid bilayer membranes. Experimental identification of the transported substrates is very tedious. Once a particular transport mechanism has been identified in one organism, it is thus highly desirable to transfer this information to related transporter sequences in different organisms based on bioinformatics evidence.

Results: We present a thorough benchmark of the sequence identity level at which membrane transporters from Escherichia coli, Saccharomyces cerevisiae, and Arabidopsis thaliana belong to the same families of the Transporter Classification (TC) system, and of the level at which these transporters mediate the transport of the same substrate. We found that two membrane transporter sequences from different organisms that align with a normalized BLAST expectation value (E-value) better than 1e-8 are highly likely to belong to the same TC family (F-measure around 90%). Enriched sequence motifs identified by MEME at thresholds below 1e-12 support accurate classification into TC families for about two thirds of the sequences (F-measure 80% and higher). For the comparison of transported substrates, we focused on the four largest substrate classes: amino acids, sugars, metal ions, and phosphate. At similar identity thresholds, the nature of the transported substrates was more divergent (F-measure 40-75% at the same thresholds) than the TC family membership.
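The results above are reported as F-measures. The abstract does not define the term, so the following is only a minimal sketch that assumes the standard F1 score, the harmonic mean of precision and recall. The function name and the counts in the usage note are hypothetical and are not taken from the study.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall, in [0, 1].

    Assumption: this is the standard F1 definition; the abstract does not
    state which F-measure variant the authors used.
    tp: correctly transferred annotations (true positives)
    fp: wrongly transferred annotations (false positives)
    fn: correct annotations that were missed (false negatives)
    """
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With hypothetical counts of 90 correct transfers, 10 wrong transfers, and 10 missed ones, precision and recall are both 0.9, so the F-measure is also 0.9, which is the order of magnitude quoted for TC family membership.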

Conclusions: We suggest a threshold of 1e-8 for BLAST and HMMER, at which at least three quarters of the sequences are classified according to the TC system with reasonably high accuracy. Researchers who wish to apply these thresholds in their studies should multiply these thresholds by the size of the database they search against. Our findings should be useful to those who wish to transfer transporter functional annotations across species.
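The database-size adjustment recommended in the conclusions can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how database size is measured, so the number of sequences is assumed here, and "better than" is read as a strictly smaller E-value. All function names are hypothetical.

```python
BASE_THRESHOLD = 1e-8  # normalized threshold suggested in the abstract


def scaled_threshold(db_size: int, base: float = BASE_THRESHOLD) -> float:
    """Multiply the normalized threshold by the database size.

    Assumption: db_size is the number of sequences in the searched
    database; the abstract does not specify the unit.
    """
    if db_size <= 0:
        raise ValueError("db_size must be a positive integer")
    return base * db_size


def passes_threshold(evalue: float, db_size: int) -> bool:
    """Return True if a raw E-value is better (smaller) than the scaled cutoff."""
    return evalue < scaled_threshold(db_size)
```

For a hypothetical database of 1,000 sequences, the cutoff becomes roughly 1e-5, so a hit with a raw E-value of 1e-6 would pass and one with 1e-4 would not.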



Figure 1: Distribution of transporters among the TC families. TC families common to E. coli (Ec), A. thaliana (At), and S. cerevisiae (Sc), with member counts. Most families belong to the Electrochemical Potential-Driven Transporters (class 2) and the Primary Active Transporters (class 3) TC classes. Shared TC families with more than 2 members in the searched organism were used for MEME motif analysis.

Mentions: In this work, we perform functional classification of transporter TC families and of transported substrate molecules using datasets from three model organisms. Our aim is to provide a simple guideline for biologists who wish to quickly determine whether available functional information about a transporter in species X may be transferred to another transporter sequence identified, e.g., by a BLAST search in species Y. Table 1 provides an overview of the main data sets used in this work. Figure 1 lists the TC families common to the three organisms and the distribution of transporters among them. Additional file 1: Tables S1-S3 list all transporters used in this study, their TC families, substrates, and their Pfam descriptions.

