MDAT - Aligning multiple domain arrangements.

Kemena C, Bitard-Feildel T, Bornberg-Bauer E - BMC Bioinformatics (2015)

Bottom Line: We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs which perform pairwise alignments of domain arrangements. MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains.


Affiliation: Institute for Evolution and Biodiversity, University of Münster, Hüfferstr. 1, Münster, Germany. c.kemena@uni-muenster.de.

ABSTRACT

Background: Proteins are composed of domains, protein segments that fold independently from the rest of the protein and have a specific function. During evolution the arrangement of domains can change: domains are gained, lost or their order is rearranged. To facilitate the analysis of these changes we propose the use of multiple domain alignments.

Results: We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs which perform pairwise alignments of domain arrangements. MDAT uses a domain similarity matrix to score domain pairs and aligns the domain arrangements using a consistency-supported progressive alignment method.

Conclusion: MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains. MDAT is coded in C++, and the source code is freely available for download at http://www.bornberglab.org/pages/mdat .


Figure 1: Domain similarity score distribution. The scores were calculated by HHsearch for all pairwise alignments of Pfam-A domains (version 27). The values have been divided into two groups depending on whether the two domains belong to the same clan or not (different or no clan). Values of self-alignments are not included.

Mentions: The Pfam database provides some rough information on domain homology (the “clans”), based on a range of manually evaluated information [19,20]. Unfortunately, this information cannot be used in an alignment program because clan membership is binary only. Therefore, one cannot use clan information reliably to decide which domains to match if several possibilities to align a set of domains exist. Another drawback of using clans is that currently only about one third of the almost 15,000 domains in Pfam are associated with a clan. To avoid these drawbacks, we decided to calculate a domain similarity matrix. Figure 1 displays the distribution of match probabilities for each domain pair and how this value relates to being in the same clan or not.
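To illustrate why a graded similarity matrix helps where binary clan membership does not, here is a minimal Python sketch of a pairwise global alignment (Needleman-Wunsch) over two domain arrangements. It is not MDAT's implementation, which is written in C++ and uses a consistency-supported progressive method for multiple arrangements. The domain identifiers (`DOM_A`, `DOM_B`, `DOM_C`), all score values, the gap penalty, and the function names are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Scorer = Callable[[str, str], float]
Column = Tuple[Optional[str], Optional[str]]


def make_scorer(similarity: Dict[FrozenSet[str], float],
                identical: float = 100.0,
                unknown: float = -50.0) -> Scorer:
    """Build a scoring function from a sparse, symmetric similarity table."""
    def score(a: str, b: str) -> float:
        if a == b:
            return identical
        # Pairs missing from the table get a fixed (hypothetical) penalty.
        return similarity.get(frozenset((a, b)), unknown)
    return score


def align(x: List[str], y: List[str], score: Scorer,
          gap: float = -10.0) -> List[Column]:
    """Globally align two domain arrangements; None marks a gap."""
    n, m = len(x), len(y)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    # Trace back from the bottom-right cell, preferring a match on ties.
    columns: List[Column] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(x[i - 1], y[j - 1]):
            columns.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            columns.append((x[i - 1], None))
            i -= 1
        else:
            columns.append((None, y[j - 1]))
            j -= 1
    columns.reverse()
    return columns


# Binary, clan-like scores: DOM_A and DOM_B look equally related to DOM_C,
# so the choice of which one to match is decided only by tie-breaking.
binary = make_scorer({frozenset(("DOM_A", "DOM_C")): 50.0,
                      frozenset(("DOM_B", "DOM_C")): 50.0})
# Graded scores: DOM_A is clearly the better partner for DOM_C.
graded = make_scorer({frozenset(("DOM_A", "DOM_C")): 80.0,
                      frozenset(("DOM_B", "DOM_C")): 20.0})

binary_result = align(["DOM_A", "DOM_B"], ["DOM_C"], binary)
graded_result = align(["DOM_A", "DOM_B"], ["DOM_C"], graded)
```

With the binary scores both candidate matches are equally good, so the result depends on an arbitrary tie-break. With the graded scores the alignment pairs `DOM_A` with `DOM_C` and leaves `DOM_B` opposite a gap, which is the kind of distinction the excerpt says clan information alone cannot make.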

