The phylogeny of the mammalian heme peroxidases and the evolution of their diverse functions.

Loughran NB, O'Connor B, O'Fágáin C, O'Connell MJ - BMC Evol. Biol. (2008)

Bottom Line: Despite much effort to elucidate the function of the 4 major groups of this multigene family, we still do not have a clear understanding of their relationships to each other. We demonstrate, using a root mean squared deviation statistic, how the removal of the fastest evolving sites aids in the minimisation of the effect of long branch attraction and the generation of a highly supported phylogeny. Our study has (i) fully resolved the phylogeny of the MHPs and the subsequent pattern of gene duplication, and (ii) detected amino acids under positive selection that have most likely contributed to the observed functional shifts in each type of MHP.


Affiliation: Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Glasnevin, Dublin 9, Ireland. noeleen.loughran@gmail.com

ABSTRACT

Background: The mammalian heme peroxidases (MHPs) are a medically important group of enzymes. Included in this group are myeloperoxidase, eosinophil peroxidase, lactoperoxidase, and thyroid peroxidase. These enzymes are associated with such diverse diseases as asthma, Alzheimer's disease and inflammatory vascular disease. Despite much effort to elucidate a clearer understanding of the function of the 4 major groups of this multigene family, we still do not have a clear understanding of their relationships to each other.

Results: Sufficient signal exists for the resolution of the evolutionary relationships of this family of enzymes. We demonstrate, using a root mean squared deviation statistic, how the removal of the fastest evolving sites aids in the minimisation of the effect of long branch attraction and the generation of a highly supported phylogeny. Based on this phylogeny we have pinpointed the amino acid positions that have most likely contributed to the diverse functions of these enzymes. Many of these residues are in close proximity to sites implicated in protein misfolding, loss of function or disease.

Conclusion: Our analysis of all available genomic sequence data for the MHPs from all available completed mammalian genomes involved sophisticated methods of phylogeny reconstruction and data treatment. Our study has (i) fully resolved the phylogeny of the MHPs and the subsequent pattern of gene duplication, and (ii) detected amino acids under positive selection that have most likely contributed to the observed functional shifts in each type of MHP.


Figure 3: Fully resolved mammalian heme peroxidase phylogeny with duplication and loss events depicted. (a) Resolved ML tree for mammalian heme peroxidases. The bootstrap support values from 1000 replicates are shown on all nodes. The TPO primate clade appears here as a polytomy because the branch lengths are extremely short; it is in fact resolved, with a low bootstrap support of 56%. The star symbol denotes those branches that were treated as foreground in the selection analysis. (b) The analysis of the resolved phylogeny using the gene tree-species tree reconciliation method implemented in GeneTree. The large filled circles represent gene duplication events, and the red branches indicate gene losses.

Mentions: We adapted the site stripping method, using the slow-evolving positions for each species in the MSA to reconstruct the phylogeny while still retaining adequate amounts of signal [29]. This approach is similar to the 'Slow-Fast Method' [37]; it is an approximate method that reduces noise by removing the sites most likely to contain homoplasy and focusing on the more evolutionarily informative positions for phylogeny reconstruction. Each site within the MSA was classified according to its rate of evolution (estimated using ML on a fixed phylogenetic tree). To determine how many categories to remove, we progressively stripped each category across the entire MSA, from the most rapidly evolving sites to the most slowly evolving. We also combined removal of the fastest and slowest sites from the dataset in our analysis; this was initially performed with the PXDN data included (see Figure 1b). Each time a category was removed, the phylogenetic tree was estimated from the remaining MSA using ML. The ideal tree was created by pruning the mammalian supertree published by Murphy et al. [33] (with the inclusion of chicken) and is depicted in Figure 1a. The difference between each site-stripped phylogeny and the ideal phylogeny was calculated as a nodal distance RMSD [38] (see Figure 1b). Figure 1b shows that removing rapidly evolving sites gradually removes noise from the data, and that the remaining signal moves towards the canonical species phylogeny [33]. For the dataset consisting of MHP and PXDN sequences, the RMSD value reaches a minimum at the removal of 4 site categories (8, 7, 6 and 5), leaving an MSA of 850 sites (including gaps/missing data); see Figure 2b for the resultant topology. Beyond this point the RMSD values rise (see Figure 1b).
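The progressive stripping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names `strip_sites` and `progressive_stripping`, the toy alignment, and the assumption that each column carries an integer rate category from 1 (slowest) to 8 (fastest) are all hypothetical. The rate categories themselves would come from an ML rate estimate on a fixed tree, which is not reproduced here.

```python
from typing import Dict, Iterator, List, Set, Tuple


def strip_sites(msa: Dict[str, str], site_categories: List[int],
                remove: Set[int]) -> Dict[str, str]:
    """Return a copy of the alignment without columns whose rate category is in `remove`."""
    lengths = {len(seq) for seq in msa.values()}
    if lengths != {len(site_categories)}:
        raise ValueError("every sequence must have one rate category per column")
    keep = [i for i, cat in enumerate(site_categories) if cat not in remove]
    return {name: "".join(seq[i] for i in keep) for name, seq in msa.items()}


def progressive_stripping(msa: Dict[str, str], site_categories: List[int],
                          n_categories: int = 8
                          ) -> Iterator[Tuple[List[int], Dict[str, str]]]:
    """Yield (removed categories, reduced alignment), removing the fastest category first."""
    removed: Set[int] = set()
    # Stop before removing category 1 so that some columns always remain.
    for cat in range(n_categories, 1, -1):
        removed.add(cat)
        yield sorted(removed, reverse=True), strip_sites(msa, site_categories, removed)


# Toy alignment (hypothetical data): three taxa, eight columns.
toy_msa = {"human": "MKTAYIAK", "mouse": "MKSAYLAK", "chicken": "MRTGYIVK"}
toy_categories = [1, 3, 8, 7, 1, 6, 5, 2]

reduced = strip_sites(toy_msa, toy_categories, {8, 7, 6})
print(reduced["human"])  # columns in categories 8, 7 and 6 are gone
```

In the workflow described in the text, each reduced alignment yielded by `progressive_stripping` would be passed to an ML tree search, and the resulting topology compared against the ideal tree.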
It is important to note that the slowest evolving positions can be misleading, particularly after excessive removal of sites, because the number of characters available for reconstruction decreases with every cycle; caution must therefore be taken in applying this method. This analysis was also performed on the dataset containing only MHP sequences, where the RMSD value reaches a minimum at the removal of 3 site categories (8, 7 and 6), leaving an MSA of 613 sites (including gaps/missing data); see Figure 3a for the resultant topology. The reduced MSA for the MHP data is given in Additional file 1 and the corresponding TOPD results are given in Additional file 2. The nodal distance (RMSD) calculation is based entirely on the branching pattern and hence does not account for evolutionary rate variation across the phylogeny. Using this site-stripped MSA, the phylogeny was estimated using both MrBayes and MultiPhyl, which produced identical phylogenies*. (*The one exception was that the Bayesian reconstruction did not fully resolve primate monophyly within the TPO clade, but instead supported a human-chimp-macaque polytomy.)
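To make the topology-only nature of the nodal distance concrete, the following Python sketch compares two trees by the number of internal nodes separating each pair of taxa and then takes the root mean squared deviation over all pairs. It is an illustration under stated assumptions, not the TOPD implementation: trees are written as rooted nested tuples of taxon names, the helper names (`leaf_positions`, `nodal_distances`, `nodal_rmsd`) are hypothetical, and the exact counting and normalisation conventions used in [38] may differ.

```python
from itertools import combinations
from math import sqrt


def leaf_positions(tree, prefix=()):
    """Map each leaf name to its path of child indices from the root."""
    if isinstance(tree, str):
        return {tree: prefix}
    positions = {}
    for i, child in enumerate(tree):
        positions.update(leaf_positions(child, prefix + (i,)))
    return positions


def nodal_distances(tree):
    """Count the internal nodes on the path between every pair of leaves."""
    pos = leaf_positions(tree)
    dist = {}
    for a, b in combinations(sorted(pos), 2):
        pa, pb = pos[a], pos[b]
        shared = 0
        for x, y in zip(pa, pb):
            if x != y:
                break
            shared += 1
        # Nodes below the last common ancestor on each side, plus that ancestor.
        dist[(a, b)] = len(pa) + len(pb) - 2 * shared - 1
    return dist


def nodal_rmsd(tree1, tree2):
    """RMSD between the pairwise nodal distances of two trees on the same taxa."""
    d1, d2 = nodal_distances(tree1), nodal_distances(tree2)
    if d1.keys() != d2.keys() or not d1:
        raise ValueError("trees must share the same set of at least two taxa")
    return sqrt(sum((d1[k] - d2[k]) ** 2 for k in d1) / len(d1))


# Toy example (hypothetical taxa): an "ideal" tree and one rearranged topology.
ideal = (("human", "chimp"), "mouse")
estimated = (("human", "mouse"), "chimp")
print(nodal_rmsd(ideal, ideal))      # identical topologies give 0.0
print(nodal_rmsd(ideal, estimated))  # a rearrangement gives a positive value
```

Because no branch lengths enter the calculation, two trees with the same branching pattern score 0 regardless of how different their evolutionary rates are, which is the limitation noted in the text.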

