Using supernetworks to distinguish hybridization from lineage-sorting.

Holland BR, Benthin S, Lockhart PJ, Moulton V, Huber KT - BMC Evol. Biol. (2008)

Bottom Line: Detecting hybridization by looking for gene tree incongruence may be confounded by factors such as poor taxon-sampling and/or incomplete lineage-sorting. For few hybridization events, a large number of independent loci, and well-sampled taxa across these loci, we found that it was possible to distinguish incomplete lineage-sorting from hybridization using the filtered Z-closure and Q-imputation supernetwork methods. Moreover, we found that the choice of supernetwork method was less important than the choice of filtering, and that count-based filtering was the most effective filtering technique.


Affiliation: Allan Wilson Centre, Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand. b.r.holland@massey.ac.nz

ABSTRACT

Background: A simple and widely used approach for detecting hybridization in phylogenies is to reconstruct gene trees from independent gene loci, and to look for gene tree incongruence. However, this approach may be confounded by factors such as poor taxon-sampling and/or incomplete lineage-sorting.

Results: Using coalescent simulations, we investigated the potential of supernetwork methods to differentiate between gene tree incongruence arising from taxon sampling and incomplete lineage-sorting as opposed to hybridization. For few hybridization events, a large number of independent loci, and well-sampled taxa across these loci, we found that it was possible to distinguish incomplete lineage-sorting from hybridization using the filtered Z-closure and Q-imputation supernetwork methods. Moreover, we found that the choice of supernetwork method was less important than the choice of filtering, and that count-based filtering was the most effective filtering technique.

Conclusion: Filtered supernetworks provide a tool for detecting and identifying hybridization events in phylogenies, a tool that should become increasingly useful in light of current genome sequencing initiatives and the ease with which large numbers of independent gene loci can be determined using new generation sequencing technologies.


Figure 2: (A) A hybridization network (number 7 from Table 1) with two hybridization nodes. (B) The principal trees of the hybridization network – these are found by choosing a single parent at each hybridization node and then suppressing the resulting internal nodes of degree 2. (C) The splits associated with the hybridization network are those displayed by the principal trees in (B). (D) A split network displaying the splits in (C).

The starting point for each simulation is a hybridization network such as the one shown in Figure 2a. Formally, such networks are rooted, leaf-labelled, directed acyclic graphs in which each node is one of four types: nodes with in-degree 2 and out-degree 1 correspond to hybridizations; nodes with in-degree 1 and out-degree 2 correspond to speciation events; nodes with in-degree 1 and out-degree 0 correspond to the extant species; and one special node of in-degree 0 and out-degree 2 is the root. Such a network can be thought of as a collection of rooted principal trees. These trees are obtained by starting from the tips of the hybridization network (the nodes with in-degree 1 and out-degree 0) and choosing one of the two possible paths at each hybridization node encountered on the way towards the root. The set of principal trees consists of all trees that can be obtained in this manner (Figure 2b). This leads to a natural definition of the collection of splits associated with a hybridization network: the union of the splits associated with each of the principal trees of the network (Figure 2c and 2d). We will refer to such splits as the true splits of the hybridization network. The purpose of the simulations is to assess whether filtered supernetworks can identify the splits present in the principal trees of the hybridization network. To be successful, these splits need to be distinguishable from those arising from incomplete lineage-sorting under the coalescent process.
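The construction of principal trees and true splits described above can be illustrated with a minimal Python sketch. It is not the authors' simulation code. It assumes the network is stored as a dictionary mapping each non-root node to a list of its parents (two parents marking a hybridization node), and the function name `principal_tree_splits` and the four-taxon toy network are hypothetical; the toy network is not the one in Figure 2. Splits are treated as bipartitions of the leaf set, and trivial splits are omitted.

```python
from itertools import product


def principal_tree_splits(parents, leaves):
    """Return one set of non-trivial splits per principal tree.

    parents: dict mapping each non-root node to a list of its parents;
             a list of length 2 marks a hybridization node (assumed encoding).
    leaves:  iterable of leaf labels (the extant species).
    """
    leaves = frozenset(leaves)
    hybrids = sorted(n for n, ps in parents.items() if len(ps) == 2)
    per_tree = []

    # One principal tree for every way of picking a single parent
    # at each hybridization node.
    for choice in product(*(parents[h] for h in hybrids)):
        chosen = dict(zip(hybrids, choice))
        children = {}
        for child, ps in parents.items():
            parent = chosen.get(child, ps[0])  # non-hybrids have one parent
            children.setdefault(parent, []).append(child)

        def below(node):
            """Leaf labels reachable from node in the chosen tree."""
            if node in leaves:
                return frozenset([node])
            out = frozenset()
            for c in children.get(node, []):
                out |= below(c)
            return out

        # Nodes left with a single child have the same leaf set as that
        # child, so suppressing them does not change the set of splits.
        splits = set()
        for node in children:
            side = below(node)
            if 2 <= len(side) <= len(leaves) - 2:
                splits.add(frozenset({side, leaves - side}))
        per_tree.append(splits)
    return per_tree


# Hypothetical toy network: b descends from hybridization node h,
# whose two parents u and v lie on the lineages leading to a and c.
toy_parents = {
    "x": ["r"], "d": ["r"],
    "u": ["x"], "v": ["x"],
    "a": ["u"], "c": ["v"],
    "h": ["u", "v"],
    "b": ["h"],
}
toy_trees = principal_tree_splits(toy_parents, "abcd")
# True splits of the network: union over all principal trees.
toy_true_splits = set().union(*toy_trees)
```

With one hybridization node the toy network has two principal trees, and `toy_true_splits` collects the splits displayed by either of them. The sketch enumerates every parent choice, so the number of principal trees it visits doubles with each additional hybridization node.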

