Assigning and visualizing germline genes in antibody repertoires.

Frost SD, Murrell B, Hossain AS, Silverman GJ, Pond SL - Philos. Trans. R. Soc. Lond., B, Biol. Sci. (2015)

Bottom Line: We also develop an interactive web application for viewing the results, allowing the user to explore the frequency distribution of sequence assignments and CDR3 region length statistics, which is useful for summarizing repertoires, as well as a detailed viewer of rearrangements and region alignments for individual query sequences. We demonstrate the accuracy and utility of our method compared with sequence similarity-based approaches and other non-phylogenetic model-based approaches, using both simulated data and a set of evaluation datasets of human immunoglobulin heavy chain sequences. IgSCUEAL demonstrates the highest accuracy of V and J assignment amongst existing approaches, even when the reassorted sequence is highly mutated, and can successfully cluster sequences on the basis of shared V/J germline alleles.


Affiliation: Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, Cambridgeshire CB3 0ES, UK.

ABSTRACT
Identifying the germline genes involved in immunoglobulin rearrangements is an essential first step in the analysis of antibody repertoires. Based on our prior work in analysing diverse recombinant viruses, we present IgSCUEAL (Immunoglobulin Subtype Classification Using Evolutionary ALgorithms), a phylogenetic approach to assign V and J regions of immunoglobulin sequences to their corresponding germline alleles, with D regions assigned using a simple pairwise alignment algorithm. We also develop an interactive web application for viewing the results, allowing the user to explore the frequency distribution of sequence assignments and CDR3 region length statistics, which is useful for summarizing repertoires, as well as a detailed viewer of rearrangements and region alignments for individual query sequences. We demonstrate the accuracy and utility of our method compared with sequence similarity-based approaches and other non-phylogenetic model-based approaches, using both simulated data and a set of evaluation datasets of human immunoglobulin heavy chain sequences. IgSCUEAL demonstrates the highest accuracy of V and J assignment amongst existing approaches, even when the reassorted sequence is highly mutated, and can successfully cluster sequences on the basis of shared V/J germline alleles.


Figure 3. Phylogenetic trees of clonally related IGH sequences, reconstructed by maximum likelihood, and rooted on the centre of the tree (see Material and methods §2g(iii)). These illustrate the high level of genetic diversity in these datasets (13.4%, 11.7% and 12.0% for (a), (b) and (c), respectively), as well as the variable divergence from the root sequence, despite all sequences in a dataset being sampled at the same time. These trees do not illustrate the level of divergence of these sequences from their germline genes (figure 4).

Mentions: We analysed clonal datasets to assess the consistency of assignment for a given clone. Ideally, the different sequences within a clone should share the same V(D)J assignment. As IgSCUEAL generates Akaike weights for a given assignment, assignments for a clone can be combined over the sequences. In addition, we clustered sequences together on the basis of shared V and J alleles in their credible set. Although the true rearrangement is not known with certainty, we also generated a predicted rearrangement for an ancestral reconstruction of the sequences (rooted at the centre of the tree). Figure 3 illustrates that these clonal datasets exhibit substantial diversity (ca 12% mean pairwise distance, as calculated from the branch lengths of the phylogeny). Hence, using an ancestral reconstruction may help to reduce noise by removing at least some of the somatic hypermutations.
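
The clone-level steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not IgSCUEAL's implementation: the Akaike weight formula is the standard one, but the rule for combining weights over a clone (a simple average), the 95% credible level, the single-linkage clustering rule, and all function, sequence and allele names are hypothetical choices made here for illustration.

```python
import math
from collections import defaultdict


def akaike_weights(aic):
    """Convert {allele: AIC score} into {allele: Akaike weight}."""
    best = min(aic.values())
    # Relative likelihood of each candidate given its AIC difference.
    rel = {a: math.exp(-0.5 * (v - best)) for a, v in aic.items()}
    total = sum(rel.values())
    return {a: r / total for a, r in rel.items()}


def combine_over_clone(per_sequence_weights):
    """Average the weights per allele across a clone (assumed rule)."""
    n = len(per_sequence_weights)
    combined = defaultdict(float)
    for weights in per_sequence_weights:
        for allele, w in weights.items():
            combined[allele] += w / n
    return dict(combined)


def credible_set(weights, level=0.95):
    """Smallest set of alleles whose cumulative weight reaches `level`.

    The 0.95 default is an assumption, not a value from the paper.
    """
    chosen, cumulative = set(), 0.0
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    for allele, w in ranked:
        chosen.add(allele)
        cumulative += w
        if cumulative >= level:
            break
    return chosen


def cluster_by_shared_alleles(v_sets, j_sets):
    """Group sequences sharing at least one V and one J allele.

    Uses union-find, so clusters are single-linkage (assumed rule).
    """
    ids = list(v_sets)
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if v_sets[a] & v_sets[b] and j_sets[a] & j_sets[b]:
                parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for i in ids:
        clusters[find(i)].add(i)
    return sorted(sorted(c) for c in clusters.values())


if __name__ == "__main__":
    # Toy AIC scores for two sequences of one clone (made-up values).
    seq1 = akaike_weights({"V-a": 100.0, "V-b": 102.0})
    seq2 = akaike_weights({"V-a": 100.0, "V-b": 110.0})
    clone = combine_over_clone([seq1, seq2])
    print(clone, credible_set(clone))
```

Averaging keeps the combined weights summing to one, which lets the same credible-set function be applied to a single sequence or to a whole clone. Other combination rules, such as multiplying likelihoods across sequences, would be equally plausible readings of the text.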

