Base composition, selection, and phylogenetic significance of indels in the recombination activating gene-1 in vertebrates.

Chiari Y, van der Meijden A, Madsen O, Vences M, Meyer A - Front. Zool. (2009)

Bottom Line: Among the nuclear markers currently used for phylogenetic purposes, Rag1 has especially enjoyed enormous popularity, since it successfully contributed to elucidating the relationships among and within a large variety of vertebrate lineages. This result is also paralleled by taxonomic differences in the GC content at the same codon position. However, in some vertebrate lineages the 5'-end of the gene is not yet widely used for phylogenetic studies.


Affiliation: Lehrstuhl für Zoologie und Evolutionsbiologie, Department of Biology, University of Konstanz, 78457 Konstanz, Germany.

ABSTRACT

Background: The recombination activating proteins, RAG1 and RAG2, play a crucial role in the immune response in vertebrates. Among the nuclear markers currently used for phylogenetic purposes, Rag1 has been especially popular, since it has successfully contributed to elucidating the relationships among and within a large variety of vertebrate lineages. We here report on a comparative investigation of the genetic variation, base composition, presence of indels, and selection in Rag1 in different vertebrate lineages (Actinopterygii, Amphibia, Aves, Chondrichthyes, Crocodylia, Lepidosauria, Mammalia, and Testudines) through the analysis of 582 sequences obtained from GenBank. We also analyze possible differences between distinct parts of the gene that encode different types of protein function.

Results: In the vertebrate lineages studied, Rag1 is over 3 kb long. We observed a high level of heterogeneity in base composition at the 3rd codon position in some of the studied vertebrate lineages and in some specific taxa. This result is also paralleled by taxonomic differences in the GC content at the same codon position. Moreover, positive selection occurs at some sites in Aves, Lepidosauria and Testudines. Indels, which are often used as phylogenetic characters, are more informative across vertebrates in the 5'-end than in the 3'-end of the gene. When the entire gene is considered, the use of indels as phylogenetic characters only recovers one major vertebrate clade, the Actinopterygii. However, in numerous cases insertions or deletions are specific to a monophyletic group.

Conclusions: Rag1 is a phylogenetic marker of undoubted quality. Our study points to the need to carry out a preliminary investigation of the base composition of this gene, and of the possible existence of sites under selection, within the groups studied to avoid misleading resolution. The gene shows highly heterogeneous base composition, which affects some taxa in particular, and contains sites under positive selection in the 5'-end in some vertebrate lineages. The first part of the gene (5'-end) is more variable than the second (3'-end), and less affected by a heterogeneous base composition. However, in some vertebrate lineages the 5'-end of the gene is not yet widely used for phylogenetic studies.
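The preliminary check of base composition recommended above can be illustrated with a minimal sketch that computes the GC content at the 3rd codon position (often called GC3) for an in-frame coding sequence. This is not the authors' pipeline: the function name `gc3` and the toy sequences are hypothetical, and the sketch assumes the sequence starts at codon position 1 and that gaps or ambiguous bases should simply be skipped.

```python
def gc3(cds: str) -> float:
    """Return the fraction of G or C among third codon positions.

    Assumes `cds` is in frame (first base = codon position 1).
    Gaps and ambiguous bases at third positions are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    third = [seq[i] for i in range(2, usable, 3) if seq[i] in "ACGT"]
    if not third:
        raise ValueError("no unambiguous third codon positions")
    return sum(base in "GC" for base in third) / len(third)


# Hypothetical toy sequences, not real Rag1 data.
toy = {"taxon_A": "ATGGCCAAA", "taxon_B": "ATGGCGAAG"}
for name, seq in toy.items():
    print(name, round(gc3(seq), 3))
```

Comparing such values across taxa in an alignment gives a quick first impression of whether some lineages deviate strongly in GC3 before a formal test of compositional homogeneity is run.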




Figure 1: Organization of RAG1 in Mus musculus (based on P15919 of http://www.uniprot.org/ [26]). The vertical line indicates the division into 5'- and 3'-ends used in this paper. The analyzed zinc finger and DNA binding domains are indicated. Numbers indicate the amino acid positions in the Mus musculus RAG1 sequence. "ZDD" indicates the zinc binding domain (which contains the zinc finger domain analyzed in this paper). "NBR" indicates the nonamer binding domain (which corresponds to the DNA binding domain analyzed in this paper). "Core region" indicates the region of the protein needed for cleavage activity (see [17] for additional information).

Mentions: The protein products of the two lymphocyte-specific recombination activating genes, Rag1 and Rag2, play an essential role in the host's active immune response to different pathogens (see [17] and references therein for the specific activity of each protein in the immunological response) by starting the process that generates specific receptors on B and T lymphocytes. The immune system is able to target and destroy many different foreign invaders as a result of the vast number of these specific receptors. The specificity of these receptors is made possible by a process known as V(D)J joining. This mechanism occurs in vertebrates and relies on the shuffling and recombination of different pre-existing gene fragments (V (variable), J (joining) and in some cases D (diversity)) [17]. The first step of this set of reactions is the recognition and cleavage of a well conserved Recombination Signal Sequence (RSS), consisting of conserved seven- and nine-nucleotide sequences (a heptamer and a nonamer) separated from each other by a spacer of 12 or 23 bp [18]. The Rag1 coding sequence contains a conserved protein structural domain that binds the RSS [19]. The active site for RSS binding and DNA cleavage is contained in part of the so-called "core RAG1 domain", which also contains the nonamer-binding region (NBR, Figure 1) and three active residues. The recombination process requires that the RAG1 and RAG2 proteins act together as a heterodimer to recognize the RSS (reviewed in [20]) and introduce a break between the RSS and the adjacent V(D)J coding segments [21]. Almost the entire length of RAG1 is involved in encoding different protein functions (e.g., sites involved in binding RAG2 and sites involved in binding the RSS; see Figure 1 and [17,22] for additional information on RAG1 structure).
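The RSS layout described above (a seven-nucleotide element and a nine-nucleotide element separated by a spacer of 12 or 23 bp) can be written down as a small data structure. The sketch below is only an illustration: the names `RSS` and `split_rss` are hypothetical, the placeholder sequences are not real signal sequences, and it assumes the RSS is read with the seven-nucleotide element first.

```python
from typing import NamedTuple

HEPTAMER_LEN = 7        # seven-nucleotide element
NONAMER_LEN = 9         # nine-nucleotide element
SPACER_LENS = (12, 23)  # allowed spacer lengths in bp


class RSS(NamedTuple):
    heptamer: str
    spacer: str
    nonamer: str


def split_rss(seq: str) -> RSS:
    """Split a candidate RSS into heptamer, spacer and nonamer.

    Assumes the heptamer comes first. Raises ValueError when the
    implied spacer is not 12 or 23 bp long.
    """
    spacer_len = len(seq) - HEPTAMER_LEN - NONAMER_LEN
    if spacer_len not in SPACER_LENS:
        raise ValueError(f"spacer of {spacer_len} bp is not 12 or 23 bp")
    spacer_end = HEPTAMER_LEN + spacer_len
    return RSS(seq[:HEPTAMER_LEN], seq[HEPTAMER_LEN:spacer_end], seq[spacer_end:])


# Placeholder sequence: 7 + 12 + 9 = 28 nucleotides.
print(split_rss("A" * 7 + "C" * 12 + "G" * 9))
```

Only the lengths are checked here; testing how well the two elements are conserved would require reference sequences that this excerpt does not give.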

