The evolution and genomic landscape of CGB1 and CGB2 genes.
Affiliation: Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Riia 23, 51010 Tartu, Estonia.
The origin of completely novel proteins is a significant question in evolution. The luteinizing hormone (LHB)/chorionic gonadotropin (CGB) gene cluster in humans contains a candidate example of this process. Two genes in this cluster (CGB1 and CGB2) exhibit nucleotide sequence similarity with the other LHB/CGB genes, but as a result of frameshifting are predicted to encode a completely novel protein. Our analysis of these genes from humans and related primates indicates a recent origin in the lineage specific to humans and African great apes. While the function of these genes is not yet known, they are strongly conserved between human and chimpanzee and exhibit three-fold lower diversity than LHB across human populations with no mutations that would disrupt the coding sequence. The 5'-upstream region of CGB1/2 contains most of the promoter sequence of hCGbeta plus a novel region proximal to the putative transcription start site. In silico prediction of putative transcription factor binding sites supports the hypothesis that CGB1 and CGB2 gene products are expressed in, and may contribute to, implantation and placental development.
- Chorionic Gonadotropin, beta Subunit, Human/genetics*
- Evolution, Molecular*
- Base Sequence
- Binding Sites
- Codon, Nonsense/genetics
- Conserved Sequence
- Embryo Implantation/genetics
- Luteinizing Hormone/genetics
- Molecular Sequence Data
- Polymorphism, Single Nucleotide/genetics
- Promoter Regions, Genetic/genetics
- Transcription Factors/metabolism
Figure 1: Genomic context of CGB1 and CGB2. (A) Schematic presentation of the structure of the LHB/CGB gene cluster (covering 39.76 kb from LHB to CGB7), drawn approximately to scale. Individual LHB/CGB genes (white boxes) cover 1.11–1.466 kb. Arrows indicate the direction of transcription, from either the sense or the antisense strand. The experimentally identified hCGβ promoter sequence (Otani et al., 1988; white ovals) is also present, although more distally, upstream of the LHB, CGB1 and CGB2 genes. A detailed alignment of the promoter area is shown in C. The CGB1- and CGB2-specific insert is divided into a transcribed segment coding for the 5′-UTR, exon 1 and part of intron 1 of CGB1/CGB2 (black boxes; 255 bp) and an immediate 5′-upstream segment, which could serve as an additional promoter component (black ovals; CGB1 481 bp, CGB2 469 bp). An alignment of the non-coding 5′-upstream part of the insert is shown in D. Intergenic neurotrophin 6 pseudogenes (psNTF6; striped boxes; <1.15 kb) originated through duplication of neurotrophin 5 (NTF5) exon 3 (Hallast et al., 2005). (B) The structure of CGB1 and CGB2 differs from a consensus hCGβ gene in the following aspects: (1) the hCGβ 5′-UTR has been replaced by a CGB1/2-specific insert that codes for the CGB1/2 5′-UTR, exon 1 (diagonally striped box) and part of intron 1 (black box), and also provides a 481/469 bp upstream fragment that could function as an additional promoter segment (black oval); (2) hCGβ exon 1 (horizontally striped box) is part of CGB1/2 intron 1; (3) the open reading frame (ORF) of exons 2 and 3 of CGB1/2 (grey boxes) is shifted by −1 bp compared to the hCGβ-coding genes; (4) the shifted ORF has led to an earlier stop codon and a shorter exon 3. The alternative exon 1 and the shifted ORF for exons 2 and 3 code for a putative CGB1/2 protein with no amino acid similarity to the hCGβ subunit. (C) Alignment of the proximal promoter of the hCGβ-subunit coding genes (CGB, CGB5, CGB8, CGB7) with the homologous upstream segment of CGB1 and CGB2.
The cAMP response element has been mapped from −311 to −202 (Albanese et al., 1991; black brackets) and the trophoblast-specific element (TSE) from −305 to −279 (Steger et al., 1993; dotted brackets). Other experimentally proven regulatory elements of the hCGβ promoter include activating protein 2 (AP2) and selective promoter factor 1 (Sp1) (Johnson and Jameson, 1999) as well as Ets-2 binding sites (Ghosh et al., 2003). *The CCAAT box was identified by the MatInspector and Alibaba TFBS prediction software. (D) Prediction of transcription factor binding sites (TFBSs) in the 5′-upstream segment unique to CGB1 and CGB2, created by the insertion (B). TFBSs predicted by both the MatInspector and Alibaba methods are marked with solid arrows above the aligned sequences of CGB1 and CGB2; TFBSs recognized by MatInspector alone are marked with broken arrows. TFBSs predicted solely from the CGB1 sequence are indicated with (*) and those predicted solely from the CGB2 sequence with (**). ATF: activating transcription factor; AP2: activating protein 2; Cdx2: Caudal-related transcription factor; CREB: cAMP responsive element binding protein; ERE: Estrogen response element; HIF: Hypoxia-inducible factor 1; NFkappaB: nuclear factor κB; GATA2: GATA-binding protein 2; SF1: steroidogenic factor 1; Sp1: selective promoter factor 1. The transcription start site is indicated based on NCBI GenBank locus no. NG_000019.
The human luteinizing hormone/chorionic gonadotropin beta (LHB/CGB) gene cluster on chromosome 19q13.3 consists of one LHB gene and six CGB genes (Fiddes and Goodman, 1980; Talmadge et al., 1984a; Maston and Ruvolo, 2002; Fig. 1A). These seven genes are highly conserved at the nucleotide level (85–99% DNA sequence identity) and appear to have originated from an ancestral LHB gene as a result of duplication during primate evolution. Four of the genes (CGB, CGB5, CGB7 and CGB8) encode the beta subunit of human chorionic gonadotropin, a 163-amino-acid protein that is produced by the implanting conceptus and is essential for alterations to the maternal reproductive system in support of pregnancy. The other two CGB genes, CGB1 and CGB2, encode a hypothetical protein of 132 amino acids that is completely different from the hCGβ subunit and lacks similarity to any known protein (Bo and Boime, 1992). These genes appear to have evolved by insertion of a DNA fragment (736 bp for CGB1, 724 bp for CGB2) that replaces 52 bp of the proximal end of the promoter and the entire 5′-UTR of an ancestral hCGβ-subunit coding gene (Bo and Boime, 1992; Hollenberg et al., 1994; Fig. 1B). This insertion creates a CGB1/CGB2-specific putative promoter fragment, an alternative 5′-untranslated region (5′-UTR) and a novel exon 1, leading to a one-base-pair frameshift in the open reading frame (ORF) for exons 2 and 3.
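To illustrate why a one-base-pair frameshift can yield a protein with no resemblance to the original, the following minimal Python sketch translates the same DNA in two reading frames. The sequence `TOY_DNA` is an invented example, not CGB1/CGB2 or hCGβ sequence, and the names `translate`, `TOY_DNA` and `INSERT_SEGMENTS` are assumptions made for this illustration only. The sketch also checks that the segment lengths given in the Fig. 1 legend (a 255 bp transcribed segment plus a 481 or 469 bp upstream segment) add up to the insert sizes quoted above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate dna starting at offset `frame`; stop at the first stop codon."""
    peptide = []
    for i in range(frame, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the open reading frame
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical toy sequence (NOT real CGB sequence): one leading base
# followed by a short open reading frame that ends in TAA.
TOY_DNA = "AGCCAAGCTTGACGGTCGTTAA"

reference = translate(TOY_DNA, frame=1)  # frame of the toy "original" ORF
shifted = translate(TOY_DNA, frame=0)    # same DNA read one base earlier (-1)

print("reference frame:", reference)
print("shifted frame:  ", shifted)

# Segment lengths in bp, taken from the figure legend and the text above:
# (transcribed segment, 5'-upstream segment, total insert).
INSERT_SEGMENTS = {"CGB1": (255, 481, 736), "CGB2": (255, 469, 724)}
for gene, (transcribed, upstream, total) in INSERT_SEGMENTS.items():
    print(gene, transcribed + upstream == total)
```

In this toy case the shifted frame shares no residues with the reference frame and reaches a stop codon sooner, which mirrors in miniature the earlier stop codon and shorter exon 3 described for CGB1/2 in Fig. 1B.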