The ancient mammalian KRAB zinc finger gene cluster on human chromosome 8q24.3 illustrates principles of C2H2 zinc finger evolution associated with unique expression profiles in human tissues.

Lorenz P, Dietmann S, Wilhelm T, Koczan D, Autran S, Gad S, Wen G, Ding G, Li Y, Rousseau-Merck MF, Thiesen HJ - BMC Genomics (2010)

Bottom Line: Expansion of multi-C2H2 domain zinc finger (ZNF) genes, including the Krüppel-associated box (KRAB) subfamily, paralleled the evolution of tetrapods, particularly in mammalian lineages. Six (ZNF7, ZNF34, ZNF250, ZNF251, ZNF252, ZNF517) of the seven locus members contain exons encoding KRAB domains; one (ZNF16) does not. These results are consistent with potential functions of the ZNF genes in morphogenesis and differentiation.


Affiliation: Institute of Immunology, University of Rostock, Schillingallee 70, 18055 Rostock, Germany.

ABSTRACT

Background: Expansion of multi-C2H2 domain zinc finger (ZNF) genes, including the Krüppel-associated box (KRAB) subfamily, paralleled the evolution of tetrapods, particularly in mammalian lineages. Advances in their cataloging and characterization suggest that the functions of the KRAB-ZNF gene family contributed to mammalian speciation.

Results: Here, we characterized the human 8q24.3 ZNF cluster at the genomic, phylogenetic, structural and transcriptome levels. Six (ZNF7, ZNF34, ZNF250, ZNF251, ZNF252, ZNF517) of the seven locus members contain exons encoding KRAB domains; one (ZNF16) does not. They form a paralog group in which the encoded KRAB and ZNF protein domains generally share more similarities with each other than with other members of the human ZNF superfamily. The closest relatives with respect to their DNA-binding domain were ZNF7 and ZNF251. The analysis of orthologs in therian mammalian species revealed strong conservation and purifying selection of the KRAB-A and zinc finger domains. These findings underscore structural/functional constraints during evolution. Gene losses in the murine lineage (ZNF16, ZNF34, ZNF252, ZNF517) and potential protein truncations in primates (ZNF252) illustrate ongoing speciation processes. Tissue expression profiling by quantitative real-time PCR showed similar but distinct patterns for all tested ZNF genes, with the most prominent expression in fetal brain. Based on accompanying expression signatures in twenty-six other human tissues, ZNF34 and ZNF250 had the closest expression profiles. Together, the 8q24.3 ZNF genes can be assigned to a cerebellum, a testis or a prostate/thyroid subgroup. These results are consistent with potential functions of the ZNF genes in morphogenesis and differentiation. Promoter regions of the seven 8q24.3 ZNF genes display common characteristics such as a missing TATA box, CpG island association and shared transcription factor binding site (TFBS) modules. Common TFBS modules partly explain the observed expression pattern similarities.

Conclusions: The ZNF genes at human 8q24.3 form a relatively old mammalian paralog group conserved in eutherian mammals for at least 130 million years. The members persisted after initial duplications by undergoing subfunctionalizations in their expression patterns and target site recognition. KRAB-ZNF mediated repression of transcription might have shaped organogenesis in mammalian ontogeny.



Figure 6: Evaluation of the conservation of the C2H2 ZNF DNA binding domains. (A) Principal component analysis of ZNF domains for conservation based on the 8-amino acid region from -2 to 6 with respect to the start of the α-helix of each finger (see text). Included are all individual fingers from the 8q24.3 ZNF proteins and their mouse and rat orthologs. The plot shows the first principal component (PC1) against the second (PC2), representing the variation in positions 5 and 6 and in positions -2 and 1, respectively. Negative values indicate lower variability and thus higher conservation. Plot areas that contain the same or similar 8-residue regions are boxed. Conserved amino acids are highlighted in single-letter code at the respective positions. Dashes indicate non-conserved residues. Additional file 7 details all peptides and their values/coordinates based on the boxed areas. (B) Pairwise ZNF matrix similarities between the 8q24.3 ZNF locus members (see text). Numbers in red indicate the lowest values, i.e. the highest similarities, for each ZNF gene. (C) Detail of a paralog network based on pairwise ZNF sequence similarities between all human C2H2 ZNF genes (see text). Nodes represent the individual genes (labeled by name); edges describe their similarity. The thickness of each edge is proportional to the similarity, and the value is given as the edge label. A decrease in the value means an increase in similarity. Shown are the isolated 8q24.3 and 19q13.2 clusters of the network, with 8q24.3 ZNF member nodes in red and nodes of 19q13.2 members included in Figure 2 in blue. The network is restricted to similarity values ≤ 2.01.

Mentions: Specific amino acid residues in the C2H2 zinc finger domain play a key role in determining its nucleic acid binding specificity. Based on the EGR1/Zif268 protein-DNA crystal structure, helical positions -1, 3 and 6 with respect to the start of the α-helix are especially important for DNA-binding specificity [17]. Comparing these residues along each group of orthologs (from opossum to human, as available) made it immediately evident that they are highly conserved (see graphical depiction in Additional file 6). Analysis of the mutational trends in this region of the α-helix between the 8q24.3 ZNF paralogues should help to clarify likely duplication scenarios and the functional variability of the nucleic acid binding residues. To that end, principal component analysis of the conservation profile of the individual zinc finger sequences was performed. The analysis is based on the multiple alignment of the 8-residue stretches from positions -2 to 6 with respect to the α-helix of each C2H2 zinc finger. Principal component analysis identifies new axes in a multiple alignment matrix by weighting positions with high co-variation and de-emphasizing positions that show little co-variation with other positions. Positions in the stretch of binding residues that are most tightly connected with one another thus reflect correlated mutations that are under evolutionary selection and are more likely to be important for nucleic acid binding. The first principal component, plotted on the x-axis of Figure 6A, is dominated by position 6 (weight -0.82) and the correlated position 5 (weight -0.44). The second principal component, plotted on the y-axis, is based on positions -2 (weight -0.92) and 1 (weight -0.27). On both axes, zinc finger domains sharing the prevalent amino acids at the respective positions lie in the negative region and those with mutations at these positions lie in the positive region.
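The principal component analysis described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the text does not give the numeric encoding of the conservation profile, so the sketch assumes a 0/1 mismatch-to-consensus encoding, and the six 8-residue peptides are hypothetical toy inputs rather than sequences from Additional file 7. The function names are likewise illustrative.

```python
from collections import Counter

import numpy as np

# Helix positions covered by each 8-residue stretch (there is no position 0).
POSITIONS = [-2, -1, 1, 2, 3, 4, 5, 6]

# Hypothetical toy peptides, one per zinc finger; NOT taken from the paper's data.
TOY_PEPTIDES = [
    "SQSGHLIQ",
    "SRSSNLIQ",
    "SQSDALIY",
    "SDKSTLKH",
    "RLSSHLIQ",
    "SQSSHLIQ",
]


def mismatch_matrix(peptides):
    """Encode each finger as 0/1 per position: 1 where the residue differs
    from the most common residue in that alignment column (assumed encoding)."""
    columns = list(zip(*peptides))
    consensus = [Counter(col).most_common(1)[0][0] for col in columns]
    return np.array(
        [[0.0 if aa == ref else 1.0 for aa, ref in zip(pep, consensus)]
         for pep in peptides]
    )


def pca(matrix, n_components=2):
    """Plain PCA via SVD of the column-centered matrix.
    Returns (loadings, scores): loadings are per-position weights,
    scores are the coordinates of each finger on the new axes."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components]
    scores = centered @ loadings.T
    return loadings, scores


X = mismatch_matrix(TOY_PEPTIDES)
loadings, scores = pca(X)

# Report the per-position weights of the first two components.
for i, component in enumerate(loadings, start=1):
    weights = ", ".join(f"{pos}: {w:+.2f}" for pos, w in zip(POSITIONS, component))
    print(f"PC{i} weights -> {weights}")
```

With this encoding, fingers that match the prevalent residue at the heavily weighted positions cluster together on one side of each axis; the sign of an axis is arbitrary in an SVD-based PCA, so the orientation need not match that of Figure 6A.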
The sequences assigned to the different regions with the plotted matrix values are given in Additional file 7. Region I contains all zinc finger domains with a significantly overrepresented motif S[Q,R]S---IQ (a dash stands for any amino acid) with frequencies S (49%), [Q (35%), R (19%)], S (45%), I (44%), and Q (49%). S-----IQ is found in nearly all 8q24.3 genes (only ZNF7 has the mutational variant R-----IQ). The subgroups in region I represent the motif SQ----IQ, present in individual zinc fingers of human/mouse/rat ZNF250 and human ZNF16, ZNF34 and ZNF252, and its relative, the motif SR----IQ, present in human/mouse/rat ZNF251 as well as human ZNF517 zinc fingers. Region I also includes the motif S-S---IQ found in C2H2 domains of human/mouse/rat ZNF251 and human ZNF34 and ZNF16. In zinc finger domains of region II the serine is still at position -2; however, mutations in positions 5 and 6 have occurred (e.g. zinc fingers ZNF7-12 SQ----IY, ZNF7-2 SD----KH), while in region III the serine at -2 is lost but the IQ at positions 5 and 6 is conserved (e.g. ZNF7-4 RL----IQ). As expected, residues in the binding region of the 8q24.3 zinc finger domains are frequently modified to alter or refine their binding specificity; yet specific amino acids at positions -2, 1, 5 and 6 are significantly overrepresented in the 8q24.3 sub-family of zinc fingers, pointing to a common conserved framework of DNA binding.
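The verbal description of regions I to III can be restated as a simple rule on the 8-residue stretch. The sketch below is a simplification that assumes the regions are defined only by the serine at position -2 and the IQ pair at positions 5 and 6; in the paper the regions are boxed areas of the PCA plot, so this rule is an approximation, and the function name `classify_region` is hypothetical. The example stretches are the ones quoted in the text, with dashes standing for any amino acid.

```python
def classify_region(stretch):
    """Assign an 8-residue stretch (positions -2, -1, 1, 2, 3, 4, 5, 6)
    to region I, II or III using the rule paraphrased from the text."""
    if len(stretch) != 8:
        raise ValueError("expected an 8-residue stretch covering positions -2 to 6")
    has_serine = stretch[0] == "S"   # serine at position -2
    has_iq = stretch[6:8] == "IQ"    # I at position 5, Q at position 6
    if has_serine and has_iq:
        return "I"
    if has_serine:
        return "II"                  # serine kept, positions 5/6 mutated
    if has_iq:
        return "III"                 # serine lost, IQ kept
    return "unassigned"              # not covered by the description in the text


# Stretches quoted in the text (finger labels as given there).
EXAMPLES = {
    "SQ----IQ": classify_region("SQ----IQ"),
    "SR----IQ": classify_region("SR----IQ"),
    "ZNF7-12 SQ----IY": classify_region("SQ----IY"),
    "ZNF7-2 SD----KH": classify_region("SD----KH"),
    "ZNF7-4 RL----IQ": classify_region("RL----IQ"),
}

for label, region in EXAMPLES.items():
    print(f"{label}: region {region}")
```

Keeping the rule to two boolean checks makes it easy to see that regions II and III differ from region I by loss of exactly one of the two conserved features.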

