Limits...
Complexity of the MSG gene family of Pneumocystis carinii.

Keely SP, Stringer JR - BMC Genomics (2009)

Bottom Line: Accounting for error reduced the number of truly distinct sequences observed to 158, roughly twice the number expected if the gene family contains 80 members.A set of sequences that represents most if not all of the members of the P. carinii MSG gene family was obtained.The protein-changing nature of the variation among these sequences suggests that the family has been shaped by selection for protein variation, which is consistent with the hypothesis that the MSG gene family functions to enhance phenotypic variation among the members of a population of P. carinii.

View Article: PubMed Central - HTML - PubMed

Affiliation: Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, Ohio, 45220, USA. keelysp@ucmail.uc.edu

ABSTRACT

Background: The relationship between the parasitic fungus Pneumocystis carinii and its host, the laboratory rat, presumably involves features that allow the fungus to circumvent attacks by the immune system. It is hypothesized that the major surface glycoprotein (MSG) gene family endows Pneumocystis with the capacity to vary its surface. This gene family is comprised of approximately 80 genes, which each are approximately 3 kb long. Expression of the MSG gene family is regulated by a cis-dependent mechanism that involves a unique telomeric site in the genome called the expression site. Only the MSG gene adjacent to the expression site is represented by messenger RNA. Several P. carinii MSG genes have been sequenced, which showed that genes in the family can encode distinct isoforms of MSG. The vast majority of family members have not been characterized at the sequence level.

Results: The first 300 basepairs of MSG genes were subjected to analysis herein. Analysis of 581 MSG sequence reads from P. carinii genomic DNA yielded 281 different sequences. However, many of the sequence reads differed from others at only one site, a degree of variation consistent with that expected to be caused by error. Accounting for error reduced the number of truly distinct sequences observed to 158, roughly twice the number expected if the gene family contains 80 members. The size of the gene family was verified by PCR. The excess of distinct sequences appeared to be due to allelic variation. Discounting alleles, there were 73 different MSG genes observed. The 73 genes differed by 19% on average. Variable regions were rich in nucleotide differences that changed the encoded protein. The genes shared three regions in which at least 16 consecutive basepairs were invariant. There were numerous cases where two different genes were identical within a region that was variable among family members as a whole, suggesting recombination among family members.

Conclusion: A set of sequences that represents most if not all of the members of the P. carinii MSG gene family was obtained. The protein-changing nature of the variation among these sequences suggests that the family has been shaped by selection for protein variation, which is consistent with the hypothesis that the MSG gene family functions to enhance phenotypic variation among the members of a population of P. carinii.

Show MeSH
Maps of expressed and donor MSG genes. The expressed MSG gene is adjacent to the UCS sequence. Donor MSG genes are not adjacent to the UCS sequence, which is unique in the genome. For illustrative purposes, just three of approximately 80 donor genes are depicted. There is a copy of the CRJE, which is a 24 basepair conserved sequence, at the beginning of MSG genes, including the one attached to the UCS. The discontinuities in the MSG genes are there to indicate that the genes are not drawn to size relative to the UCS and CRJE. The open horizontal arrows show the locations and orientations of PCR primers. The two arrows above the expressed gene represent the -145 and α-CRJE primers (Additional file 3, Table S2). The two arrows above the first donor gene represent the CRJE and C2 primers (Additional file 3, Table S2).
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2743713&req=5

Figure 1: Maps of expressed and donor MSG genes. The expressed MSG gene is adjacent to the UCS sequence. Donor MSG genes are not adjacent to the UCS sequence, which is unique in the genome. For illustrative purposes, just three of approximately 80 donor genes are depicted. There is a copy of the CRJE, which is a 24 basepair conserved sequence, at the beginning of MSG genes, including the one attached to the UCS. The discontinuities in the MSG genes are there to indicate that the genes are not drawn to size relative to the UCS and CRJE. The open horizontal arrows show the locations and orientations of PCR primers. The two arrows above the expressed gene represent the -145 and α-CRJE primers (Additional file 3, Table S2). The two arrows above the first donor gene represent the CRJE and C2 primers (Additional file 3, Table S2).

Mentions: Expression of MSG gene families has been studied primarily in P. carinii, where several lines of evidence indicate that a single MSG gene is transcribed in a given P. carinii genome at a given time. Restricted transcription is accomplished via a cis-dependent mechanism that involves a unique telomeric site in the genome called the expression site. Only the MSG gene adjacent to the expression site is represented by messenger RNA [52,53,62,63]. The MSG protein on the surface of P. carinii organisms has been shown to vary and to be encoded by the MSG gene that is at the expression site [64,65]. The expression site contains the UCS (Upstream Conserved Sequence), a sequence found at the beginning of messenger RNAs encoding diverse MSG proteins [62,63] (Figure 1). Immediately adjacent to and downstream of the UCS, there is short sequence, called the CRJE, which is conserved among all MSG genes, by definition [52,62,66].


Complexity of the MSG gene family of Pneumocystis carinii.

Keely SP, Stringer JR - BMC Genomics (2009)

Maps of expressed and donor MSG genes. The expressed MSG gene is adjacent to the UCS sequence. Donor MSG genes are not adjacent to the UCS sequence, which is unique in the genome. For illustrative purposes, just three of approximately 80 donor genes are depicted. There is a copy of the CRJE, which is a 24 basepair conserved sequence, at the beginning of MSG genes, including the one attached to the UCS. The discontinuities in the MSG genes are there to indicate that the genes are not drawn to size relative to the UCS and CRJE. The open horizontal arrows show the locations and orientations of PCR primers. The two arrows above the expressed gene represent the -145 and α-CRJE primers (Additional file 3, Table S2). The two arrows above the first donor gene represent the CRJE and C2 primers (Additional file 3, Table S2).
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2743713&req=5

Figure 1: Maps of expressed and donor MSG genes. The expressed MSG gene is adjacent to the UCS sequence. Donor MSG genes are not adjacent to the UCS sequence, which is unique in the genome. For illustrative purposes, just three of approximately 80 donor genes are depicted. There is a copy of the CRJE, which is a 24 basepair conserved sequence, at the beginning of MSG genes, including the one attached to the UCS. The discontinuities in the MSG genes are there to indicate that the genes are not drawn to size relative to the UCS and CRJE. The open horizontal arrows show the locations and orientations of PCR primers. The two arrows above the expressed gene represent the -145 and α-CRJE primers (Additional file 3, Table S2). The two arrows above the first donor gene represent the CRJE and C2 primers (Additional file 3, Table S2).
Mentions: Expression of MSG gene families has been studied primarily in P. carinii, where several lines of evidence indicate that a single MSG gene is transcribed in a given P. carinii genome at a given time. Restricted transcription is accomplished via a cis-dependent mechanism that involves a unique telomeric site in the genome called the expression site. Only the MSG gene adjacent to the expression site is represented by messenger RNA [52,53,62,63]. The MSG protein on the surface of P. carinii organisms has been shown to vary and to be encoded by the MSG gene that is at the expression site [64,65]. The expression site contains the UCS (Upstream Conserved Sequence), a sequence found at the beginning of messenger RNAs encoding diverse MSG proteins [62,63] (Figure 1). Immediately adjacent to and downstream of the UCS, there is short sequence, called the CRJE, which is conserved among all MSG genes, by definition [52,62,66].

Bottom Line: Accounting for error reduced the number of truly distinct sequences observed to 158, roughly twice the number expected if the gene family contains 80 members.A set of sequences that represents most if not all of the members of the P. carinii MSG gene family was obtained.The protein-changing nature of the variation among these sequences suggests that the family has been shaped by selection for protein variation, which is consistent with the hypothesis that the MSG gene family functions to enhance phenotypic variation among the members of a population of P. carinii.

View Article: PubMed Central - HTML - PubMed

Affiliation: Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, Ohio, 45220, USA. keelysp@ucmail.uc.edu

ABSTRACT

Background: The relationship between the parasitic fungus Pneumocystis carinii and its host, the laboratory rat, presumably involves features that allow the fungus to circumvent attacks by the immune system. It is hypothesized that the major surface glycoprotein (MSG) gene family endows Pneumocystis with the capacity to vary its surface. This gene family is comprised of approximately 80 genes, which each are approximately 3 kb long. Expression of the MSG gene family is regulated by a cis-dependent mechanism that involves a unique telomeric site in the genome called the expression site. Only the MSG gene adjacent to the expression site is represented by messenger RNA. Several P. carinii MSG genes have been sequenced, which showed that genes in the family can encode distinct isoforms of MSG. The vast majority of family members have not been characterized at the sequence level.

Results: The first 300 basepairs of MSG genes were subjected to analysis herein. Analysis of 581 MSG sequence reads from P. carinii genomic DNA yielded 281 different sequences. However, many of the sequence reads differed from others at only one site, a degree of variation consistent with that expected to be caused by error. Accounting for error reduced the number of truly distinct sequences observed to 158, roughly twice the number expected if the gene family contains 80 members. The size of the gene family was verified by PCR. The excess of distinct sequences appeared to be due to allelic variation. Discounting alleles, there were 73 different MSG genes observed. The 73 genes differed by 19% on average. Variable regions were rich in nucleotide differences that changed the encoded protein. The genes shared three regions in which at least 16 consecutive basepairs were invariant. There were numerous cases where two different genes were identical within a region that was variable among family members as a whole, suggesting recombination among family members.

Conclusion: A set of sequences that represents most if not all of the members of the P. carinii MSG gene family was obtained. The protein-changing nature of the variation among these sequences suggests that the family has been shaped by selection for protein variation, which is consistent with the hypothesis that the MSG gene family functions to enhance phenotypic variation among the members of a population of P. carinii.

Show MeSH