Frame disruptions in human mRNA transcripts, and their relationship with splicing and protein structures.

Harrison P, Yu Z - BMC Genomics (2007)

Bottom Line: We find significant avoidance of protein-domain disruption (indicating a selection pressure for this), and highly significant overrepresentation of disruptions in alternatively-spliced exons, and 'non-NMD' regions. We do not find any evidence for evolution of novelty in protein structures through frameshifting. Our results indicate largely negative selection pressures related to frame disruption during gene evolution.


Affiliation: Department of Biology, McGill University, Stewart Biology Building, 1205 Docteur Penfield Ave., Montreal, QC, H3A 1B1 Canada. paul.harrison@mcgill.ca

ABSTRACT

Background: Efforts to gather genomic evidence for the processes of gene evolution are ongoing, and are closely coupled to improved gene annotation methods. Such annotation is complicated by the occurrence of disrupted mRNAs (dmRNAs), harbouring frameshifts and premature stop codons, which can be considered indicators of decay into pseudogenes.

Results: We have derived a procedure to annotate dmRNAs, and have applied it to human data. Subsequences are generated by parsing at key frame-disruption positions and are required to align significantly within any original protein homology. We find 419 high-quality human dmRNAs (3% of total). Significant dmRNA subpopulations include zinc-finger-containing transcription factors with long disrupted exons, and antisense homologies to distal genes. We analysed the distribution of initial frame disruptions in dmRNAs with respect to the positions of: (i) protein domains, (ii) alternatively-spliced exons, and (iii) regions susceptible to nonsense-mediated decay (NMD). We find significant avoidance of protein-domain disruption (indicating selection pressure for such avoidance), and highly significant overrepresentation of disruptions in alternatively-spliced exons and in 'non-NMD' regions. We do not find any evidence for evolution of novelty in protein structures through frameshifting.

Conclusion: Our results indicate largely negative selection pressures related to frame disruption during gene evolution.
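
As a rough illustration of the three positional categories analysed in the Results, the sketch below classifies a single frame-disruption position on a transcript. It is not the authors' pipeline. The function names, the interval representation, and the 50-nucleotide boundary used to call a position NMD-susceptible (a commonly cited heuristic, applied here directly to the disruption position) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Half-open interval [start, end) in transcript coordinates (assumed convention).
Interval = Tuple[int, int]


def in_any(pos: int, intervals: List[Interval]) -> bool:
    """Return True if pos falls inside at least one interval."""
    return any(start <= pos < end for start, end in intervals)


def classify_disruption(
    pos: int,
    domains: List[Interval],
    alt_exons: List[Interval],
    exon_junctions: List[int],
    nmd_margin: int = 50,  # assumed heuristic, not taken from the paper
) -> Dict[str, bool]:
    """Classify one disruption position against the three feature types."""
    last_junction = max(exon_junctions) if exon_junctions else None
    # Heuristic: a disruption more than nmd_margin nt upstream of the last
    # exon-exon junction is treated as lying in an NMD-susceptible region.
    nmd_susceptible = (
        last_junction is not None and pos < last_junction - nmd_margin
    )
    return {
        "in_domain": in_any(pos, domains),
        "in_alt_exon": in_any(pos, alt_exons),
        "nmd_susceptible": nmd_susceptible,
    }
```

Counting these labels over all dmRNAs, and comparing the counts with those expected from the fraction of transcript length each feature type covers, is one simple way to look for the kind of avoidance or overrepresentation described above.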

Figure 1: Three examples of dmRNAs. The translated dmRNA sequence is shown along with the corresponding nucleotide sequence; the aligning protein sequence is shown above these in each case. They are as follows: (a) a multiply-disrupted example (homologous to a cytochrome P450); (b) a multiply-disrupted example from a zinc-finger-containing transcription factor family; (c) a frameshifted alternative transcript of the gene C20orf59, which appears to encode a transmembrane sugar transporter.

Mentions: Using stringent thresholds, we verified 16,153 high-quality mRNAs from the NCBI RefSeq and UniGene consensus collections by mapping them onto human genomic DNA. A small subpopulation of these mRNAs (419, or 3% of the total) harbour significant frame disruptions, either frameshifts or premature stop codons (Table 1); this proportion is of a similar order to that found in previous analyses of such disruptions in sets of transcripts [16,2,9]. Most are disrupted by frameshifts (83% of cases) rather than by premature stop codons. Using a small modification to the basic annotation pipeline, we identified a small minority of the frameshifted transcripts (17, or 4% of the dmRNAs) that harbour compensating frameshifts, which move translation back into frame. A previous analysis of mouse cDNAs also indicated that a small fraction (~2%) may have such compensatory frameshifts [16]. Three examples of dmRNAs are illustrated in Figure 1: two multiply-disrupted examples (homologous to a cytochrome P450 and to a zinc-finger-containing transcription factor), and a frameshifted alternative mRNA transcript from the gene C20orf59, which appears to encode a transmembrane sugar transporter.
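
The idea of a compensating frameshift can be made concrete with a small sketch. This is an illustrative assumption about how such events might be tracked, not the modification the authors made to their pipeline. The function names and the representation of each disruption as a (position, length change) pair are hypothetical.

```python
from typing import List, Tuple

# (transcript position, length change): +n for an insertion, -n for a deletion.
Indel = Tuple[int, int]


def frame_offsets(indels: List[Indel]) -> List[int]:
    """Return the reading-frame offset (0, 1 or 2) after each indel,
    taking the indels in transcript order."""
    offsets = []
    offset = 0
    for _pos, delta in sorted(indels):
        # Python's % always returns a non-negative result for a positive
        # modulus, so deletions are handled correctly.
        offset = (offset + delta) % 3
        offsets.append(offset)
    return offsets


def has_compensating_frameshift(indels: List[Indel]) -> bool:
    """True if the frame is knocked out of phase and a later indel
    restores offset 0, i.e. translation moves back into frame."""
    out_of_frame = False
    for offset in frame_offsets(indels):
        if offset != 0:
            out_of_frame = True
        elif out_of_frame:
            return True
    return False
```

For example, a one-nucleotide insertion followed downstream by a one-nucleotide deletion gives offsets 1 and then 0, so the pair counts as compensating, whereas a single one-nucleotide insertion, or a three-nucleotide insertion that never leaves the frame, does not.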

