Frame disruptions in human mRNA transcripts, and their relationship with splicing and protein structures.

Harrison P, Yu Z - BMC Genomics (2007)

Bottom Line: We find significant avoidance of protein-domain disruption (indicating a selection pressure for this), and highly significant overrepresentation of disruptions in alternatively-spliced exons, and 'non-NMD' regions. We do not find any evidence for evolution of novelty in protein structures through frameshifting. Our results indicate largely negative selection pressures related to frame disruption during gene evolution.

Affiliation: Department of Biology, McGill University, Stewart Biology Building, 1205 Docteur Penfield Ave, Montreal, QC, H3A 1B1 Canada. paul.harrison@mcgill.ca

ABSTRACT

Background: Efforts to gather genomic evidence for the processes of gene evolution are ongoing, and are closely coupled to improved gene annotation methods. Such annotation is complicated by the occurrence of disrupted mRNAs (dmRNAs), harbouring frameshifts and premature stop codons, which can be considered indicators of decay into pseudogenes.

Results: We have derived a procedure to annotate dmRNAs, and have applied it to human data. Subsequences are generated from parsing at key frame-disruption positions and are required to align significantly within any original protein homology. We find 419 high-quality human dmRNAs (3% of total). Significant dmRNA subpopulations include: zinc-finger-containing transcription factors with long disrupted exons, and antisense homologies to distal genes. We analysed the distribution of initial frame disruptions in dmRNAs with respect to positions of: (i) protein domains, (ii) alternatively-spliced exons, and (iii) regions susceptible to nonsense-mediated decay (NMD). We find significant avoidance of protein-domain disruption (indicating a selection pressure for this), and highly significant overrepresentation of disruptions in alternatively-spliced exons, and 'non-NMD' regions. We do not find any evidence for evolution of novelty in protein structures through frameshifting.

Conclusion: Our results indicate largely negative selection pressures related to frame disruption during gene evolution.

Figure 2: Numbers of paralogs. The distribution of the number of paralogs for all genes, and for genes yielding dmRNAs. The bin labeled x contains all values N such that x − 5 < N ≤ x.
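
The binning rule in the Figure 2 caption can be illustrated with a short sketch. This is only an illustration of the stated rule (bin x holds every N with x − 5 < N ≤ x); the function names and the example paralog counts are hypothetical and do not come from the paper.

```python
from collections import Counter


def bin_label(n: int, width: int = 5) -> int:
    """Return the label x of the bin containing n, where x - width < n <= x.

    The label is n rounded up to the nearest multiple of `width`.
    """
    if n < 0:
        raise ValueError("paralog counts must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    return -(-n // width) * width


def paralog_histogram(counts: list[int], width: int = 5) -> dict[int, int]:
    """Count how many genes fall into each bin, keyed by bin label."""
    return dict(sorted(Counter(bin_label(n, width) for n in counts).items()))


# Hypothetical paralog counts for a handful of genes (not real data).
example_counts = [0, 1, 5, 6, 10, 11, 36]
histogram = paralog_histogram(example_counts)
```

Under this rule a gene with exactly 5 paralogs belongs to the bin labeled 5, while a gene with 6 paralogs belongs to the bin labeled 10.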

In general, the dmRNAs show functional prevalences that are typical of the population of human transcripts, as judged by counts of Gene Ontology functional category annotations (Additional File 1). The duplication behaviour of the genes from which the dmRNAs arise is also typical of the whole human gene complement (Figure 2; median of 5 paralogs per gene for the dmRNAs versus 6 for the whole set; mean = 36 [± 62] versus 32 [± 81]). However, dmRNAs have significantly fewer exons than mRNAs in general (mean = 7.9 [± 8.6] exons, compared to 10.0 [± 11.5]; P < 0.05 using normal statistics for the distribution of the sample mean). Such shorter lengths are expected from the truncating effect of frameshifts and stop codons. A large fraction (44%) of the dmRNAs have multiple frame disruptions, and the frequencies of the numbers of frame disruptions follow a power-law relationship, as observed for processed pseudogenes [7,8] (Figure 3). The vast majority of frameshifts in dmRNAs (326/346, 94%) result in truncation from premature stop codons.
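
The exon-count comparison can be sketched as a one-sample z-test on the sample mean. This is a rough reconstruction, not the authors' exact calculation: it assumes the sample size is the 419 dmRNAs reported in the abstract, treats the values in brackets as standard deviations, and uses the all-mRNA figures (mean 10.0, SD 11.5) as the population parameters.

```python
import math


def z_test_sample_mean(sample_mean: float, pop_mean: float,
                       pop_sd: float, n: int) -> tuple[float, float]:
    """One-sample z-test using the normal distribution of the sample mean.

    Returns the z statistic and the two-sided p-value.
    """
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided tail probability of a standard normal variable.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Assumed inputs: n = 419 is taken from the abstract and may not be the
# exact sample size used for the exon comparison in the paper.
z, p = z_test_sample_mean(sample_mean=7.9, pop_mean=10.0, pop_sd=11.5, n=419)
print(f"z = {z:.2f}, two-sided p = {p:.2g}")
```

Under these assumptions z is about −3.7, which is consistent with the reported P < 0.05.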
