Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences.

Ivanov IP, Firth AE, Michel AM, Atkins JF, Baranov PV - Nucleic Acids Res. (2011)

Bottom Line: We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data.


Affiliation: BioSciences Institute, University College Cork, Cork, Ireland. iivanov@genetics.utah.edu

ABSTRACT
In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5' cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized--both for increased coding capacity and potentially also for novel regulatory mechanisms--remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5' untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data.
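
As a small illustration of the phrase "codons differing from AUG by a single nucleotide", the sketch below enumerates every such codon and separates the six the abstract singles out from the rest. The function name `near_cognates` and the variable names are hypothetical, introduced only for this example; this is not the authors' search method, which relies on phylogenetic analysis of 5' untranslated regions.

```python
BASES = "ACGU"  # RNA alphabet

def near_cognates(start="AUG"):
    """Return all codons that differ from `start` at exactly one position."""
    variants = []
    for i, original in enumerate(start):
        for base in BASES:
            if base != original:
                # Substitute a single base at position i, keep the rest.
                variants.append(start[:i] + base + start[i + 1:])
    return sorted(variants)

all_variants = near_cognates()

# The six codons named in the abstract as the most relevant initiators.
highlighted = {"CUG", "UUG", "GUG", "ACG", "AUA", "AUU"}

# Single-nucleotide variants of AUG that the abstract does not list.
not_highlighted = [c for c in all_variants if c not in highlighted]

print(all_variants)
print(not_highlighted)
```

Three positions with three alternative bases each give nine single-nucleotide variants, so the six codons listed in the abstract are a subset of the possible near-cognate start codons.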

Figure 5: Boxplots of non-AUG CDS extension length distributions for previously known cases and those identified in this study.

Mentions: The 42 newly identified extensions total 3374 codons, with an average length of 80.3 codons and a median of 51 codons. In the 17 known cases that passed the qualitative test, the average extension is 87.7 codons and the median is 65 codons. Among the newly identified non-AUG-initiated extensions, the shortest is 17 codons and the longest is 300 codons. Among the 17 known and conserved non-AUG-initiated extensions, the shortest is 15 codons and the longest is 235 codons. See Figure 5 for the distribution of extension lengths. The annotated sequences of the mRNAs described in Table 1 are available in Supplementary Dataset 1.
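
The reported average for the newly identified extensions can be checked from the two totals given in the text. The sketch below uses only those published figures; the variable names are illustrative, and the individual extension lengths are not reproduced here.

```python
# Figures quoted in the text for the newly identified extensions.
total_codons = 3374
n_extensions = 42
reported_median = 51

# Mean extension length in codons.
mean_length = total_codons / n_extensions
print(round(mean_length, 1))

# A mean above the median is consistent with a few long extensions
# pulling the average upward.
print(mean_length > reported_median)
```

The mean rounds to the 80.3 codons stated above. Its distance from the median of 51 suggests a right-skewed length distribution, which would fit the reported range of 17 to 300 codons.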

