Thousands of novel translated open reading frames in humans inferred by ribosome footprint profiling

ABSTRACT

Accurate annotation of protein-coding regions is essential for understanding how genetic information is translated into function. We describe riboHMM, a new method that uses ribosome footprint data to accurately infer translated sequences. Applying riboHMM to human lymphoblastoid cell lines, we identified 7273 novel coding sequences, including 2442 translated upstream open reading frames. We observed an enrichment of footprints at inferred initiation sites after drug-induced arrest of translation initiation, validating many of the novel coding sequences. The novel proteins exhibit significant selective constraint in the inferred reading frames, suggesting that many are functional. Moreover, ~40% of bicistronic transcripts showed a negative correlation in the translation levels of their two coding sequences, suggesting a potential regulatory role for these novel regions. Despite the known limitations of mass spectrometry in detecting proteins expressed at low levels, we estimated a 14% validation rate. Our work significantly expands the set of known coding regions in humans.

DOI: http://dx.doi.org/10.7554/eLife.13328.001


Author response image 2. Comparing the proportion of harringtonine-treated ribosome footprints at the inferred initiation sites of novel coding sequences and the annotated initiation sites of annotated coding sequences. Both plots are an aggregate over the same set of annotated coding genes for which the inferred coding sequence does not match the annotated coding sequence. DOI: http://dx.doi.org/10.7554/eLife.13328.026

Mentions: As additional evidence supporting our inferences for these instances, Author response image 2 shows that the aggregate harringtonine footprint signal at the annotated initiation sites is substantially lower than the signal at the initiation sites inferred by our model, for the same set of annotated coding transcripts.

