RNA Journal Club 5/14/09
Sarah E. Calvo, David J. Pagliarini and Vamsi K. Mootha
PNAS 106 (18): 7507-7512, May 2009.
This week’s incisive summary and analysis by Robin Friedman:
Upstream ORFs (uORFs) generally consist of an AUG codon with an in-frame stop codon preceding the end of the canonical coding sequence (CDS). The uORFs therefore can either be entirely upstream of the CDS or overlapping the start of the CDS. uORFs have been shown to decrease CDS expression in many anecdotal cases, although translation of the CDS can still occur by leaky scanning or re-initiation. Early analysis suggested that <10% of vertebrate mRNAs had upstream AUGs, but more recent computational predictions suggested that >40% of vertebrate genes have uORFs. This study is the first to experimentally address the extent of uORF impact on a genome-wide scale.
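The definition above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function name `find_uorfs`, the 0-based coordinates, and the choice to require the uAUG to lie entirely within the 5′ UTR are all assumptions made here. An upstream AUG that is in frame with the CDS and has no earlier stop is skipped, since its first stop is the CDS stop itself rather than one preceding the end of the CDS.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript, cds_start, cds_end):
    """Return (start, end, kind) for each uORF in a transcript.

    Coordinates are 0-based with an exclusive end. cds_start is the index
    of the A in the main AUG; cds_end is one past the main stop codon.
    kind is "upstream" (stop at or before the CDS start) or "overlapping"
    (stop inside the CDS). Hypothetical helper for illustration only.
    """
    seq = transcript.upper().replace("U", "T")
    uorfs = []
    # Assumption: the uAUG must sit entirely inside the 5' UTR.
    for start in range(0, cds_start - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the uAUG's frame up to the end of the CDS.
        for pos in range(start + 3, cds_end - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                # The stop must precede the end of the main CDS.
                if end < cds_end:
                    kind = "upstream" if end <= cds_start else "overlapping"
                    uorfs.append((start, end, kind))
                break  # only the first in-frame stop terminates this ORF
    return uorfs


# Toy transcript: 16-nt 5' UTR followed by an 18-nt CDS (made-up sequence).
utr = "CCATGAAATAGCATGC"
cds = "ATGGCTGAGCCAAAATAA"
print(find_uorfs(utr + cds, len(utr), len(utr) + len(cds)))
# prints [(2, 11, 'upstream'), (12, 24, 'overlapping')]
```

The toy transcript contains one uORF of each kind: the first ends inside the 5′ UTR, while the second starts out of frame with the CDS and terminates within it.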
The authors constructed a 5′ UTR dataset from refgene annotations, finding that 49% of human and 44% of mouse transcripts have at least one uORF. They next examined high-throughput MS/MS datasets for steady-state protein quantification at a genome-wide level. In each of four datasets, genes that have uORFs have lower protein expression than genes with no uORFs, even after normalizing to mRNA expression. uAUG context, the distance from cap to uORF, uORF conservation, and the number of uORFs all affected this difference in protein expression, whereas uORF length and distance from uORF to CDS did not.
While the previous experiments show that uORF-containing genes have lower steady-state protein levels, they do not show a direct effect of uORFs on translation. To test this directly, the authors created reporters in which the 5′ UTRs of uORF-containing genes were fused to luciferase. Compared to a single-nucleotide mutant that removes the uAUG, luciferase activity was reduced ~50% for five randomly selected mouse genes, while the mRNA level, assayed by qPCR, was mostly unchanged. For 10 mouse genes with MS/MS and conservation support for functional uORFs, the luciferase reporters showed 50-80% repression at the protein level.
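The repression figures quoted here are ratios of the wild-type reporter to its uAUG mutant. A minimal sketch of that arithmetic follows, assuming each construct yields one luciferase reading and one qPCR mRNA estimate; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
def percent_repression(signal_with_uorf, signal_without_uorf):
    """Percent drop in reporter signal relative to the uAUG-mutant control."""
    if signal_without_uorf <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - signal_with_uorf / signal_without_uorf)


def translational_repression(luc_wt, luc_mut, mrna_wt, mrna_mut):
    """Repression after dividing each luciferase signal by its mRNA level.

    Normalizing to mRNA separates a translational effect from a change in
    transcript abundance.
    """
    if mrna_wt <= 0 or mrna_mut <= 0:
        raise ValueError("mRNA levels must be positive")
    return percent_repression(luc_wt / mrna_wt, luc_mut / mrna_mut)


# Made-up readings: the wild-type reporter gives half the mutant's signal.
print(percent_repression(50.0, 100.0))  # prints 50.0
```

If the mRNA level is essentially unchanged between the two constructs, as reported for most of the tested genes, the normalized and raw values will be nearly identical.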
Asking whether uORFs could be involved in human polymorphism and disease, the authors queried dbSNP and the Human Gene Mutation Database for variants that create or destroy uORFs. They found 509 genes with polymorphic uORFs and 14 with recorded disease-linked mutations. Five of the polymorphisms were tested in reporter assays, where the uORF repressed protein levels by 30-60%, while five of the disease-causing uORF mutations repressed by 70-100%.
This paper convincingly argues that uORFs are common in humans and have a widespread impact on protein expression. Much of this impact is likely conserved and functional. In addition, the authors provide interesting experimental support for the idea that uORFs typically repress at the translational level rather than through NMD, and that CDS translation downstream of uORFs likely proceeds by leaky scanning rather than re-initiation. While the mechanism has not been elucidated on a genome-wide scale, this paper provides a refreshing look at an often-ignored but important contribution to translational control.