Date Published: June 29, 2017
Publisher: Public Library of Science
Author(s): Yizhou Liu, Joshua S. Sharp, Duc H-T. Do, Richard A. Kahn, Harald Schwalbe, Florian Buhr, James H. Prestegard, Hans-Joachim Wieden.
Mistakes in translation of messenger RNA into protein are clearly a detriment to the recombinant production of pure proteins for biophysical study or the biopharmaceutical market. However, they may also provide insight into mechanistic details of the translation process. Mistakes often involve the substitution of an amino acid having an abundant codon for one having a rare codon, differing by substitution of a G base by an A base, as in the case of substitution of a lysine (AAA) for arginine (AGA). In these cases one expects the substitution frequency to depend on the relative abundances of the respective tRNAs, and thus, one might expect frequencies to be similar for all sites having the same rare codon. Here we demonstrate that, for the ADP-ribosylation factor from yeast expressed in E. coli, lysine-for-arginine substitution frequencies are not the same at the 9 sites containing a rare arginine codon; mis-incorporation frequencies instead vary from less than 1% to 16%. We suggest that the context in which the codons occur (clustering of rare sites) may be responsible for the variation. The method employed to determine the frequency of mis-incorporation involves a novel mass spectrometric analysis of the products from the parallel expression of wild-type and codon-optimized genes in 15N- and 14N-enriched media, respectively. The high sensitivity and low material requirements of the method make this a promising technology for the collection of data relevant to other mis-incorporations. The additional data could be of value in refining models for the ribosomal translation elongation process.
It is well recognized that translation of mRNAs to the polypeptides of functioning proteins is a carefully controlled process with promoters binding to sites upstream of coding sequences, the recruitment of numerous effector proteins to the ribosomal surface and post-translational modification of some of these proteins [1, 2]. Less appreciated is the control that may be exerted by the use of different codons for a given amino acid and variations in the availability of tRNAs that bind to these codons. We encountered a consequence of this variation in the course of expressing a eukaryotic protein in a bacterial host for NMR structural studies: in the absence of adequate supplies of complementary tRNAs, mistakes in translation are made. In our case, a lysine is added at sites where an arginine belongs. This phenomenon has been observed previously, as it leads to extra crosspeaks in the 1H-15N 2D NMR spectra commonly used as a structural fingerprint of the protein studied, and sometimes to incorrect attribution of these peaks to alternate conformational forms. What is striking in our observation is that the frequency of mistakes at different positions in the coding sequence varies, even when the rare codons for the arginine to be added are the same. This suggests additional sequence-dependent control of translation and the possibility that examination of the frequency of mistakes could shed light on control mechanisms. We report here data on the frequency of mistakes and examine possible mechanistic explanations.
The first evidence of lysine substitution for arginine came from the mass spectrum of the intact protein expressed using the native (wild-type) sequence (see Fig 1). The electrospray ionization (ESI) spectrum of this product is presented in Fig 2. Wild-type (WT) yARF1 (expected mass 21463.4 Da) reveals 3 major peaks at 21407 Da, 21436 Da, and 21464 Da. The ~28 Da mass difference corresponds to that between arginine and lysine. As WT yARF1 contains 9 arginine codons (AGA) that are rare in E. coli, a ladder pattern with separations of 28 Da is expected whenever lysine-for-arginine substitutions occur. Interestingly, fitting the 3 peaks in the WT yARF1 spectrum with a binomial probability mass function failed to reproduce the experimental peak height ratios. Using a uniform 4.8% substitution probability, which matches the intensities of the zero- and one-substitution peaks, the ratio of the one-substitution to the two-substitution peak is predicted to be 4.6. The observed ratio is 2.5, suggesting that the 9 arginine rare codon sites might have different levels of mis-incorporation. The 28 Da ladder pattern is qualitatively reproducible, but levels of substitution do seem to depend on growth conditions, with media in which amino acids are directly supplied showing lower levels of substitution. This phenomenon is worthy of further investigation. Also, because substituted and non-substituted proteins could have different ionization efficiencies in the above mass spectrometry analysis, the apparent sequence-specific substitution in samples produced in minimal medium would benefit from a more quantitative analysis.
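The ladder spacing and the uniform-probability comparison described above can be checked with a short calculation. The sketch below is illustrative only and is not the authors' fitting procedure. It assumes independent substitution at each of the 9 sites with the same probability, and it uses standard average residue masses for arginine (156.1875 Da) and lysine (128.1741 Da), which are not stated in the text. The function names are hypothetical.

```python
from math import comb

# Values taken from the text.
N_RARE_SITES = 9           # rare AGA arginine codons in WT yARF1
EXPECTED_MASS_DA = 21463.4  # expected mass of unsubstituted WT yARF1
UNIFORM_P = 0.048           # uniform substitution probability used in the fit

# Assumption: standard average residue masses (Da), not given in the text.
ARG_RESIDUE_DA = 156.1875
LYS_RESIDUE_DA = 128.1741
DELTA_ARG_TO_LYS_DA = ARG_RESIDUE_DA - LYS_RESIDUE_DA  # about 28.01 Da


def ladder_masses(unsubstituted_mass, max_substitutions):
    """Masses expected for 0..max_substitutions Arg-to-Lys replacements."""
    return [unsubstituted_mass - k * DELTA_ARG_TO_LYS_DA
            for k in range(max_substitutions + 1)]


def binomial_pmf(k, n, p):
    """Probability of exactly k substitutions among n independent sites."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def one_to_two_ratio(n, p):
    """Predicted intensity ratio of the one- to the two-substitution peak."""
    return binomial_pmf(1, n, p) / binomial_pmf(2, n, p)


if __name__ == "__main__":
    for k, mass in enumerate(ladder_masses(EXPECTED_MASS_DA, 2)):
        print(f"{k} substitution(s): {mass:.1f} Da")
    ratio = one_to_two_ratio(N_RARE_SITES, UNIFORM_P)
    print(f"predicted one/two peak ratio: {ratio:.2f}")
```

Under these assumptions the predicted ratio reduces to (1 - p)/(4p) for 9 sites, which is close to 5 at p = 0.048. The value of 4.6 quoted above presumably reflects details of the fit that are not reproduced here. Either figure is well above the observed ratio of 2.5. The computed ladder masses fall within about 1 Da of the three reported peaks.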
The data presented show significantly enhanced lysine substitution at three of the nine arginine rare codon sites, R72, R96, and R103. There seems to be no trivial explanation for experimental variation due to properties of the substituted proteins or detectability of derived peptides. Therefore, the results must reflect, in some way, mechanistic aspects of the translation process. Following kinetic models such as those introduced by Hatzimanikatis [19, 20] or Lipowsky, and the specific kinetic factors suggested by work of the laboratories of Rodnina, Green and Ehrenberg, we can suggest several ways that levels of mis-incorporation at different sites could vary. The variation could be the result of a direct effect on binding free energies or kinetic activation energies by ribosomal interactions with adjacent parts of the mRNA or nascent polypeptide, or it could be the result of sequence-specific slowing of one of the steps at, or following, GTP hydrolysis, which would minimize the effect of this secondary selection step. Among possible sources for sequence-specific slowing are: peptide bond formation by the preceding amino acid, limitations on the transit of the nascent peptide through the ribosomal tunnel, the slow progression of polysomal synthesis where a ribosome at a more C-terminal site stalls movement of ribosomes at more N-terminal sites, or the need to unravel mRNA secondary structure toward the 3’ end to allow progression of a ribosome down the sequence. The local concentration of the appropriate tRNAs could also be depleted in regions where a particular rare codon is clustered, allowing inappropriate tRNAs to compete more effectively for an initial binding site. We can eliminate some of these possibilities based on the experiments conducted and examination of the yARF1 sequence.