Computational approaches for parameterization of aminoacyl-tRNA synthetase substrates

A. Rayevsky, M. Tukalo © 2018 A. Rayevsky et al.; Published by the Institute of Molecular Biology and Genetics, NAS of Ukraine on behalf of Biopolymers and Cell. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited UDC 577


Introduction
This paper is focused on detailing some methods related to the understanding of the editing mechanism provided by prolyl-tRNA synthetase from E.faecalis and explanation of the preparation of aminoacyl molecule. In funda-mental aspect the problems are associated with the realization of genetic information. In applied research on the antibiotics and drugs development, aminoacyl-tRNA synthetases (aaRSs) and tRNA are actually attractive targets. It is important to study RNA derivatives, where it is necessary to solve the problem of Bioinformatics ISSN 1993-6842 (on- wrongly included minor bases of nucleic acids, or at the stage of protein biosynthesis, when erroneously incorporated amino acids, like alanine, cause incorrect protein folding [1]. It is a very tedious work preparation of the modified residue, especially if a molecule of interest comprises different chemical classes like aminoacyl, which is represented by ester of nucleic acid base and amino acid residue. However, the fundamental research referring to a certain biological question has given rise to another problem, which forced us to figure out how to prepare a non-canonical residue for in silico modelling. There are many different methods to carry out a force field parameterization for molecular dynamics (MD) simulations, especially for single small molecules, by means of in-built software like Ambertools, developed for the library modification, or online servers like ProDrug [2] or ParamChem [3]. It is also known how to deal with generation of post translational modifications like acetylation or methylation of protein residues, the parameters of which can be offered directly by force field developers. It is evident that the problem of force field content and its diversification is very important for modern in silico studies.
However, some difficulty which many researches are faced to is parameterization of hybrid molecules, implemented into the chain and possessing non-integer net charge, like aminoacyl-tRNA 3'-terminal residues. As we used an Amber99 force field [4] implemented in Gromacs package [5] we should abide by the rules of its charge distribution and atom typing. Due to the capping of terminal nucleotides, chain-terminal nucleotide residues normally have non-integral charges, but the 5' and 3' charges sum to an integer. The reason for such force field parameterization is a high charge density on the phosphate group. Thus, to obtain a topology, which wouldn't stand out of the amber99 parameterization method, a non-integer charge of the 3' end should be preserved taking into account, that amino acid block should be protonated in accordance to the N-terminal amino acid type.
The electrostatic potential (ESP) is an important characteristic of the molecule of interest and a powerful tool for generation of realistic charge model [6]. Its more advanced algorithm of charge assignment, called RESP (Restrained Electrostatic Potential), allows deriving partial charges, based on the atomcentered point charge model common for all members of amber force fields [7].
In the previous paper, we have combined the experimental biology approach with the ab-initio quantum methods and molecular dynamics simulations to understand the putative mechanism editing aminoacyl molecule formed with the incorrect amino acid and 3'-terminal adenosine of tRNA, whereas this work is aimed to reveal some details of the modeling process. Some features of the substrate of the editing domain (INS domain) of prolyl-tRNA synthetase allowed the performing of a deep study of the parameterization process, the development of a validated enough modeling method and further applying it to other systems.

Materials and Methods
While calculating the aminoacyl-tRNA fragments' charges we started from the docking coordinates of alanyl-tRNA bound to the prolyl-tRNA synthetase (PDBID:2J3M) [8]. The protein structure was prepared and reconstruct-ed on the basis of the deposited X-ray (2J3M) to rebuild a full-length structure without gaps, using Swiss-model server [9]. The procedure of ligand-protein and nucleic acid-protein docking protocol has been already highlighted in the previous study [10]. A significant attention was paid to the torsion angle "CT-OS-C" of the aminoacyl-tRNA molecule, which is not a part of the AMBER99 force field but plays an important role in an excessive flexibility of the molecule during MD.
One of the most distinguishing features of the alanine-proline aminoacyls, the effect of which will be described later, is a small size of amino acid moiety and a low flexibility of the radical group. At the step of parameterization it also plays an important role. Following the mechanism of charge assignment and estimation of the best one, we prepared two sets of molecules based on terminal adenine block with either methyl, dimethyl, glycine or acetate groups bound to O3' of ribose (modified RA or RAM), and an amino acid block with either OH, OMe, or etheric bonded methylethyl, bound to the carboxyl oxygen of amino acid (or +NH3CHXCO-Z), where X defines an amino acid's radical. M and Z are the formal substituted groups to be explicitly removed from the molecule together with a specific non-integer charge assigned to the group after application of a charge constraint (inter-molecular charge constraints and intra-molecular charge constraints) with the R.E.D.III algorithm.
All blocks used here were optimized using the common for amber force fields HF/6-31G** theory level [11] and Gamess package [version 7.1.5]. A Rigid-Body Reorientation Algorithm (RBRA) implemented in R.E.D. was applied for each block to provide the re-producibility of the atomic charge values before the charge fitting step. The molecular fragments were constructed by setting INTERand INTRA-MCC to determine a more realistic charge distribution.
All MD simulations were run using the Gromacs (ver 4.0.5) program and the amber99 force field [12]. To ensure that tRNA conformation is stable a long MD simulations of 50 ns for ProRS (from E.faecalis) complex with a correspondent tRNA molecule were carried out. A positional constraint was also applied to the group of 3'-CCA atoms and the binding site residues for the first 20 ns to avoid unreliable bends of the tRNA stem. Thus, the complex consisting of the protein and tRNA Pro was adjusted by a harmonic function using a force constant of 500 kcal/mol per A 2 . In order to provide a smooth switch from steered MD to a free MD, the force constant was reduced to 250, 125, 50, 10 and 5 kcal/mol per A 2 in six MD simulations each of 250ps. After that, a free simulation of 5 ns MD was run at 310 K, using PME (Particle-mesh Ewald) method for calculation of long-range interactions. Thermostat v-rescale for temperature coupling and Parrinello-Rahman coupling algorithm were used to ensure a harmonization between temperature and pressure. The Coulomb cutoff radius of 1.1 nm for electrostatic and the Lennard-Jones interactions cutoff radius of 1.1 nm were applied.

Results and Discussion
The aminoacylation site of any aaRS is responsible for the phosphoester bond formation between phosphate from AMP and carbonyl from amino acid. The next step, namely transfer of amino acid moiety to the 3'-CCA end of tRNA molecule, is called aminoacylation and takes place in the same place. This process results in a bond rearrangement and connection of 2'OH group on ribose of 3'-terminal adenine base to a carbonyl carbon of amino acid molecule. In ProRS a more precise editing domain (for example, INS domain of ProRS) exists and can hydrolyze the incorrectly aminoacylated (with alanine residue, instead of proline) tRNA to avoid the inclusion of non-cognate amino acid into the proteins.
To compute missing atomic charges for non-parameterized compound, we used a method of common for Amber force field. This procedure required an initial HF/6-31G** quantum-chemical computation of our hybrid molecule. The MEP was derived from the calculations at the HF/6-31G** level with the Gamess package [14]. To avoid the issue, when unphysical high charges are generating on atoms [15], and to demonstrate the importance of correct charge mapping we created and validated another protocol for such non-standard residues using AnteRed from R.E.D.III server. To solve the problem of the energy minimum conformation the Rigid Body Reorientation Algorithm, which is implemented in R.E.D. III server options, was applied to the structure of interest just before the MEP calculation and getting reproducible RESP charges. This step was very important for the subsequent molecular dynamics simulation.

Parameterizing a novel residue
All charges for the aminoacyl-tRNA fragment were computed using the RESP method from a single docking output conformation after implementation of RBRA. To estimate the quality of the predicted topology a relative root-mean-square (RRMS) protocol, implemented into the R.E.D.III server was applied to compare the ESP derived from the molecule in the presence and absence of charge constraints (INTER-MCC and INTRA-MCC) before the molecular dynamics run [13].
To meet the amber99 force field requirements we used the approach of charge distribution assignment, when the terminal RA3 fragment is replaced with a RAM and the amino acid (alanine) is represented as a terminal link in the chain. At the same time RAM is a single internal adenosine with a total net charge of '-1', the charge distribution of which is also undergoing some transformation. During the charge fitting procedure RAM is represented as an M-substituted RA3 molecule, where M is either methyl, dimethyl, glycine or acetate group bound through the O3' position of the ribose. Such construction together with application of INTRA-MCC (intra-molecular charge constraint inside a single molecule) provides another charge redistribution compared to a standard state of RNA's adenosine, which suits our model with aminoacid bound to tRNA 3'CCA termini. Finally, RAM residue passed application of INTER-MCC (intermolecular charge constraint between different molecules) charge assignment of '-0.6919' to the M-group, that tends to perpetuate the charge of the output RAM residue, after explicit removal of the M-group, which is still equal to '-1', but supposes some rearrangements of those partial charges belonging to O3' and [the] nearest atoms of the source RA3 residue. The procedure resulted in obtaining the almost unified charge model for the 3'-terminal nucleotide with significant changes regarding only the atoms involved in the amino-acyl ester formation, especially C2'-O3', whereas the general model of charge distribution and the total charge of the adenine base were not strongly affected. With regard to alanine (+NH3CHXCO-Z) parameterization it was executed just in the same way with the general exception that INTRA-MCC charge value, applied to Z group, was '-0.6919' and the output charge for the residue became '+1.3081'. There is an example of the full net  Table 1. The table represents original and changed partial charges for RNA and DNA types of adenosine and N-Alanine, which were used for generation of alanyl-aminoacyl tRNA molecules with or without 2'OH group, Red font marks those initial charges, affected during charge fitting, and blue font corresponds to the output partial charges after the procedure of INTRA-and INTER-MCC. Black frame marks the atom which is missing in the output topology.
charge scheme for a short tRNA with alanine added to the last adenosine: RA5 + RX + RX + RAM + '+NH3CHXCO' = -0.3081 + (-1) + (-1) + (-1) + 1.3081 = -2 The subsequent relative RMS error estimation procedures (RRMS) on the server gave reasonable values of 0.046-0.068, while QM and derived charges were compared, and high-er values of 0.08-0.95 were obtained, when the results were compared against unconstrained (MCC) RESP calculation. This is a standard protocol to verify reliability of the output structures -the lower value, the best result. However, the most accurate and illustrative estimation of the model is [an] implementation of the newborn topology in the MD simulation. Graphics and pictures represent an interaction energies (Electrostatic -A and Van der Waals -C) and substrate geometries (RESP prepared substrate in B and simple concatenation of parameters from amino acid and nucleic topology databases in D), which confirm[s] a better compliance of RESP model with experimental data, described in the paper [10]. It is evident that a decrease of electrostatic interaction energy value during the MD is caused by an incorrect turn of alanine radical and loss of its hydrogen bonds and disappearance of water molecules (sterical repulsion), which are responsible for nucleophilic attack and hydrolysis.
There was no significant difference between single-point energy calculation of multi-conformational (or orientations offered with RRBA), for amino acid part of the aminoacyl. Compact and non-polar substituents, like 1-methylethyl ether and OMe, gave the smallest error for amino acid-based charge fitting. In turn, acetate and glycine substituted nucleo tides showed the lowest RRMS values and best results in MD.

Application of the method in a fundamental study
To study the molecular mechanism of the editing reaction, the structure of ProRS from E.faecalis in complex with tRNA Pro was remodeled and validated in appropriate way [10]. We studied a prolyl-tRNA synthetase (II class) system, which possess[es] an editing activity against misactivated alanyl-tRNA via INS domain. Along with the RESP charge generation procedure we used another widely employed approach, which violates many concepts of the force field structure, namely simple splicing of topologies of nucleic and amino acid residues, without any fittings and corrections. In both cases a simulation run was started from the same initial state, obtained from the docked ligand and preliminary optimized structure.
From the Figure it becomes evident that RESP charge gives much more accurate and reliable results, which correlate with the experimental data [10]. The molecular dynamics of ProRS system also aimed at investigation of the 2'OH group impact and showed the conformity between biochemical study and computational calculations (not shown here). Prior to the QM calculation of deoxynucleotide, a range of MD simulations predicted the behaviour of both natural alanyl-tRNA with 2'OH group and the modified with 2'H atom (generated from DA3 residue, which misses a whole 2'OH). The correct parameterization also allowed the explanation of the influence of K279 residue on the substrate specificity, as Lys279 interacts with the O2 atom of C75 and the 2′OH group of C74, fixing the CCA end of the tRNA. This model was more realistic than those of Kumar et al. [16], which suggested that CCAterminal possessed an extended conformation and stacking interaction between A76 and C75. Our model was used for the QM calculations of all predictable reaction mechanisms. The approach used allowed us to perform the direct identification of the transition state of the reaction pathway and to support a new mechanism of hydrolysis of misacylated Ala-tRNA Pro [10].

Conclusions
In general the method of charge fitting is accurate and applicable, however it strongly depends on some starting point for equilibration. The greatest advantage is the fact, that topologies with RESP parameters are ready for straight implementation in a force field without any other refinements. However, the best result can be obtained after comparison of computational and experimental approaches, like it was done with prolyl-tRNA and leucyl-tRNA systems [10,17,18], to correct some inconsistency and set up the system.
The method for preparation of aminoacyl molecule proposed here can be useful in studying the properties of aminoacyl-tRNA at different stages of the translation process. Moreover, it is extremely important in cases where non-proteinogenic amino acids are being studied. The examples of non-proteinogenic include D-amino acids and other precursors and products of the genetically encoded protein