Homology modeling of the structure of the NH2-terminal module of mammalian (Bos taurus) tyrosyl-tRNA synthetase

Mammalian tyrosyl-tRNA synthetase (TyrRS) is composed of two structural modules: the NH2-terminal catalytic core and the cytokine-like COOH-terminal module. To elucidate the structural bases of the N-module functions, we predicted its three-dimensional (3D) structure computationally by the comparative modeling approach. The model of the bovine TyrRS N-module comprises the Rossmann nucleotide-binding fold (RF) linked to the α-helical domain (αHD). The RF domain forms a single β-sheet containing 5 parallel and one attached antiparallel β-strands surrounded by α-helices. The connective polypeptide CP1, inserted between the β3- and β4-strands of the RF domain, protrudes from the domain core. Comparative analysis of the multiple sequence alignment of known TyrRSs and of the obtained model structure reveals conserved surface elements that could potentially form the tRNA-binding surface. This putative surface includes some exposed amino acid residues of CP1, e.g. the essential Lys146 and Lys147 residues, which were identified earlier by site-directed mutagenesis.

As an approach to elucidate the structural bases of the N-module functions, we have used computational prediction of its three-dimensional (3D) structure by the comparative modeling approach. The 3D structures of three eubacterial TyrRSs and one tryptophanyl-tRNA synthetase (TrpRS), determined experimentally [9][10][11][12], were used as structural templates for the homology modeling procedure.

Materials and Methods. The sequence of BtTyrRS was deposited in the GenBank/GenPept and Swiss-Prot databases (accession numbers AAC82467, Q29465). The search for homologous sequences was performed with the iterative PSI-BLAST service (this and other programs and servers are described in our review [13]). TyrRSs of 15 eukaryotic organisms (Bos taurus, Homo sapiens, Mus musculus, Fugu rubripes, Drosophila melanogaster, Anopheles gambiae, Caenorhabditis elegans, Pneumocystis carinii, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Encephalitozoon cuniculi, Candida albicans, Plasmodium falciparum, Arabidopsis thaliana and Nicotiana tabacum) and 17 archaebacteria were analyzed. Multiple sequence alignment was carried out with Clustal W, and secondary structure elements were predicted with PHD and the NPS@ multiprediction server. The coordinate files were obtained from the PDB for the crystallographic structures of three eubacterial TyrRSs: Thermus thermophilus (PDB ID codes 1H3E, 1H3F); Bacillus (Geobacillus) stearothermophilus (2TS1, 3TS1, 4TS1); Staphylococcus aureus (1JII, 1JIJ, 1JIK, 1JIL); and of tryptophanyl-tRNA synthetase from B. stearothermophilus (1D2R, 1I6K, 1I6L, 1I6M). The threading servers Bioinbgu and FUGUE [14] were used to search for similar structures. The complete N-module model was built using the Modeller 6.2 program suite [15]. In total, 20 initial models were built with Modeller and optimized using the «refine-5» protocol with default parameters for the optimization procedures. Energy calculation for the models was performed using the Energy option of Modeller.
Secondary structure assignment was done using the DSSP and STRIDE algorithms. The search for structurally related domains and their superimposition (in both pairwise and multiple structure alignment modes) were carried out using CE and VAST. The SCOP server (release 1.61) was used to find other similar structures from related families and superfamilies. The quality of the optimized models was verified using the Eval123D, ANOLEA and ERRAT servers, and the best model was chosen for subsequent optimization. Model optimization by molecular dynamics (MD) simulation was carried out with the Gromacs 3.1.4 program [16].
The model structures were placed into a periodic rectangular box filled with a layer of SPC water. The minimal distance between the protein and the boundaries of the box was 0.7 nm. Energy minimization was performed on the proteins using a steepest-descent algorithm with a 2 fs integration time step and a tolerance of 0.01 kJ·mol⁻¹·nm⁻¹ during 10 ps. Position-restrained MD was done to distribute the water molecules, and the actual MD was simulated for 1 ns. The temperature was controlled by coupling to an external bath of 300 K with a coupling time constant of 10 fs. The GROMACS force field was used in this work.
Solvent accessible surface areas of the proteins were calculated using the GetArea 1.1 server. The search for and analysis of pockets and cavities was performed using CASTp. The distribution of conserved surface residues was analyzed using ConSurf 2.0. Structure visualization and analysis were performed with DeepView (Swiss-PDB Viewer) 3.7b2 and Protein Explorer 1.299.
Results and Discussion. The iterative PSI-BLAST search with the M1-P344 segment of BtTyrRS found 32 homologous sequences of the N-modules from other eukaryotic and archaebacterial TyrRSs, and about 30 sequences of TyrRSs and TrpRSs homologous to its αHD domain. The N-module sequence can be divided into five parts according to their distinct substructures: the N-subdomain (residues M1-E33); the first half of the Rossmann fold (R34-K119); the CP1 insertion (G120-L177); the second half of the RF with the linked «KMSSS» catalytic loop (K178-L235); and the αHD domain (D236-P344). The highest local homology between BtTyrRS and TtTyrRS is in the N-subdomain and within the L-tyrosine-binding H3 α-helix and the «KMSSS» loop.
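The five-part division of the N-module given above can be written down as a small lookup table. The following Python sketch is only an illustration: the residue boundaries are those listed in the text, while the names `SEGMENTS`, `segment_of` and `segment_lengths` are hypothetical helpers introduced here and are not part of any program used in this work.

```python
# Segment boundaries of the N-module as listed in the text
# (1-based residue numbers, both ends inclusive).
SEGMENTS = [
    ("N-subdomain", 1, 33),
    ("Rossmann fold, first half", 34, 119),
    ("CP1 insertion", 120, 177),
    ("Rossmann fold, second half with KMSSS loop", 178, 235),
    ("alpha-helical domain", 236, 344),
]


def segment_of(residue_number):
    """Return the name of the segment that contains the given residue."""
    for name, start, end in SEGMENTS:
        if start <= residue_number <= end:
            return name
    raise ValueError(f"residue {residue_number} is outside the M1-P344 region")


def segment_lengths():
    """Return a dict mapping each segment name to its length in residues."""
    return {name: end - start + 1 for name, start, end in SEGMENTS}


if __name__ == "__main__":
    # K146 and K147 fall inside the CP1 insertion.
    print(segment_of(146))
    # The five segments together cover the whole M1-P344 region.
    print(sum(segment_lengths().values()))
```

Such a table makes it easy to check that the segments are contiguous and together span all 344 residues of the M1-P344 region.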
We have modeled the BtTyrRS N-module, containing 340 aa residues (L5-P344) with a molecular mass of 38.5 kDa. The 3D structure of the most similar TyrRS, that of T. thermophilus (PDB code 1H3E; 69 identical residues for the M1-P344 region, 22.5 % identity), determined at 2.9 Å resolution, was used as the best template structure. The target/template pairwise sequence alignment was extracted from the multiple sequence alignment and corrected manually to optimize the positions of insertions. The localization of insertions/deletions was also deduced from inspection of the template secondary structure elements. As the most significant errors in models are often due to misalignment of the target and template sequences, we performed careful manual editing of the alignment, as well as iterative realignment and model building. There are eleven indels in the BtTyrRS N-module in comparison with the template, and their exact locations may vary slightly. There are only four short insertions within the RF domain of BtTyrRS (¹⁰⁶ESIG, Y129, ¹⁴⁸AG and E227) and seven short insertions in the template TtTyrRS structure.
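The percent identity of a target/template pair depends on which positions are counted in the denominator, and the text does not state which convention was used. The sketch below, written in Python purely for illustration, computes the identity of two aligned sequences under three common conventions; the function name `percent_identity`, its `denominator` parameter and the toy sequences are assumptions made here, not the procedure used in this work.

```python
def percent_identity(target, template, denominator="aligned"):
    """Percent identity of two aligned sequences of equal length ('-' is a gap).

    denominator:
      "aligned" - columns in which neither sequence has a gap
      "target"  - number of residues in the target sequence
      "columns" - total number of alignment columns
    """
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(target, template))
    # A column is identical if both residues match and it is not a gap column.
    identical = sum(1 for a, b in pairs if a == b and a != "-")
    if denominator == "aligned":
        total = sum(1 for a, b in pairs if a != "-" and b != "-")
    elif denominator == "target":
        total = sum(1 for a in target if a != "-")
    elif denominator == "columns":
        total = len(pairs)
    else:
        raise ValueError(f"unknown denominator: {denominator}")
    if total == 0:
        raise ValueError("no positions to compare")
    return 100.0 * identical / total


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences, for illustration only).
    tgt = "MKT-AYL"
    tpl = "MRTGA-L"
    for mode in ("aligned", "target", "columns"):
        print(mode, round(percent_identity(tgt, tpl, mode), 1))
```

For the toy alignment the same four identical columns give different percentages depending on the denominator, which is why the convention should be stated alongside any reported identity value.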
Unlike the RF domain, the weak sequence homology between the αHD domains of eukaryotic and prokaryotic TyrRSs reveals their significant divergence after the separation of eukaryotes and archaebacteria from eubacteria. Different threading methods allowed us to select the B. stearothermophilus TrpRS structure (PDB code 1D2R, chain A) as an alternative template for the BtTyrRS αHD modeling because of its higher structural similarity. For example, the sequence-structure search with the Bioinbgu server predicts BsTrpRS to be a better template than the eubacterial TyrRS (Z-scores are 39.0 and 4.3, respectively). That sequence-structure alignment gives only two insertions (²⁵⁸NGVLAFIRHVL and ³⁰²EV) within the bovine sequence compared to the BsTrpRS αHD domain. We have used the BsTrpRS αHD domain (K192-D297 region) as an alternative structural template for the αHD structure modeling. The initial models of the αHD domain were built with the Modeller program and connected with the RF domain into the complete two-domain N-module structure by overlapping their common KMSSS-loop segments. The best of the 20 initial models were optimized in a water environment using restrained molecular dynamics simulation techniques with the Gromacs 3.1.4 program.
To test the validity of the initial and optimized models, a combination of evaluation criteria was used to discriminate between correct and incorrect models. The criteria used for model verification were interatomic clashes, stereochemical properties (bond lengths, angles and dihedral angles, peptide bond planarity, Cα tetrahedral distortion, etc.), the position of residues in the Ramachandran plot, and the root-mean-square deviation (RMSD) for Cα-Cα atoms of model/template structure pairs. Using several evaluation servers we analyzed the following structures: 1) the templates (1H3E and 1D2R); 2) the initial and refined models of the N-module obtained from Modeller; and 3) the optimized best models obtained by molecular dynamics simulation using the Gromacs program. The ANOLEA server performs energy calculations on a protein chain, evaluating the non-local environment of each heavy atom in the molecule. The energy of each pairwise interaction in this non-local environment is taken from a distance-dependent knowledge-based mean force potential. The ERRAT server analyzes the statistics of non-bonded interactions between different atom types, and a single output plot gives the value of the error function vs. the position of a 9-residue sliding window. The error values give confidence limits that are useful in making decisions about model reliability. Regions of candidate protein structures that are mistraced or misregistered can then be identified by analysis of the pattern of non-bonded interactions in each window.
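The idea of scanning a structure with a 9-residue sliding window can be illustrated with a short Python sketch. It does not reproduce the ERRAT error function itself; it only averages a per-residue score over each window and flags windows above a cutoff. The per-residue scores, the cutoff and the function names `sliding_window_means` and `flag_windows` are hypothetical and introduced here for illustration.

```python
def sliding_window_means(scores, window=9):
    """Mean of a per-residue score over every window of the given length."""
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the number of residues")
    running = sum(scores[:window])
    means = [running / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        running += scores[i] - scores[i - window]
        means.append(running / window)
    return means


def flag_windows(scores, threshold, window=9):
    """Return 0-based start indices of windows whose mean exceeds the threshold."""
    means = sliding_window_means(scores, window)
    return [i for i, mean in enumerate(means) if mean > threshold]


if __name__ == "__main__":
    # Hypothetical per-residue scores: a well-behaved stretch followed by a poor one.
    scores = [0] * 9 + [9] * 9
    print(sliding_window_means(scores))
    print(flag_windows(scores, threshold=4.5))
```

Windows that overlap the poorly scoring stretch by more than half are flagged, which mirrors how a window-based plot localizes a suspicious region rather than a single residue.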
The best model obtained after MD simulation represents a two-domain protein with a solvent accessible surface area of 16.3 Å² and a volume of 41.7 Å³ (Figure). The RMSD between the model and template structures is 1.65 Å for 756 backbone atoms.
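For reference, the RMSD over N equivalent atoms is the square root of the mean squared distance between paired atoms, RMSD = sqrt((1/N) Σ |a_i - b_i|²). The Python sketch below computes this quantity for two coordinate sets that are assumed to be already superimposed and listed in the same atom order; the superposition step itself is not shown, and the function name `rmsd` and the toy coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and that atom i in
    coords_a is equivalent to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in Å (hypothetical): every atom displaced by (3, 4, 0).
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
    template = [(x + 3.0, y + 4.0, z) for x, y, z in model]
    print(rmsd(model, template))
```

Because no optimal superposition is performed here, the value is meaningful only for coordinates that have already been fitted onto each other by a structure alignment program.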
A number of analyses were carried out to predict the elements responsible for the tRNA-binding ability of the N-module. The RF and αHD domains are arranged in the N-module so as to form a common tRNA-binding surface, which is located on the same side in all dimeric «minimal» TyrRSs and TrpRSs. The comparative analysis of the multiple sequence alignment of known TyrRSs and of the obtained model structure reveals conserved surface elements that could potentially form the tRNA(Tyr)-binding surface. This putative tRNA-contacting region includes CP1 (residues T121-L177) inserted into the RF domain. We have analyzed the solvent-exposed amino acid residues. The most exposed residues in the CP1 of the BtTyrRS N-module are: D122, L125, K127, E128, L131, Y134, R135, S137, S138, T146, Q142, H143, K146, K147, K154, Q155 and V156.
The secondary structure elements were defined with the DSSP and STRIDE programs. The RF domain of the BtTyrRS N-module (residues M1-L235) forms a single β-sheet containing 5 parallel and one attached antiparallel β-strands arranged as (β0)-β5-β4-β1-β2-β3 and surrounded by α-helices. The β-sheet adopts additional structural elements. The RF domain includes the additional N-terminal segment of the first 33 residues containing the α-β0-β element, which is characteristic of all known TyrRSs but is not homologous to TrpRSs and other class I synthetases. The characteristic CP1 insertion (residues G120-L177) is located between the β3- and β4-strands and protrudes from the core of the RF domain.

Figure. Accessible surface view of the model structure of the N-module of bovine tyrosyl-tRNA synthetase. The N-module is shown from the side of its active site cavity, where the catalytic KMSSS loop corresponds to the conserved dark-colored amino acid residues. The two domains of the N-module are the Rossmann fold (left) and the α-helical domain (right). The colors of the surface residues correspond to their conservation as obtained from the multiple sequence alignment analysis.
A detailed understanding of protein function requires the identification of conserved amino acid residues at the protein surface that may be responsible for that function. The ConSurf server was used to identify such important regions, based on the phylogenetic relations between homologous proteins derived from their multiple sequence alignment. The conservation of the exposed amino acid residues of the eukaryotic-type N-module is shown in the Figure as the color-coded surface of the BtTyrRS N-module. The majority of the exposed (more than 50 %) and strongly conserved residues are localized in and around the described surface region.
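The selection of residues that are both exposed and strongly conserved can be expressed as a simple filter. The Python sketch below is a minimal illustration under stated assumptions: the 50 % exposure cutoff follows the text, whereas the conservation scale (integer grades where higher means more conserved), the grade cutoff of 8, the function name `exposed_conserved` and all values in the example are hypothetical and are not taken from the model.

```python
def exposed_conserved(residues, min_exposure=50.0, min_grade=8):
    """Select residues that are both solvent-exposed and strongly conserved.

    residues: iterable of (label, relative_accessibility_percent, grade),
    where grade is an integer conservation grade (higher = more conserved;
    the scale and the default cutoff are assumptions of this sketch).
    """
    return [
        label
        for label, exposure, grade in residues
        if exposure > min_exposure and grade >= min_grade
    ]


if __name__ == "__main__":
    # Hypothetical residues and values, for illustration only.
    example = [
        ("X1", 72.0, 9),  # exposed and conserved
        ("X2", 65.0, 8),  # exposed and conserved
        ("X3", 55.0, 4),  # exposed but variable
        ("X4", 20.0, 9),  # conserved but buried
    ]
    print(exposed_conserved(example))
```

Mapping the selected residues back onto the model surface then shows whether they cluster into a contiguous patch, as described above for the putative tRNA-binding region.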
The functional role of the lysine residues K146 and K147 located in the CP1 of BtTyrRS has been previously studied by site-directed mutagenesis [17]. The replacement of both residues with Asn and Tyr, respectively, as well as the substitution of K147 alone, caused the inactivation of the mutant TyrRS in the tRNA(Tyr) aminoacylation reaction. In our model both Lys residues are exposed and may form contacts with tRNA(Tyr). After this work had been submitted, the article by Yang et al. (2002) [18] was published, in which the crystallographic structure of a human «minimal» TyrRS was described at 1.18 Å resolution.