Partial sequencing and phylogenetic analysis of Soybean mosaic virus isolated in Ukraine

The aim of the present study is to compare the biological and molecular properties of Ukrainian soybean mosaic virus (SMV) isolates with those of known strains or isolates from other countries, and to trace their possible origin. The methods of mechanical inoculation, reverse transcription polymerase chain reaction, DNA sequencing and phylogenetic analysis have been used. Results. Five SMV isolates have been collected and biologically purified from breeding plots in Vinnitsa region of Ukraine. It has been found that all these isolates show the same reaction patterns when infecting 11 differential soybean cultivars. Phylogenetic analysis of sequences of the coat protein coding region and P1 coding region revealed strong genetic relationships between representative Ukrainian (UA1Gr) and SMV-VA2 isolates which together were sorted in one clade with G2 strain. The investigation of sequence identity showed that different genomic regions of SMV were under different evolutionary constraints. Conclusions. SMV, isolated in Ukraine for the first time, belongs to the G2 strain group that is widespread in North America. The SMV isolates obtained in this work may be employed in the Ukrainian national breeding programs to create soybean with durable virus resistance.

Introduction. Soybean mosaic virus (SMV) is a member of the genus Potyvirus in the family Potyviridae, which is the largest group of plant viruses [1]. It is the most common and prevalent viral pathogen of soybean (Glycine max (L.) Merr.) worldwide. Symptoms induced by SMV include severe mosaic, mottling, rugosity and necrosis on the leaves of many soybean varieties [2]. Transmitted through seed (1-68 %) or spread by aphid vectors in a non-persistent manner, the different SMV strains cause 10-50 % yield losses and seed quality deterioration in many soybean producing areas [2,3]. Furthermore, SMV may cause much more severe damage to soybean in mixed infections with other viral pathogens, as a result of their synergistic interaction [2,4].
SMV particles are flexuous filaments approximately 750 nm long and 15 to 18 nm in diameter. Like other potyviruses, SMV contains a monopartite, single-stranded, positive-sense RNA genome of about 9.6 kb. A VPg (viral protein genome-linked) is covalently bound to its 5' end, and there is a poly-A tail at the 3' end. The SMV genome encodes one large polyprotein, which is subsequently cleaved into at least 10 mature proteins by virus-encoded proteases [2,5,6]. Recently, an additional 25-kDa protein has been discovered in potyviruses. This protein is derived from a frameshift in the P3 cistron [7].
Numerous SMV isolates have been reported all over the world. In the United States, a number of SMV isolates were classified into seven strain groups (G1-G7) based on the symptoms developed in a set of soybean cultivars with different resistance (Table 1) [2,8]. This differential system is by far the most recognized and widely used. Similarly, five strains (A to E) were reported in Japan and 21 strains (SC1 to SC21) in China [9,10]. It is interesting that new SMV isolates capable of overcoming host resistance have been identified [11,12]. The different types of reaction of susceptible and resistant cultivars are the result of a specific interaction between the soybean R gene product and the virus avirulence (Avr) gene product [13,14]. Inheritance studies have shown that in most cases virus resistance in soybean is controlled by a single dominant gene. Three independent loci (Rsv1, Rsv3, and Rsv4) have been reported for SMV resistance [15,16].
Numerous studies have been undertaken to understand the mechanisms that drive the evolution and geographical distribution of plant viruses. The aim was to unravel phylogenetic relationships among virus isolates as they continue to evolve through genetic exchange (recombination between different viral RNA molecules) or accumulation of mutations [17,18]. As more and more SMV isolates are sequenced, their phylogenetic relationships and molecular variability can be studied. Construction of the first SMV phylogenetic trees for the full-length genome or for the sequences of its single genes (P1, HC-Pro and CP) allowed SMV strains and isolates to be divided into distinct phylogenetic groups and subgroups [2,11,19].
In Ukraine, diseases caused (presumably) by SMV were first reported in 1938. Later, in the early 1960s, SMV was identified in soybean fields in the eastern and southern regions [20]. Since then, in contrast to the intensive investigation of SMV strain diversity in many countries, no survey has been carried out in Ukraine, one of the biggest agrarian areas in Europe [21].
In the present study we have collected and biologically purified five SMV isolates from breeding plots in the Vinnitsa region of Ukraine. We have conducted pathogenicity tests and analyzed phylogenetic relationships based on sequences of the coat protein-coding and P1-coding regions, to compare Ukrainian isolates of SMV with previously known isolates or strains, and to trace their origin.

Materials and methods. SMV detection, biological purification, RNA extraction, reverse transcription polymerase chain reaction (RT-PCR). Within 2-3 weeks post inoculation, total RNA was extracted from virus-infected soybean leaves using the PureLink RNA Mini Kit («Invitrogen», USA). A pair of primers, forward (SMV-CPf; 5'-CAAGCAGCAAAGATGTAAATG-3') and reverse (SMV-CPr; 5'-GTCCATATCTAGGCATATACG-3'), was used to prime the amplification of a conserved region (a fragment of 469 bp) in the coding region of the SMV coat protein (CP) [22]. We have also modified and synthesized a pair of primers, forward (SMV-P1f; 5'-AGTCAAATGGCAACAATCATG-3') and reverse (SMV-P1r; 5'-GGGAGTAGTGCTGAATATCC-3'), for amplification of the P1 gene (a fragment of 934 bp) according to the conserved nucleotide sequences in the same region of different SMV strains (G2, G1, N, G4, G3, G7) from GenBank [11]. A one-step RT-PCR was carried out using both M-MuLV Reverse Transcriptase and Taq DNA polymerase («Fermentas», Lithuania). For RT-PCR amplification, 2 µl of total RNA was added to 48 µl of reaction mixture (10 µl of 10× PCR buffer, 3.5 µl of 25 mM MgCl2, 0.5 µl of 10 mM dNTP, 2 µl of each forward and reverse primer [10 pmoles/µl], 0.25 µl each of reverse transcriptase [20 u/µl] and Taq polymerase [5 u/µl], 34.5 µl of H2O). Thermal cycling conditions («Bio-Rad», iQ5 thermocycler, USA) were: 1 cycle of 42 °C for 45 min, 1 cycle of 94 °C for 2 min, 35 cycles of 94 °C for 30 s, 55 °C for 30 s, 72 °C for 1 min, and a final extension at 72 °C for 10 min. PCR products were resolved on a 1.5 % agarose gel. Sequences were determined directly from the PCR products using the dideoxynucleotide termination method and an ABI Prism 3730 XL DNA Analyzer («Applied Biosystems», USA). All PCR products were sequenced with the primers used to amplify the fragments. The nucleotide sequence data have been submitted to the GenBank database under the following accession numbers: JF431105 (CP region) and JF803911 (P1 region).
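The one-step RT-PCR thermal program reported in this section can be written down as a small data structure, which makes it easy to check the total programmed hold time. The following is only an illustrative sketch: the variable and function names are our own, and the calculation ignores ramping between temperatures, so the real run time on a thermocycler will be longer.

```python
# Thermal program as reported in the text: (step label, temperature in deg C, hold in seconds).
# All names here are illustrative; ramp times between steps are NOT included.
REVERSE_TRANSCRIPTION = ("reverse transcription", 42, 45 * 60)
INITIAL_DENATURATION = ("initial denaturation", 94, 2 * 60)
CYCLE_STEPS = [
    ("denaturation", 94, 30),
    ("annealing", 55, 30),
    ("extension", 72, 60),
]
N_CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 10 * 60)


def total_hold_seconds():
    """Sum the programmed hold times of the whole RT-PCR run."""
    single_steps = (REVERSE_TRANSCRIPTION, INITIAL_DENATURATION, FINAL_EXTENSION)
    fixed = sum(seconds for _, _, seconds in single_steps)
    one_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return fixed + N_CYCLES * one_cycle


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"Programmed hold time: {seconds} s ({seconds / 60:.0f} min)")
```

Under these assumptions the 35 amplification cycles account for more than half of the programmed time, with the 45 min reverse transcription step being the largest single hold.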
Phylogenetic analysis. The reference strains or isolates of SMV, the sequences of which were retrieved from the NCBI (National Center for Biotechnology Information, USA) database and used in our investigations, are listed in Table 2. Multiple sequence alignments were obtained using the ClustalW algorithm (http://www.ebi.ac.uk/clustalw/) [23]. Aligned P1 amino acid sequences were visualized and compared using the BioEdit sequence alignment editor [24]. Nucleotide and encoded amino acid sequences were edited and similarities were analyzed using the MEGA v.5 program [25]. The phylogenetic relationships of the SMV sequences were analyzed by the neighbor-joining (NJ) and maximum likelihood (ML) algorithms implemented in MEGA v.5, using WMV as the outgroup (GenBank accession code EU660580). In the NJ analysis, Kimura's two-parameter model and the p-distance model were applied for nucleotide and amino acid sequence analyses, respectively. For the ML method, Kimura's two-parameter model was used with default settings. To estimate the statistical significance of branching, bootstrap values were calculated using 1000 random replications.
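For readers unfamiliar with the two distance measures named above, the sketch below shows how a p-distance and a Kimura two-parameter (K2P) distance can be computed for one pair of aligned sequences. It is a minimal illustration written for this text, not the MEGA implementation: the function names and the toy sequences are our own assumptions, alignment columns containing gaps are simply skipped, and no rate-variation or bootstrap options are modelled.

```python
import math

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences (nt or aa).

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance for two aligned nucleotide sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions. Only A/C/G/T sites are used.
    math.log raises ValueError if the sequences are too divergent (saturated).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    sites = transitions = transversions = 0
    for a, b in zip(s1, s2):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if frozenset((a, b)) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy sequences, not taken from any SMV isolate.
    print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
    print(k2p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

The K2P value is always at least as large as the raw proportion of differences, because the model corrects for multiple substitutions at the same site and treats transitions and transversions separately.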
Pathogenicity test of SMV isolates. To determine the biological properties of the SMV isolates, several differential cultivars, including Essex, «Tousan 50», Ogden, Raiden, Marshall, York, PI 96983, Harosoy, V94-5152, PI 264555, and «Suweon 97», were grown and inoculated with each isolate as described above. For each independent pathogenicity test, 10 plants of each cultivar were inoculated with each isolate. At the same time, 5 plants of each cultivar were inoculated only with buffer (mock inoculation) and used as negative controls. Symptom development was monitored for 5 weeks post inoculation and recorded as described by Chen et al. [2]. The seeds of the differential soybean cultivars were obtained from the USDA Soybean Germplasm Collection, Urbana, Illinois.
Results and discussion. To examine virus infection of soybean plants grown at the breeding plots of Vinnitsa National Agrarian University, field surveys were performed in 2008-2010. Based on the results of these observations we collected leaf samples from plants of 6 different soybean cultivars («Gribskaya 30», «Dachnyans'ka 1», «Kirovograds'ka 26», Poema, Williams, Syurpriz) showing the most severe viral symptoms, i.e. mosaic, mottling, rugosity and deformation. All collected samples were identified by DAS-ELISA as infected with SMV, except the one collected from cv. «Dachnyans'ka 1» (data not shown) [21]. Field isolates of SMV maintained in the susceptible cv. «Gribskaya 30» were inoculated onto Phaseolus vulgaris cv. Topcrop. By the 7th day, when necrotic veinal lesions had appeared on the primary leaves of Topcrop (Fig. 1, see inset), the lesions were cut from the leaf and repeatedly inoculated onto Glycine max cv. «Gribskaya 30». In this way five biologically purified isolates of SMV were recovered.
Five SMV isolates obtained after biological purification of field samples were used for the pathogenicity test.
For subsequent molecular genetic studies we decided to use only one SMV isolate (UA1Gr), because all investigated isolates showed identical pathogenic properties. The products of RT-PCR amplification of SMV RNA, isolated from infected plants of soybean cv. «Gribskaya 30», were of the expected size of 469 bp and 934 bp for the CP and P1 genome regions, respectively. To verify their viral origin, these RT-PCR products were directly sequenced. The resulting sequences were used as queries for BLASTX analysis against the NCBI database (http://blast.ncbi.nlm.nih.gov/). The results of the BLASTX search indicated that our sequences correspond to nucleotide positions 8625-9069 (central region of CP) and 129-1056 (P1 region) in the genome of SMV strain G2. Thus, the primer annealing sites of the UA1Gr isolate were shown to be identical to those of the reference SMV isolates from NCBI.
The nucleotide (nt) and amino acid (aa) sequences of the central part of the CP region and of the whole P1 region were aligned and analyzed with computer-based programs to compare the UA1Gr isolate with the known strains of SMV. These alignments showed that the CP nucleotide sequence of the UA1Gr isolate shared 91.5 to 100 % identity with the sequences of other SMV isolates (Table 2). The minimum nt similarity was observed between SMV-UA1Gr and -G3, -G1, -G7, while the maximum was between SMV-UA1Gr and -VA2 (Table 2). At the same time, comparison of the CP aa sequences showed 100 % identity between SMV-UA1Gr and most of the other SMV isolates, except G7, G7A and G6H (Table 2). These data indicate that the vast majority of nucleotide substitutions in the central part of the CP region were synonymous. This is not surprising, because the regulation of viral RNA amplification and the requirement to assemble stable virions impose intense purifying selection pressure on the CP sequences [6,19]. The alignment of the P1 sequences showed that the similarities between different SMV isolates varied within the range of 87.6-99.3 % for nt sequences and 86-99.4 % for aa sequences (Table 2).
It is interesting that the P1 nt sequences of the UA1Gr and VA2 isolates were also found to be the most similar to each other. The higher level of tolerated aa variability in the P1-coding region compared to the CP-coding region suggests that a larger number of non-synonymous substitutions occur in the P1 region. The P1 protein is known as the least conserved region of the entire potyvirus polyprotein [6,11,18].
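The distinction drawn above between nucleotide identity and amino acid identity can be illustrated with a short sketch. A synonymous substitution lowers nt identity but leaves aa identity at 100 %, whereas a non-synonymous substitution lowers both. The sequences below are invented toy examples, not SMV data, and the codon table is a deliberately small subset of the standard genetic code that covers only the codons used here; all names are our own.

```python
# Subset of the standard genetic code, sufficient for the toy sequences below.
CODON_TABLE = {
    "GCT": "A", "GCC": "A",   # alanine
    "AAA": "K", "AAG": "K",   # lysine
    "GAT": "D",               # aspartic acid
    "AAT": "N",               # asparagine
}


def translate(nt_seq):
    """Translate an in-frame nucleotide sequence using CODON_TABLE."""
    nt = nt_seq.upper()
    if len(nt) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def percent_identity(seq1, seq2):
    """Percentage of identical positions between two equal-length sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return 100.0 * matches / len(seq1)


# Hypothetical sequences for illustration only.
REFERENCE = "GCTAAAGATAAT"        # Ala-Lys-Asp-Asn
SYNONYMOUS = "GCCAAGGATAAT"       # two third-position changes, same protein
NON_SYNONYMOUS = "GCTAAAAATAAT"   # GAT -> AAT, Asp -> Asn

if __name__ == "__main__":
    for name, variant in (("synonymous", SYNONYMOUS),
                          ("non-synonymous", NON_SYNONYMOUS)):
        nt_id = percent_identity(REFERENCE, variant)
        aa_id = percent_identity(translate(REFERENCE), translate(variant))
        print(f"{name}: nt identity {nt_id:.1f} %, aa identity {aa_id:.1f} %")
```

In this toy case the synonymous variant differs from the reference at more nucleotide positions than the non-synonymous one, yet only the latter changes the protein, which mirrors the pattern reported for the CP and P1 regions.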
Comparison of the aligned N- and C-terminal aa sequences of the P1 protein between the UA1Gr isolate and the known strains of SMV showed that aa substitutions were conservatively distributed over the entire coding region. However, significant N-terminal variations were found in the aa sequences of G2 strains, particularly between aa positions 92 and 100. As shown in Fig. 2, the UA1Gr isolate, unlike the others, has an Asn-to-Asp substitution at position 276. Interestingly, nine SMV isolates (UA1Gr, N, VA2, WS37, L-RB, G4, ChGs2, G2, and WS156) have an amino acid deletion at position 198 from the N-terminus; therefore their P1 protein has only 308 aa, compared to 309 aa in the other strains (Fig. 2). Fig. 2 also shows that the C-terminal region of the P1 protein of the WMV isolate is much more similar to the corresponding region of the SMV isolates than its N-terminal region is. These results are consistent with those of analyses for other potyviruses, which all contain highly conserved residues responsible for self-cleaving protease activity precisely in the C-terminal region of the P1 protein [2,11,19]. We observed no differences in the proteolytic triad composed of His222, Ser263 and Phe-Val-Val-Arg-Gly between positions 283 and 287 for all SMV and WMV isolates, except for the substitution of the second Val by Ile in the Sc6 isolate (Fig. 2).
Phylogenetic analysis of the nt and aa sequences conducted for one Ukrainian isolate (UA1Gr) and 21 previously known SMV isolates demonstrated the same general trends as those observed in the percent identities of the sequences. Using the phylogenetic tree for the nt sequences of the P1 gene reconstructed by the ML method with Kimura's two-parameter model (Fig. 3, A), we found that the Ukrainian isolate UA1Gr is most closely related to SMV-VA2 and belongs to one clade with the G2, G4, WS156, L-RB, N and WS37 isolates (99 % bootstrap). Almost the same result was obtained for the aa sequence of the P1 protein when the NJ method with the p-distance model was used (Fig. 3, B). The grouping in the trees obtained for P1 was consistent with the previous whole-genome study of SMV [18]. Such good separation of SMV isolates into genetically distinct groups can be explained by the use of the highly polymorphic P1 sequence, which is not strictly required for viral infectivity. On the other hand, this protein interacts with a varying set of plant factors during the process of host adaptation; therefore P1 is under strong positive selection [18,19]. In contrast, the ML tree constructed from the central part of the CP region provided low separation of SMV isolates and was non-informative, because the phylogenetic tree for the combined CP and P1 sequences appeared closely similar to that reconstructed for the P1 sequence alone (Fig. 3, A, C, D). As mentioned above, the high conservation of the CP region is considered to be associated with the stronger functional and structural constraints imposed on it [18,19,26].
Our phylogenetic analysis did not show a clear relationship between the phylogeny of the isolates and their geographical origin, which can be explained by recombination events between the genomes of SMV isolates [17,18].

Table 2
Identity (%) of P1 gene and CP gene nucleotide and amino acid sequences between the Ukrainian SMV isolate and known strains of this virus

All tested SMV isolates showed the same reaction patterns on 11 differential cultivars and, therefore, were classified as one pathotype (data for only the UA1Gr isolate are shown in Table 1). As shown in