Phylogenetic analysis of Ukrainian seed-transmitted isolate of Soybean mosaic virus

L. T. Mishchenko, A. A. Dunich, I. S. Shcherbatenko

© 2018 L. T. Mishchenko et al.; Published by the Institute of Molecular Biology and Genetics, NAS of Ukraine on behalf of Biopolymers and Cell. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

UDC 578.864/578.32


Introduction
Fourteen of 35 economically important virus and viroid species are aphid-transmitted and, among these, ten belong to the potyviruses [1]. Soybean mosaic virus (SMV) is also on this list.
The SMV seed transmission rate is 0-43%. As with other members of the Potyviridae, the efficiency with which SMV is transmitted through seed depends on the virus strain, the genotype of the host and the time of infection [2][3][4]. Recent studies [5] showed that SMV transmission occurs via infection of the embryo. It was also revealed that SMV is present in all seed parts: seed coat, radicle and cotyledon (23%, 18% and 33%, respectively). For the virus to be transmitted through seed, it must infect embryos and survive during seed germination.
The genetics of virus seed transmission has not been studied sufficiently. It is known that host resistance to the seed transmission of Barley stripe mosaic virus (BSMV) is controlled by a single recessive gene. In contrast, the seed transmission of Pea seed-borne mosaic virus (PSbMV) and Alfalfa mosaic virus is controlled by multiple genes in a quantitative manner. The study of the viral and host determinants of the strain-specific transmission of SMV through seed has started only recently. It was revealed that the CP sequences are required for the transmission of SMV through seed [3]. The highest nucleotide divergence is noted for the P1 gene of potyviruses (involved in host adaptation) and the P3 gene (experimentally verified, together with the HC-Pro and CI proteins, as an SMV virulence determinant) [6][7]. In contrast, the CP gene of SMV, as in many other potyviruses, is more conserved [6][8][9][10]. However, it was recently shown that SMV replicates at a high level in the developing seed, and several single nucleotide variations (SNVs) were found in different regions of the genome of seed-transmitted SMV [11]. Moreover, a single amino acid change near the C terminus of the CP of certain SMV strains was found to abolish seed transmission [3], which indicates that the CP gene sequences are involved in the seed transmission of Soybean mosaic virus.
The aim of this study was therefore to perform a phylogenetic analysis of the CP gene region of the Ukrainian seed-transmitted SMV isolate.

Molecular analyses
Total RNA was extracted from fresh leaves using the Genomic DNA purification kit (Thermo Scientific, USA) following the manufacturer's instructions.
Two-step RT-PCR was performed. Reverse transcription was carried out using RevertAid Reverse Transcriptase, a genetically modified M-MuLV RT (Thermo Scientific, USA), according to the manufacturer's instructions. Specific oligonucleotide primers to part of the SMV CP gene were used: SMV-CPf (5'-CAAGCAGCAAAGATGTAAATG-3') and SMV-CPr (5'-GTCCATATCTAGGCATATACG-3') [12]. A DNA product of 469 bp was amplified. The reaction mixture contained 12.5 µl of DreamTaq PCR Master Mix (2X) (DreamTaq DNA polymerase, 2X DreamTaq buffer, 0.4 mM of each dNTP and 4 mM MgCl2), 7.5 µl of nuclease-free water, 1 µl of each primer (10 µM), and 3 µl of cDNA. The amplification program was as follows: initial denaturation for 3 min at 95 °C, followed by 35 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, with a final extension at 72 °C for 10 min. PCR products were separated on a 1.5% agarose gel alongside the MassRuler DNA Ladder Mix, ready-to-use (SM0403, Thermo Scientific, USA).
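The primer and reaction data given above can be sanity-checked in silico. The following is a minimal sketch, not part of the original protocol: the primer sequences and component volumes are taken from the text, while the dictionary and function names are illustrative assumptions.

```python
# Primer sequences (5'->3') as given in the text; names of the containers
# and helper functions below are hypothetical and used for illustration only.
PRIMERS = {
    "SMV-CPf": "CAAGCAGCAAAGATGTAAATG",
    "SMV-CPr": "GTCCATATCTAGGCATATACG",
}

# Reaction components in microlitres, as listed in the text.
REACTION_UL = {
    "master_mix_2x": 12.5,
    "nuclease_free_water": 7.5,
    "primer_forward": 1.0,
    "primer_reverse": 1.0,
    "cdna": 3.0,
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Return the GC content of a DNA sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def total_volume(components: dict) -> float:
    """Sum the component volumes of a reaction mixture."""
    return sum(components.values())


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, "
              f"reverse complement {reverse_complement(seq)}")
    print(f"Total reaction volume: {total_volume(REACTION_UL)} µl")
```

Summing the listed components gives the total reaction volume, in which the 2X master mix makes up half, as expected for a 2X reagent.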
The PCR products were purified from the agarose gel using a QIAquick Gel Extraction Kit (Qiagen, Great Britain) following the manufacturer's instructions. Sequencing of the purified amplified DNA fragments was carried out with the 3130 Genetic Analyzer (Applied Biosystems, USA).

Phylogenetic analysis
The CP gene sequences of the Ukrainian SMV isolate were compared with the SMV sequences in the NCBI database using the BLAST program. The SMV isolates used in this study are presented in Table 1. Nucleotide and amino acid sequences were aligned using ClustalW in MEGA 7 [13]. Phylogenetic trees for the part of the SMV coat protein gene were constructed by the maximum-likelihood (ML) method [14] using the best-fitting evolutionary models. The reliability of the constructed trees was checked by the bootstrap test with 1000 replications. Aligned CP amino acid sequences were visualized and compared using the BioEdit sequence alignment editor.
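The two distance measures named in the caption of Fig. 1 (p-distance and the Jukes-Cantor model) can be illustrated for a pair of aligned sequences. This is a minimal sketch of the standard formulas, not the MEGA 7 implementation used in the study; the function names and the toy sequences in the usage lines are assumptions.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned nucleotide sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(a != b for a, b in pairs)
    return differences / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Toy aligned sequences (hypothetical, not SMV data).
    a = "ACGTACGTAC"
    b = "ACGTACGTAT"
    p = p_distance(a, b)
    print(f"p-distance = {p:.3f}, JC distance = {jukes_cantor(p):.3f}")
```

The Jukes-Cantor correction is always at least as large as the raw p-distance because it accounts for multiple substitutions at the same site.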
Nonsynonymous/synonymous (dN/dS) mutation ratio calculations. To calculate the dN/dS ratio, an indicator of the evolutionary direction, the CP nucleotide sequences of all SMV isolates were codon-aligned. The ratio of the rate of nonsynonymous (dN) to the rate of synonymous (dS) mutations was calculated using the Nei-Gojobori method in the SNAP program [15].
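For orientation, the quantities involved can be written out as in the standard formulation of the Nei-Gojobori method with the Jukes-Cantor correction. This is a summary sketch rather than a description of the exact SNAP output. The symbols are notation introduced here: S and N are the numbers of synonymous and nonsynonymous sites, S_d and N_d are the numbers of synonymous and nonsynonymous differences between two codon-aligned sequences, and ω denotes the dN/dS ratio.

```latex
\begin{align}
p_S &= \frac{S_d}{S}, & p_N &= \frac{N_d}{N}, \\
d_S &= -\frac{3}{4}\ln\!\left(1 - \frac{4}{3}\,p_S\right), &
d_N &= -\frac{3}{4}\ln\!\left(1 - \frac{4}{3}\,p_N\right), \\
\omega &= \frac{d_N}{d_S}.
\end{align}
```

Values of ω below 1 are conventionally read as purifying selection, values near 1 as neutral evolution, and values above 1 as positive selection.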

Results and Discussion
Phylogenetic analysis was performed for the SMV isolate obtained from soybean plants cv. Kordoba (Sumy region) and named SKS-18. The rate of seed transmission of SKS-18 was 3.3%, as we showed earlier [16]. The nucleotide (nt) and amino acid (aa) sequences of a 430 nt region of the CP gene of the seed-transmitted SMV isolate SKS-18, located at genomic positions 8640-9069, were compared with the sequences of 33 SMV isolates/strains from GenBank (Table 1).
The 430 nt region of the CP gene of SKS-18 shares from 98.8% to 89.8% nucleotide sequence identity with the compared isolates, which corresponds to 5 to 44 nucleotide substitutions. In the studied region of the CP gene, SKS-18 has the highest identity (98.8%, 5 nt substitutions) with the Iranian isolates Ar33 and Lo3, the American isolate VA2, and the Ukrainian isolate UA1Gr. SKS-18 also has high identity with isolates from China (XFQ014, 98.6%, and HB-S19, 97.6%), Poland (M, 98.6%), Iran (Go11, 98.4%) and the USA (strain 1083, 97.9%), corresponding to 6, 10, 6, 7 and 9 nucleotide substitutions, respectively (Table 1).
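The relation between substitution counts and percent identity over the 430 nt region can be cross-checked with simple arithmetic. The sketch below assumes that identity is computed over the full 430 nt without gaps; the function and constant names are illustrative.

```python
REGION_LENGTH_NT = 430  # length of the analysed CP gene region, from the text


def percent_identity(substitutions: int, length: int = REGION_LENGTH_NT) -> float:
    """Percent identity for a given number of substitutions in an ungapped region."""
    if not 0 <= substitutions <= length:
        raise ValueError("substitution count must lie between 0 and the region length")
    return 100.0 * (length - substitutions) / length


def substitutions_for_identity(identity_percent: float,
                               length: int = REGION_LENGTH_NT) -> int:
    """Nearest whole number of substitutions matching a reported percent identity."""
    return round(length * (100.0 - identity_percent) / 100.0)


if __name__ == "__main__":
    # Substitution counts mentioned in the text for the 430 nt region.
    for count in (5, 6, 7, 9, 10, 44):
        print(f"{count:>2} substitutions -> {percent_identity(count):.1f}% identity")
```

Because the reported percentages are rounded to one decimal place, the back-calculation returns the nearest whole number of substitutions.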
The phylogenetic tree presented in Fig. 1a is fully consistent with the data in Table 1: the isolate SKS-18 is located in one cluster with the isolates of the highest nucleotide identity (Ar33, Lo3, VA2, UA1Gr, XFQ014, HB-S19, M, Go11 and strain 1083), as well as strain C and the isolate SV-15. In contrast to the nucleotide sequences, the amino acid sequences of the vast majority of isolates (29 out of 33) are completely identical to each other. Only G7A, G7 and G6H have one aa substitution, while G7d and SKS-18 have two aa substitutions (Fig. 1b, Table 1).
Classification of SMV strains/isolates is rather complicated. In the United States, Cho and Goodman (1979) classified 98 SMV isolates into seven strains, namely G1-G7. In addition to the difference in symptom severity, the SMV strains G1 through G7 also differ in the efficiency with which they are transmitted. The same differential system was also utilized in Korea, resulting in the identification of additional SMV strains such as G5H, G6H, and G7H. In Japan and China, however, different sets of soybean cultivars were used, and the isolates of SMV collected in these two countries were finally classified into five (A to E) and 21 (SC1 to SC21) strains, respectively. Later, Shigemori [25] and Kanematsu and Nakano [26] attempted to unify the classification of SMV strains from the U.S. and Japan. Artificial inoculation of the U.S. differential varieties with the Japanese strains showed that the Japanese strains fell into three groups: 1) containing A and B (corresponding to strain G3); 2) containing strains C and D; and 3) containing only E. Strains C, D, and E corresponded to no U.S. strains [25]. Kanematsu and Nakano [26] artificially inoculated the Japanese differential varieties with the U.S. strains. The U.S. strains were also classified into three groups: 1) containing G1 and G4 (corresponding to strain B); 2) containing G2, G3, G6, and G7 (corresponding to strain A); 3) containing only G5 (corresponding to strain C), whereas strains D and E corresponded to no U.S. strains.

Fig. 1. Maximum-likelihood (ML) phylogenetic trees based on the sequences of the 430 bp part of the CP gene of the Ukrainian SMV isolate SKS-18 and isolates from other countries. Names and GenBank accession numbers are given in Table 2: a - nucleotide sequences, Jukes-Cantor model; b - amino acid sequences, p-distance model. The values at the nodes indicate the percentage of replicate trees in which the associated taxa clustered together (bootstrap test, 1000 replicates). The scale bar shows the number of substitutions per base.
The Ukrainian isolate SKS-18 clustered in one clade with the Japanese strain C (Fig. 1a). According to the classification of Kanematsu and Nakano [26], SKS-18 therefore belongs to the G5 group. Among all the Japanese and U.S. strains included in the study, SKS-18 has the highest nucleotide and amino acid identity with the G5 strain and the G5H clone (Table 1).
Notably, the Ukrainian SMV isolate Pol-17, which we studied earlier, showed different phylogenetic relationships [27]. This indicates differences between these Ukrainian SMV isolates.
To explore the evolutionary forces acting on the SMV CP gene, the dN/dS values were calculated for all of the SMV CP sequences in our study (Table 1). This ratio compares the rate of nonsynonymous mutations with the rate of synonymous mutations. The dN/dS ratio for isolate SKS-18 compared to all other isolates was 0.0315, whereas for the rest of the isolates it ranged from 0.0090 to 0.0219. This indicates a higher nucleotide diversity of the isolate SKS-18 compared to all SMV isolates selected in this study. The global dN/dS ratio for all sequences studied was 0.014 (p < 0.01). A value below 1 indicates that the SMV CP gene is under negative (purifying) selection pressure, that is, selection that maintains the stability of the gene.
Two aa substitutions were revealed in the analyzed part of the SKS-18 CP: Ser→Cys at the 1st position and Lys→Ala at the 2nd position (Fig. 2).
Only 71 of the 143 aa are presented in Fig. 2, because at positions 72-143 the sequences were identical for all SMV isolates. Amino acid substitutions were also observed for the isolates G6H, G7, G7A and G7d. The aa substitutions in SKS-18 at positions 1 and 2 are unique among all SMV isolates taken for the analysis. The Ser → Cys substitution requires a single transversion (c → g in tcc → tgc, or a → t in agc → tgc), or the simultaneous substitution of two nucleotides in the codons tcg, tca, tct and agt, to form the tgc codon. Formation of the second cysteine codon (tgt) requires one substitution in the serine codons tct and agt and two substitutions in the remaining serine codons. The Lys → Ala substitution requires at least two nucleotide substitutions (aag → gcg or aaa → gca) and three substitutions for all other combinations of lysine and alanine codons. Such simultaneous substitutions of two or three nucleotides are of low probability, so the mechanisms of the identified substitutions of these amino acids are of interest for understanding the features of SMV variability, as well as their role in the seed transmission of the virus, since only a few single amino acid changes near the C-terminus of the CP of certain SMV strains made seed transmission impossible [3]. The P1, CP, and the DAG motif are also associated with seed transmission of potyviruses, which suggests that CP interactions with HC-Pro are important for multiple functions in the SMV infection cycle.
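The number of nucleotide substitutions needed to convert one amino acid into another can be enumerated directly from the standard genetic code. The following sketch lists only the codons of the four amino acids discussed above; the function names are illustrative, and the actual codons present in the SKS-18 sequence are not assumed.

```python
from itertools import product

# Codons of the standard genetic code for the four amino acids discussed.
CODONS = {
    "Ser": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "Cys": ["TGT", "TGC"],
    "Lys": ["AAA", "AAG"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
}


def hamming(codon_a: str, codon_b: str) -> int:
    """Number of nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


def codon_changes(source: str, target: str) -> dict:
    """Substitution counts for every (source codon, target codon) pair."""
    return {
        (a, b): hamming(a, b)
        for a, b in product(CODONS[source], CODONS[target])
    }


def min_changes(source: str, target: str) -> int:
    """Smallest number of substitutions that converts source into target."""
    return min(codon_changes(source, target).values())


if __name__ == "__main__":
    for source, target in (("Ser", "Cys"), ("Lys", "Ala")):
        print(f"{source} -> {target}: minimum {min_changes(source, target)} substitution(s)")
        for (a, b), n in sorted(codon_changes(source, target).items()):
            print(f"  {a.lower()} -> {b.lower()}: {n}")
```

Which of these paths is relevant for SKS-18 depends on the codons actually present at positions 1 and 2 in the compared isolates, which can be read from the nucleotide alignment.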
The results obtained by Domier et al. [2] indicated that the most poorly seed- and aphid-transmitted SMV isolates G7 and G7F had two mutations, and G5 one mutation, in the DAG motif. However, the isolate G2, with a low seed transmission rate, had no mutations in this motif but was characterized by an aa substitution at another position (Q264 to P). Some potyviruses, e.g., the isolates of PSbMV, have no DAG triplets and are still transmitted efficiently by aphids and through seed. While HC-Pro and CP have been implicated in both aphid and seed transmission [2], different regions of the proteins may be involved in the two modes of transmission.

Conclusions
In the nucleotide sequence of the CP gene region, the isolate SKS-18 shares from 98.8% to 89.8% identity with the compared isolates, which corresponds to 5 to 44 nt substitutions. The highest identity (98.8%, 5 nt substitutions) is found with the Iranian isolates Ar33 and Lo3, the American isolate VA2, and the Ukrainian isolate UA1Gr. The isolate SKS-18 is located in one cluster with the isolates of the highest nucleotide identity (Ar33, Lo3, VA2, UA1Gr, XFQ014, HB-S19, M, Go11 and 1083), which may be due to similar variability. The dN/dS ratio below 1 indicates negative selection pressure on the SMV CP gene. However, SKS-18 has a higher nucleotide diversity compared to all SMV isolates selected in this study.
In contrast to the nucleotide sequences, the amino acid sequences of the vast majority of isolates (29 out of 33) are completely identical. The aa substitutions in SKS-18 at positions 1 and 2 are unique among all SMV isolates taken for the analysis, because the simultaneous substitutions of two or three nucleotides required for these amino acid replacements have a very low probability. The mechanisms of such substitutions are of interest for understanding the features of SMV variability, as well as their role in the seed transmission of the virus. Additional phylogenetic studies of other SMV genes are required to identify the SMV genes involved in seed transmission.