Transcriptome-based identification of PDGFA as a candidate secreted biomarker for hepatocellular carcinoma

M. S. Chesnokov, O. M. Krivtsova, P. A. Skovorodnikova

© 2016 M. S. Chesnokov et al.; Published by the Institute of Molecular Biology and Genetics, NAS of Ukraine on behalf of Biopolymers and Cell. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

UDC 57.032


Introduction
Hepatocellular carcinoma (HCC) is the most common malignant liver tumor and is characterized by extremely high aggressiveness and poor prognosis. HCC ranks second in cancer-related mortality, and most HCC patients are diagnosed at advanced stages, when existing therapeutic approaches are no longer effective [1,2].
The major difficulty in improving HCC diagnosis and treatment stems from the high heterogeneity of the genetic and signaling aberrations observed in HCC and from the poor understanding of the molecular mechanisms underlying its development. Thus, the identification of new biomarkers suitable for early diagnosis and of potential therapeutic targets is important for improving the efficiency of HCC management [3].
Alpha-fetoprotein (AFP), the only HCC marker approved for clinical practice, has low sensitivity for early tumor detection [4,5]. Among the additional HCC biomarkers under investigation, glypican-3 (GPC3) is the most promising: it demonstrates high sensitivity and specificity in tumor tissue but performs worse when detected in blood serum. The efficiency of HCC diagnosis can be improved by using combinations of biomarkers but remains insufficient to confidently detect HCC at early stages [5,6].
Next-generation sequencing (NGS) approaches open new possibilities for uncovering the molecular basis of carcinogenesis. Genomic and transcriptomic data have revealed multiple tumor-specific mutations and gene expression changes that can be further analyzed to identify putative biomarkers and alterations in the signaling pathways regulating HCC progression. The present work is devoted to the identification of novel prospective HCC biomarkers based on the results of transcriptome sequencing and to the investigation of their potential impact using experimental and bioinformatic approaches.

Materials and Methods
Sample collection, RNA extraction, transcriptome sequencing and differential expression analysis
Nineteen pairs of tumor and adjacent non-tumorous (NT) liver tissues were collected after tumor resection from patients with histologically verified HCC not associated with hepatitis virus infection. The samples were collected with informed consent, in conformance with the ethical guidelines of the 1975 Declaration of Helsinki, frozen in liquid nitrogen and stored at -80 °C. The clinicopathological data on the collected cases are presented in Table 1.
Total RNA was isolated as previously described [7]. Illumina HiSeq2000 100 nt paired-end transcriptome sequencing was performed for 5 pairs of tumor and liver tissue samples in two biological replicates. Library preparation, transcriptome sequencing, read processing and differential expression analysis were performed as previously described [7].

Quantitative Real-Time PCR
Total RNA was reverse transcribed using random hexanucleotide primers and MMLV reverse transcriptase (Promega, USA).
Real-time RT-qPCR was carried out using the SYBR Green I PCR kit (Syntol, Russian Federation) and the iQ5 Multicolor Real-Time PCR Detection System (Bio-Rad Laboratories, USA). The TATA-binding protein gene (TBP) was used as a reference gene. Forty-five cycles of amplification (30 s at 95 °C, 30 s at the annealing temperature (PDGFA: 66.0 °C, GPC3: 67.7 °C, TBP: 62.8 °C), 30 s at 72 °C) were performed, and the reaction specificity was checked afterwards by melt curve analysis. Gene expression levels were estimated using a standard curve at a fixed signal value. For each sample, the gene expression level was normalized to TBP expression, the normalized value was log2-transformed, and the difference between the values obtained for the HCC sample and the corresponding NT sample was calculated.
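The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function name and the input quantities are hypothetical, and the numbers are made up rather than taken from the study.

```python
import math

def log2_relative_change(target_hcc, ref_hcc, target_nt, ref_nt):
    """Difference of log2(target/reference) between an HCC sample and its NT pair.

    Inputs are assumed to be standard-curve quantities in arbitrary units.
    """
    hcc_norm = math.log2(target_hcc / ref_hcc)  # log2 of TBP-normalized level in tumor
    nt_norm = math.log2(target_nt / ref_nt)     # log2 of TBP-normalized level in NT tissue
    return hcc_norm - nt_norm

# Hypothetical quantities for one tumor/NT pair (not study data).
delta = log2_relative_change(target_hcc=8.0, ref_hcc=2.0, target_nt=1.0, ref_nt=1.0)
fold_change = 2 ** delta  # back-transform to a linear fold change
```

A positive difference corresponds to higher normalized expression in the tumor; a difference above 1 corresponds to more than two-fold up-regulation.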

[Table: qPCR oligonucleotides and primer sequences]
TCGA Liver Hepatocellular Carcinoma (TCGA-LIHC) set that comprised information on the normalized gene expression in 51 matched liver and tumor

Statistical analysis
Each tissue sample used for RT-qPCR was analyzed in at least four technical replicates, and the mean value was used for further analysis. Statistical analysis and plotting of graphs were performed using Origin Pro 2016 software (OriginLab Corporation, USA). The differences between gene expression levels in HCC and NT samples were assessed using the paired-sample sign test (for paired samples) and the Mann-Whitney U-test (for unpaired samples). The empirical distribution curves for gene expression in different datasets were compared using the Kolmogorov-Smirnov test. Hierarchical cluster analysis of gene expression datasets was performed using Euclidean distance and the complete linkage algorithm. Receiver operating characteristic (ROC) curve analysis of discriminative power was performed using the normalized gene expression level as the classifier, HCC samples for estimation of the true positive rate and NT samples from the same patients for estimation of the false positive rate. The combined PDGFA+GPC3 classifier for ROC curves was generated by fitting a logistic regression model to the PDGFA and GPC3 expression levels and taking the predicted probabilities as a new classifier. Correlations were evaluated using Spearman's rank correlation test. Survival analysis was performed using the Kaplan-Meier method with the log-rank test for significance. Statistical significance was accepted at p<0.05.
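The construction of the combined classifier can be illustrated with a minimal standard-library Python sketch. It is not the Origin Pro procedure used in the study: the gradient-ascent fit, the helper names (`roc_auc`, `fit_logistic`, `predict_proba`) and the toy expression values are all assumptions made for illustration.

```python
import math

def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def predict_proba(x, weights, bias):
    """Logistic model: predicted probability that sample x is HCC."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(features, labels, lr=0.1, n_iter=500):
    """Fit logistic regression by plain gradient ascent on the mean log-likelihood."""
    n, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            err = y - predict_proba(x, weights, bias)
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / n
    return weights, bias

# Toy (PDGFA, GPC3) normalized expression values; hypothetical, not study data.
hcc = [(3.0, 0.0), (0.0, 3.0), (2.0, 2.0)]
nt = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

auc_pdgfa = roc_auc([s[0] for s in hcc], [s[0] for s in nt])
auc_gpc3 = roc_auc([s[1] for s in hcc], [s[1] for s in nt])

weights, bias = fit_logistic(hcc + nt, [1] * len(hcc) + [0] * len(nt))
auc_combined = roc_auc(
    [predict_proba(s, weights, bias) for s in hcc],
    [predict_proba(s, weights, bias) for s in nt],
)
```

In this toy example each marker alone misses one tumor, while the fitted probabilities separate all tumors from all NT samples. A fitted model evaluated on its own training data tends to overestimate the AUC.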

Results
Whole-transcriptome data analysis using DESeq [16] revealed 83 differentially expressed (DE) genes that were up-regulated more than 5-fold in all HCC samples compared with the corresponding adjacent liver tissue. To identify putative secreted HCC markers, the FASTA sequences of all mRNA isoforms of the DE genes were analyzed with SignalP Server 4.1 (http://www.cbs.dtu.dk/services/SignalP/) and TMHMM Server v. 2.0 (http://www.cbs.dtu.dk/services/TMHMM/) using default settings. The genes harboring sequences predicted to encode signal peptide cleavage sites but not transmembrane helices were examined using GeneCards (http://www.genecards.org/), The Human Protein Atlas (http://www.proteinatlas.org/) and information from journal articles. We then excluded the genes that were appreciably expressed in normal liver (FPKM>1) and/or showed a relatively low level of expression in tumors (FPKM<2). This yielded a list of 9 potential secreted HCC markers (Table 3).
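The automated part of this selection can be sketched as a simple filter. The sketch below assumes the SignalP and TMHMM predictions have already been collapsed into per-gene boolean flags; the `Gene` record, its field names and the example genes are hypothetical, and the manual review against GeneCards, The Human Protein Atlas and the literature is not modeled.

```python
from dataclasses import dataclass

@dataclass
class Gene:
    symbol: str
    min_fold_change: float    # lowest tumor/liver fold change across all pairs
    has_signal_peptide: bool  # assumed summary of a signal peptide prediction
    has_tm_helix: bool        # assumed summary of a transmembrane helix prediction
    fpkm_liver: float         # expression in normal liver
    fpkm_tumor: float         # expression in tumor

def is_candidate(gene):
    """Apply the thresholds stated in the text to one gene record."""
    return (
        gene.min_fold_change > 5       # up-regulated >5-fold in all HCC samples
        and gene.has_signal_peptide    # predicted to be secreted
        and not gene.has_tm_helix      # but not membrane-anchored
        and gene.fpkm_liver <= 1       # not appreciably expressed in normal liver
        and gene.fpkm_tumor >= 2       # not weakly expressed in tumors
    )

# Hypothetical records, one per filtering outcome (not study data).
genes = [
    Gene("GENE_A", 12.0, True, False, 0.3, 8.5),   # passes every filter
    Gene("GENE_B", 9.0, True, True, 0.2, 6.0),     # has a transmembrane helix
    Gene("GENE_C", 7.5, True, False, 2.4, 15.0),   # expressed in normal liver
    Gene("GENE_D", 6.0, False, False, 0.1, 4.0),   # no signal peptide
    Gene("GENE_E", 20.0, True, False, 0.05, 1.2),  # low expression in tumor
]
candidates = [g for g in genes if is_candidate(g)]
```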
The list of candidate serum markers includes the growth factor PDGFA, a component of the PDGF signaling pathway that was identified as a potential druggable target in our recently published HCC case report [7]. Since the proangiogenic and mitogenic stimulation promoted by PDGF signaling can be blocked by the multikinase inhibitor sorafenib, the only FDA-approved drug for HCC treatment [17,18], we focused on the expression alterations of PDGFA, which might be not only a candidate HCC marker but also a prospective target for drug treatment.
To explore HCC-specific changes in PDGFA expression we performed RT-qPCR analysis of PDGFA expression levels in 19 pairs of tumor and NT tissues from hepatitis-negative HCC patients. While low PDGFA expression levels were detected in all NT specimens, the PDGFA expression in HCC tissue was upregulated more than two-fold in 17 of 19 (89.5 %) examined cases (Fig. 1A) and the difference between these two subsets was statistically significant (Fig. 1B).
We investigated the potential of PDGFA as a biomarker by comparing its expression changes in HCC tissue with those of GPC3, a promising candidate biomarker for HCC [19]. RT-qPCR analysis revealed significant GPC3 overexpression in HCC compared with NT tissue (Fig. 1D) in 18 of 19 (94.7 %) cases (Fig. 1C). Spearman's correlation analysis demonstrated that the changes in PDGFA expression were not associated with the clinicopathological properties of the examined tumors.
To determine whether the PDGFA up-regulation discovered in the examined sample set is a frequent event in HCC, we performed a meta-analysis of the gene expression data for paired HCC/NT samples obtained from six publicly available datasets (Table 4). Each of the analyzed datasets displayed a significant (more than 2-fold) up-regulation of PDGFA transcription in tumor tissue compared with the corresponding surrounding liver samples in no less than 50 % of cases. Since several datasets comprised data on a low number of samples, we further analyzed the TCGA ("TCGA set") and GSE14520 ("Roessler set") datasets.
The proportion of cases with significant PDGFA up-regulation ranged from 52.4 % ("Roessler set") to 63.8 % ("TCGA set") (Fig. 2A). Median values of the normalized PDGFA expression level were significantly higher in cohorts of HCC tissue samples than in cohorts of the corresponding NT specimens (Fig. 2B).
While both datasets support the observation that PDGFA up-regulation is a frequent event in HCC tissue, the percentage of PDGFA-overexpressing samples is lower than that observed in our experimental set. To explore whether this difference could be associated with hepatitis infection, we extracted a subset of 94 TCGA cases that were not marked as hepatitis-positive (hereinafter called "TCGA-HN set" for "hepatitis-negative"). The proportion of cases with up-regulated PDGFA expression in "TCGA-HN set" (68.1 %) (Fig. 2A) was very similar to that observed in the full "TCGA set". No statistically significant differences between the full and "HN" sets were found in the median PDGFA expression level (p=0.558, Mann-Whitney U-test) or in the empirical distribution curve (p=0.988, Kolmogorov-Smirnov test), indicating that PDGFA up-regulation in HCC occurs irrespective of tumor etiology.
To evaluate the potential sensitivity of PDGFA as an HCC biomarker, we compared the PDGFA expression changes in "Roessler set" and "TCGA-HN set" with the alterations of GPC3 expression (Fig. 3A). While the sensitivity of PDGFA (52.4 % for "Roessler set", 68.1 % for "TCGA-HN set") was lower than that of GPC3 (87.4 % for "Roessler set", 81.9 % for "TCGA-HN set"), the combination of PDGFA and GPC3 increased the sensitivity to 93.9 % for "Roessler set" and 93.6 % for "TCGA-HN set" (p=0.024 for both sets compared with GPC3 alone, Fisher's exact test). To evaluate whether the expression level of PDGFA, GPC3 or both genes could serve as a parameter discerning HCC from NT tissue, we generated ROC curves using the data for paired samples from "Roessler set" (n=231) and "TCGA-HN set" (n=24). The PDGFA+GPC3 combination increased the area under the curve (AUC) compared with PDGFA or GPC3 alone, indicating a stronger discriminative power of the combination (Fig. 3B).
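The sensitivity comparison can be illustrated with a short Python sketch. It assumes that a case counts as detected when the tumor/NT fold change of a marker exceeds 2, and that the combined rule accepts a case when either marker exceeds the cut-off. The function names, the fold-change values and the 2x2 table are hypothetical; the p-value follows the common "sum of tables no more probable than the observed one" definition of the two-sided Fisher's exact test.

```python
from math import comb

def sensitivity(fold_changes, cutoff=2.0):
    """Fraction of tumors in which one marker is up-regulated above the cut-off."""
    return sum(fc > cutoff for fc in fold_changes) / len(fold_changes)

def combined_sensitivity(fc_a, fc_b, cutoff=2.0):
    """Fraction of tumors in which at least one of two markers exceeds the cut-off."""
    hits = sum((a > cutoff) or (b > cutoff) for a, b in zip(fc_a, fc_b))
    return hits / len(fc_a)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k counts in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at most as probable as the observed one.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Hypothetical fold changes for four tumor/NT pairs (not study data).
pdgfa_fc = [3.0, 1.2, 4.5, 0.8]
gpc3_fc = [1.1, 6.0, 2.5, 1.0]
sens_pdgfa = sensitivity(pdgfa_fc)
sens_gpc3 = sensitivity(gpc3_fc)
sens_both = combined_sensitivity(pdgfa_fc, gpc3_fc)

# Made-up 2x2 table: rows are two detection rules, columns detected / missed.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

In the toy data each marker detects two of four tumors, but not the same two, so the either-marker rule detects three.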
A correlation analysis of the PDGFA expression changes and the clinicopathological characteristics available for "TCGA-HN set" revealed an inverse correlation between PDGFA up-regulation and the extent of tumor invasion into blood vessels.
Since "Roessler set" contains data on Barcelona Clinic Liver Cancer (BCLC) staging, which is widely used to evaluate prognosis and select the treatment algorithm for HCC patients [20], we analyzed the association between PDGFA up-regulation and the overall and progression-free survival of patients belonging to different BCLC groups. PDGFA overexpression in tumor tissue was associated with better overall survival of patients with early BCLC-0 and BCLC-A HCC stages but not with intermediate BCLC-B or late BCLC-C stages (Fig. 4). No associations between PDGFA overexpression and progression-free survival were found (data not shown).
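As a rough illustration of the survival estimate behind such a comparison, the following Python sketch computes a Kaplan-Meier curve for a single group. The function name and the follow-up data are hypothetical. The log-rank comparison between groups is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored cases both leave the risk set
    return curve

# Hypothetical follow-up (months) for one group; not study data.
curve = kaplan_meier(times=[5, 8, 8, 12, 20], events=[1, 1, 0, 1, 0])
```

Running the same function separately on the PDGFA-high and PDGFA-low patients of one BCLC group would give the two curves to be compared.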

Discussion
The discovery of tumor biomarkers has significantly improved the outcome for cancer patients and opened new possibilities for early diagnosis and targeted treatment of malignant tumors [21]. The only serum HCC biomarker approved for clinical practice is AFP [4], which displays 59 % sensitivity and 90 % specificity [22]. Since AFP exhibits insufficient sensitivity for confident HCC diagnosis, additional markers that could complement AFP and improve HCC diagnostic accuracy are under investigation [23]. Currently GPC3 is considered to be one of the most promising HCC candidate biomarkers. It can be detected at the mRNA level in liver tissue or at the protein level in serum or liver tissue. Immunohistochemical detection of GPC3 demonstrates a high sensitivity for poorly differentiated HCC but a lower sensitivity for highly differentiated and fibrolamellar variants [19]. GPC3 mRNA was found to be overexpressed in more than 80 % of HCC cases associated with viral hepatitis and in 76 % of non-viral HCC cases [24]. However, measurement of the serum GPC3 level was less sensitive (55.2 %) while retaining a high specificity (84.2 %). The combination of GPC3 with AFP was found to be more effective for HCC diagnosis, with 75.7 % sensitivity and 83.3 % specificity [25]. Thus, we chose GPC3 as a "reference" HCC biomarker and compared the data obtained for PDGFA with the GPC3 performance.
Our analysis of the expression data from our HCC set and publicly available databases revealed frequent PDGFA overexpression in HCC tissue. Although PDGFA was previously reported to be overexpressed in HCC [26], no detailed investigation of its expression alterations or its potential as an HCC biomarker has been published to date.
The high rate of PDGFA up-regulation in the 19 examined hepatitis-negative HCC cases was comparable to that of GPC3. However, PDGFA did not perform as well in larger and less homogeneous datasets, exhibiting a lower sensitivity than GPC3. While most cases from publicly available datasets demonstrated up-regulation of both PDGFA and GPC3, there were subsets with mutually exclusive overexpression of PDGFA or GPC3, indicating that their combination could perform better than either biomarker separately. Indeed, when a 2-fold increase in the expression level of either PDGFA or GPC3 was taken as the cut-off, the sensitivity of HCC detection increased considerably, up to 93.6 %. The analysis of the discriminatory power of the biomarkers revealed that PDGFA and GPC3, when combined, distinguished HCC from NT tissue of the same patients better than PDGFA or GPC3 individually. PDGFA, a secreted protein detectable in patient serum, may be considered a potential HCC diagnostic marker at the mRNA or protein level, especially in combination with GPC3, which substantially improves on the lower sensitivity of PDGFA alone. The association of PDGFA up-regulation with better overall survival of patients with the early BCLC-0 and BCLC-A HCC stages and with weaker invasion of tumor cells into blood vessels suggests that it can be regarded as a prognostic factor. However, this putative prognostic impact is limited, since it is not observed in the groups with BCLC stages B and C. Hence, PDGFA up-regulation may be considered a factor of favorable prognosis, but validation of this hypothesis requires further studies of larger patient cohorts.

Conclusion
The present study demonstrates that PDGFA is frequently overexpressed in HCC tissue. The combination of PDGFA and GPC3 performs well in distinguishing HCC from NT tissue when detected at the mRNA level. PDGFA up-regulation might have prognostic potential for patients with early HCC stages. We suggest that PDGFA may be a promising HCC diagnostic biomarker. Further studies focused on the detection of PDGFA in the tumor tissue and serum of HCC patients are necessary to define its efficiency (either alone or in combination with other biomarkers) and its validity for improving the sensitivity of early-stage HCC detection.