A role of expression level of reference and investigated genes in prostate tumors for qPCR analysis

G. V. Gerashchenko, E. O. Stakhovsky, L. I. Chashchina © 2018 G. V. Gerashchenko et al.; Published by the Institute of Molecular Biology and Genetics, NAS of Ukraine on behalf of Biopolymers and Cell. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited UDC 577.218+616.65


Introduction
Quantitative real-time PCR (qPCR) is a widely used method for assessing gene expression in basic and clinical research [1][2][3]. Relative quantification requires a reference gene (or several reference genes) for normalization of gene expression. Usually, several housekeeping genes are used for this purpose [4]. The key property of a reference gene is constitutive expression under various experimental conditions, in pathological processes and in specific tissues.
It is known that the expression of many genes, including some housekeeping genes, is altered during carcinogenesis. This creates problems when searching for reference genes for qPCR normalization, as there are no reference genes universal for all types of tumors [5]. Such genes must be validated according to tumor type and experimental conditions. Moreover, the features of their expression should also be considered. This is especially important for low-expressed genes, which are often the subject of research because of their specific functions in physiological and pathological processes.
Validation of reference genes for prostate tumors, lymph nodes from patients with prostate cancer, and prostate cancer cell lines has produced a set of genes, namely TBP, HPRT1, ALAS1, TUBA1B, GAPDH and B2M, that are expressed constitutively in prostate cancer and normal tissues, making them suitable for qPCR normalization [6][7][8][9].
In the present work, we used four reference genes (TBP, HPRT1, ALAS1, TUBA1B) in different combinations, from one to four genes, to compare the qPCR results after normalization.
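As an illustration of how normalization against one to four reference genes can be set up, the following Python sketch enumerates every combination of the four reference genes and computes relative expression against the mean Ct of each combination. The Ct values, the function names and the averaging rule (arithmetic mean of Ct values) are assumptions made for illustration; the text does not state how the reference genes were combined.

```python
from itertools import combinations
from statistics import mean

# Hypothetical Ct values for one sample (illustrative only, not data from this study).
REFERENCE_CT = {"TBP": 27.0, "HPRT1": 25.0, "ALAS1": 19.0, "TUBA1B": 18.0}
TARGET_CT = 24.0  # hypothetical Ct of an investigated gene


def reference_combinations(genes):
    """Return all combinations of 1..len(genes) reference genes as sorted tuples."""
    genes = sorted(genes)
    return [combo
            for k in range(1, len(genes) + 1)
            for combo in combinations(genes, k)]


def relative_expression(target_ct, ref_cts):
    """RE by the 2^-dCt method, with dCt = Ct(target) - mean Ct(references).

    Averaging Ct values is an assumption of this sketch.
    """
    delta_ct = target_ct - mean(ref_cts)
    return 2.0 ** (-delta_ct)


combos = reference_combinations(REFERENCE_CT)
re_by_combo = {
    combo: relative_expression(TARGET_CT, [REFERENCE_CT[g] for g in combo])
    for combo in combos
}

for combo, re_value in re_by_combo.items():
    print("+".join(combo), round(re_value, 4))
```

Four genes give 15 possible combinations; comparing the RE values across them shows how strongly the result depends on the choice of reference set.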

Materials and Methods
Collection of prostate tissues. The samples of cancer tissues and conventional normal tissues (CNT, taken from the other prostate lobe outside of the tumor) were frozen in liquid nitrogen immediately after surgical resection at the National Cancer Institute (Kyiv, Ukraine). Benign prostate tumors (prostate adenoma samples) were collected at the Institute of Urology (Kyiv, Ukraine) after radical prostatectomy and frozen as described above. All samples were collected in accordance with the Declaration of Helsinki and the guidelines issued by the Ethics Committee of the Institute of Urology of National Academy of Medical Sciences of Ukraine and of the National Cancer Institute of National Academy of Sciences of Ukraine (NASU), and the Ethics Committee of the Institute of Molecular Biology and Genetics of NASU. Experimental studies were conducted using 37 prostate adenocarcinoma samples of different Gleason scores and at various stages, 37 corresponding conventional normal tissue (CNT) samples, and 20 adenoma samples [10,11]. The tumor samples were characterized according to the International System of Classification of Tumors, based on the tumor-node-metastasis (TNM) and the World Health Organization (WHO) criteria. The clinical characteristics of the tumors were described earlier [10,11].
Total RNA isolation and cDNA synthesis. Frozen prostate tissue (50-70 mg) was homogenized to a powder in liquid nitrogen. Total RNA was isolated using TRI-reagent (Sigma-Aldrich, USA). The concentration of the isolated total RNA was measured with a spectrophotometer (NanoDrop Technologies Inc., USA). The quality of the RNA was determined by electrophoresis in a 1 % agarose gel from the band intensity of 28S and 18S rRNA (28S/18S ratio). 1 µg of the total RNA was treated with RNase-free DNase I (Thermo Fisher Scientific, USA); cDNA was synthesized using RevertAid H-Minus M-MuLV Reverse Transcriptase (Thermo Fisher Scientific, USA).
Statistical analysis. The Kolmogorov-Smirnov test was used to analyze the normality of distribution. The relative expression (RE) levels in prostate adenocarcinoma and paired CNT were compared using the Wilcoxon Matched Pairs test. In the 2^(-ΔΔCt) model, RE fold differences were considered significant when expression changed (increased or decreased) more than 2-fold. The Fisher exact test was used to assess differences between experimental groups. The differences between experimental groups (adenocarcinomas, CNT and adenomas) were determined by the Kruskal-Wallis test, followed by tests for multiple comparisons. The Dunn-Bonferroni post-hoc test was performed to determine RE differences between pairs of prostate samples under multiple gene comparisons [13]. The Benjamini-Hochberg procedure, with the false discovery rate (FDR) set at 0.10-0.25, was used to adjust for multiple comparisons [14].
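The Benjamini-Hochberg step-up procedure mentioned above can be sketched in a few lines of Python. This is a generic textbook formulation, not the authors' analysis script; the function name, the example p-values and the chosen FDR of 0.2 are assumptions for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.2):
    """Return a list of booleans, True where the hypothesis is rejected at the given FDR.

    Step-up rule: find the largest rank k with p_(k) <= (k / m) * fdr and
    reject all hypotheses whose p-value has rank <= k.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four genes (illustrative only).
example_p = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(example_p, fdr=0.2))
```

Because the rule is step-up, a p-value that fails its own rank threshold is still rejected when a larger p-value passes a higher rank threshold.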

Results
The RE of 23 genes, representing markers of cancer-associated fibroblasts (CAF) (the CAF gene group), tumor-associated macrophages (TAM) (the TAM gene group) and inflammation-associated genes (the INF gene group), was determined. The genes were also divided by RE level into three groups: high expression (Ct < 20 cycles), moderate expression (Ct = 20-29 cycles) and low expression (Ct > 29 cycles).
The reference genes ALAS1 and TUBA1B showed a high level of expression, whereas TBP and HPRT1 were expressed at a moderate level. TBP demonstrated the lowest expression level among the reference genes. Only three (ACTA2, MSMB and HLA-G) of the 23 studied genes demonstrated high RE levels. Ten genes were expressed at a moderate level and ten at a low level.
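The Ct-based grouping described above amounts to a simple threshold rule, sketched below. The function name and the example Ct values are hypothetical; only the thresholds (Ct < 20, 20-29, > 29) come from the text.

```python
def expression_group(ct):
    """Classify a gene by its Ct value using the thresholds stated in the text."""
    if ct < 20:
        return "high"
    if ct <= 29:
        return "moderate"
    return "low"


# Hypothetical Ct values (illustrative only, not measurements from this study).
example_ct = {"gene_a": 18.5, "gene_b": 24.0, "gene_c": 31.2}
groups = {gene: expression_group(ct) for gene, ct in example_ct.items()}
print(groups)
```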
We performed a theoretical calculation of the hypothetical deviation in RE for reference genes expressed at high and low levels, taking 0.5 Ct as a hypothetical error. The RE of the studied genes was calculated using the 2^(-ΔCt) method (Table 1). Our calculations showed that the RE deviation caused by a 0.5 Ct error in the reference gene was the same (a factor of 1.414) for all analyzed genes, regardless of the expression level of the reference gene (Table 1). These data indicate the importance of constitutive expression of the reference gene when comparing the RE of the analysed and the reference genes.
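The reason the deviation is constant follows directly from the 2^(-ΔCt) formula: shifting the reference Ct by an error e multiplies RE by 2^e, whatever the Ct values are, and 2^0.5 ≈ 1.414. The short Python sketch below checks this numerically. The Ct levels used for high, moderate and low expression are hypothetical values chosen to fall into the three Ct ranges given in the text; they are not the values of Table 1.

```python
def re_delta_ct(ct_inv, ct_ref):
    """Relative expression by the 2^-dCt method, dCt = Ct(inv) - Ct(ref)."""
    return 2.0 ** (-(ct_inv - ct_ref))


def fold_deviation(ct_inv, ct_ref, error=0.5):
    """Ratio of RE with the reference Ct shifted by `error` to the unshifted RE."""
    return re_delta_ct(ct_inv, ct_ref + error) / re_delta_ct(ct_inv, ct_ref)


# Hypothetical Ct levels for high, moderate and low expression (assumed values).
HYPOTHETICAL_CT = {"high": 18.0, "moderate": 25.0, "low": 31.0}

deviations = {
    (inv_level, ref_level): fold_deviation(ct_inv, ct_ref)
    for inv_level, ct_inv in HYPOTHETICAL_CT.items()
    for ref_level, ct_ref in HYPOTHETICAL_CT.items()
}

for pair, deviation in deviations.items():
    print(pair, round(deviation, 3))
```

Every combination of investigated and reference gene level yields the same factor; an error of -0.5 Ct would give the reciprocal, about 0.707.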
The experimental data, calculated using the 2^(-ΔΔCt) model, showed statistically significant differences between paired tumor and CNT samples (T/CNT) for 17 out of 23 investigated genes in one reference group (Table 2).
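For a single tumor/CNT pair, the 2^(-ΔΔCt) calculation and the 2-fold criterion from the Statistical analysis section can be sketched as follows. The function names and Ct values are hypothetical, and treating a change of exactly 2-fold as significant is an assumption of this sketch.

```python
def fold_change_ddct(ct_inv_tumor, ct_ref_tumor, ct_inv_cnt, ct_ref_cnt):
    """Fold change of tumor relative to paired CNT by the 2^-ddCt model."""
    delta_tumor = ct_inv_tumor - ct_ref_tumor
    delta_cnt = ct_inv_cnt - ct_ref_cnt
    return 2.0 ** (-(delta_tumor - delta_cnt))


def classify_change(fold, threshold=2.0):
    """Label a fold change using a symmetric threshold (2-fold by default)."""
    if fold >= threshold:
        return "increased"
    if fold <= 1.0 / threshold:
        return "decreased"
    return "unchanged"


# Hypothetical Ct values for one tumor/CNT pair (illustrative only).
fold = fold_change_ddct(ct_inv_tumor=24.0, ct_ref_tumor=20.0,
                        ct_inv_cnt=26.0, ct_ref_cnt=20.0)
print(fold, classify_change(fold))
```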
A complete match of the statistical results across all three reference groups was observed for 16 out of 23 genes. Eleven of these 16 genes showed significant changes of RE in all three reference groups; 7 of these genes were expressed at high and moderate levels. Divergences of RE were observed for 7 genes in 10 comparative groups, 6 of which showed low expression. Thus, the threshold value of matching differences for highly and moderately expressed genes was set at 25-30 % (10-11 samples out of 37), whereas for low-expressed genes the value should be no less than 35 % (more than 13 samples out of 37), to avoid possible expression deviations of the reference genes and to minimize the influence of qPCR inhibitors on the analysis of low-expressed genes.
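The sample-proportion thresholds can be expressed as a simple check, sketched below. Using 25 % as the cut-off for highly and moderately expressed genes (the lower end of the stated 25-30 % range) is an assumption of this sketch, as are the names used.

```python
# Minimum fraction of samples with a changed RE, per expression group.
# 0.25 is the lower bound of the 25-30 % range in the text (assumed cut-off).
MIN_FRACTION = {"high": 0.25, "moderate": 0.25, "low": 0.35}


def passes_threshold(n_changed, n_total, group):
    """True if the share of samples with changed RE reaches the group's threshold."""
    return n_changed / n_total >= MIN_FRACTION[group]


print(passes_threshold(10, 37, "moderate"))  # 10/37 is about 27 %
print(passes_threshold(12, 37, "low"))       # 12/37 is about 32 %
```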
RE values were investigated using the 2^(-ΔCt) method for three sets of samples, grouped as the TNA, cancer stage and GL groups. A high similarity was found for all three reference groups with the different types of grouping of the analyzed samples (> 82 % for the TNA group, 69 % for the cancer stage group, 64.5 % for the GL group). 10 out of 23 genes in the TNA sample groups showed significant differences in RE in 17 pairs (Table 3A). No similarity among the three reference group normalizations was observed in 3 of these 17 pairs of TNA sample groups (17.65 %), where RE fold changes were less than 1.7-fold.
Another type of grouping, by tumor stage (Table 3B), demonstrated significant differences in RE for 14 genes in 45 pairs of sample groups. No similarity in the 3 reference group
Notes: statistically significant differences between T/CNT, calculated using the Fisher exact test with correction for multiple comparisons (FDR = 0.2), are shown in bold (black and red); bold black - statistically significant differences that match completely across all groups of reference genes; bold red - divergences of statistical results between reference groups; & - significant differences with FDR = 0.2; green boxes - highly expressed genes; white boxes - moderately expressed genes; pink boxes - low-expressed genes.

Discussion
The hypothetical calculations indicate that the expression levels of both the reference and the analyzed genes do not influence the deviation (variation) in the obtained RE when the 2^(-ΔCt) method is used. This confirms the need for constitutive expression of reference genes in all analyzed samples [5,6]. Some caution is needed with low-expressed genes; for example, the effect of PCR inhibitors may increase during PCR analysis. By PCR inhibitors we mean primer dimers, non-specific products and loss of Taq polymerase activity [15][16][17]. All these factors adversely affect the efficiency of PCR and thus result in erroneous RE levels. This, in turn, makes low-expressed genes difficult to assess, regardless of the optimization of qPCR conditions. This is especially important if the reference genes are expressed at low levels. Therefore, low-expressed genes should not be chosen as references.
Other parameters that affect RE are the magnitude of the fold changes and the proportion of samples in which the expression of a given gene changed significantly. The high heterogeneity of gene expression in prostate cancer samples [18] complicates this impact. Notably, when the fold change is high, the expression levels of the reference genes do not influence the calculated values, as shown by our results and literature data [7,13]. When we compared changes lower than 2-fold, or changes occurring in fewer than 30 % of all studied samples, we could obtain both false positive and false negative results even if the differences in RE were statistically significant: differences could appear where none are present, groups could overlap, etc. This impact became more evident when low-expressed genes were analysed using both the 2^(-ΔCt) and 2^(-ΔΔCt) methods.
The next important factor in the statistical analysis is the number of samples in a group [19]. This is supported by the data presented in this article. For example, the largest groups (20 to 37 samples, TNA group) produced the lowest proportion of inconsistencies in statistical results across all reference groups. Additionally, groups of this size demonstrated the highest rate of matching results (82 %) and the lowest fold-change threshold (1.7-fold) at which statistically significant differences between the analysed groups were observed for all reference genes.
The type of grouping is no less important than the number of samples in groups. The gene expression pattern correlates with clinical and pathological characteristics, which makes it possible to identify genes with altered expression at a given stage of the disease (HIF1A, CD68, CCL22, NOS2A) or related to a specific GL score (HIF1A, CCL22, NOS2A, IL2RA). Notably, in the TNA group, which contained tissues collected at different stages of disease or tumors with various GL scores, the expression changes were nullified because of the high RE deviation.

Conclusions
All three types of reference genes can be used for normalization of RE in prostate tumor samples. The differences in the expression levels of the investigated and reference genes have no impact when either the 2^(-ΔCt) or the 2^(-ΔΔCt) model is used; constitutive expression of the reference genes is the important parameter. Thus, the expression values of the analysed genes, the magnitude of RE changes, the number of samples in groups and the high heterogeneity of gene expression are important parameters for choosing the threshold differences between groups of samples for reliable data interpretation.

Table 1. Calculation of changes in RE of investigated (Inv) and reference (Ref) genes expressed at different levels (high (h), moderate (m) and low (l)) when the hypothetical error (e) was 0.5 Ct.
Notes: Ref h - high expression of the reference gene; Ref he - Ref h with 0.5 Ct error; Ref m - moderate expression of the reference gene; Ref me - Ref m with 0.5 Ct error; Ref l - low expression of the reference gene; Ref le - Ref l with 0.5 Ct error.