Distribution and expression of chicken endogenous retroviruses in the host genome

The distribution of seven groups of endogenous retroviruses (ALV-related ev loci, HERV-I-related proviruses, EAV-HP, EAV-0, E-33, E-51 and ART-CH) in the genome of the domestic chicken (Gallus gallus) was studied in relation to the composition (long-range variation of GC content) of the host genome. GC-rich proviruses (ev loci, ART-CH and EAV-0) were localized mainly in the GC-richest isochore families H2, H3 and H4, whereas the GC-poor HERV-I-related proviruses, E-51 and E-33 were localized in the GC-poor isochore families L1 and L2 and also in H1 (isopycnicity). GC-rich EAV-HP, besides its expected distribution in GC-rich isochores, is also present in GC-poor compartments. Investigation of expression by RT-PCR and analysis of EST databases revealed tissue-specific patterns of expression, which could change the picture of proviral distribution through reintegration. The isopycnicity of endogenous retroviruses is likely to result from the compositional match between the integrant and the chromosomal region of the host, which leads to stable integration.

Introduction. Retroviruses are known to integrate as proviruses into the host-cell genome. Integration is a critical step in the retroviral life cycle, and replication is not possible without it [1]. Endogenous retroviruses have integrated into the genome of a germ cell and are inherited along with the rest of the host genome as Mendelian genes.
The widely accepted view that retroviral integration into the host genome occurs randomly [2] was challenged by the development of methods for the compositional fractionation of vertebrate DNA [3]. This approach led to the discovery of isochores, long (> 300 kb), compositionally homogeneous DNA segments [4], and showed that, in the case of stable integration, proviruses of exogenous mammalian retroviruses (bovine leukemia virus, BLV; Rous sarcoma virus, RSV; human T-cell leukemia virus type I, HTLV-I; human immunodeficiency virus type 1, HIV-1) as well as one endogenous retrovirus (mouse mammary tumor virus, MMTV) are located in particular regions of the host genome (compartments) and in host sequences compositionally matching the viral sequences (isopycnicity) (reviewed in [5]).
In the case of primary infection, as shown by mapping of HIV-1 integration sites [6], retroviruses integrate preferentially into gene-rich compartments, which are characterized by an open chromatin structure [4].
There is a correlation between the isopycnic localization of a provirus and its transcription [7,8].
Our study of the integration specificity of avian endogenous retroviruses is of interest for several reasons. First, the compositional organization of the avian genome differs from that of the mammalian genome: the isochore pattern of the avian genome is characterized by the GC-richest isochore family H4 [4,9]. Moreover, endogenous retroviruses coexist with the host genome for a long period of time, which affects their structure and, probably, their localization in the genome. It has been shown that «ancient» endogenous MMTV is localized isopycnically, while exogenous sequences and recently acquired endogenous MMTV show a broader distribution [10]. Similar results have been obtained for Alu repeats: the older the Alus, the stronger the bias of their localization towards GC-rich DNA (isopycnicity) [11].
Three families of chicken (Gallus gallus) endogenous retroviruses, differing in structure and evolutionary age, have been described to date: 1) ALV (avian leukosis virus)-related ev loci; 2) the EAV family with the subfamilies EAV-HP, EAV-0, E-51, E-33 and ART-CH; and 3) HERV-I (human endogenous retrovirus type I)-related retroviruses (reviewed in [12,13]). The EAV family is restricted to the genus Gallus and present in all its species, while ev loci are specific to the domestic chicken and its wild relative, the red jungle fowl, and are therefore younger than EAV [14].
HERV-I-related retroviruses are known in several classes of vertebrates and are the oldest family of chicken endogenous retroviruses [15].
In the present work we studied the integration specificity of the endogenous retroviruses mentioned above in the light of their evolutionary age and expression.
Materials and Methods. DNA and RNA isolation and compositional fractionation. DNA was isolated from adult liver and blood (chicken line CB (B12/B12)) after overnight digestion at 55 °C in 500 µl of extraction buffer (10 mM Tris, 400 mM NaCl, 2 mM EDTA, 2 % SDS, Proteinase K), followed by chloroform extraction and ethanol precipitation. The average size of the DNA was 50 kb, as determined by electrophoresis.
Compositional fractionation of DNA by preparative centrifugation in a CsCl density gradient and analytical centrifugation were carried out as previously described [16,17].
Total RNA was prepared from 18-day-old embryo fibroblasts and v-src-induced tumors of chicken line CB (B12/B12) and examined by electrophoresis on formaldehyde-agarose gels.
Probes and hybridization. The presence of proviruses in DNA fractions was determined by hybridization with three types of probes: 1) long probes (from 100 bp) known from the literature or obtained using PCR primers designed with the program Primer3 (wwwgenome.wi.mit.edu/cgi-bin/primer); 2) oligonucleotide probes designed with Primer3; 3) the genome of Rous sarcoma virus without the src gene and the 3' LTR [18] (table 1).
PCR conditions for the primers ART.1, ART.2, PRO, JO, E51.1, E51.2 and H3, H4 have been described previously in the papers referenced in table 1. PCR with primers designed with Primer3 was carried out in a 50 µl volume containing 300 ng of genomic DNA, 200 µM dNTP, 0.2 mM of each primer and 2 units of Taq polymerase («Roche», France). DNA was denatured at 94 °C for 3 min and subjected to 30 cycles of 1 min at 94 °C, 40 s at 56 °C and 40 s at 72 °C, with a final extension of 7 min at 72 °C. PCR products were cloned using a TA-cloning kit («Promega», USA) and sequenced.
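The cycling program given above can be written down as data, which makes the total programmed time easy to check. The sketch below is illustrative only: the names `CYCLE_STEPS` and `total_hold_seconds` are hypothetical, and ramp times between temperatures, which depend on the thermocycler, are ignored.

```python
# Thermal cycling program for the Primer3-designed primers, as stated in the text.
# All names are hypothetical; ramp times between steps are not included.
INITIAL_DENATURATION = (94, 180)               # (temperature in deg C, seconds)
CYCLE_STEPS = [(94, 60), (56, 40), (72, 40)]   # denaturation, annealing, extension
CYCLES = 30
FINAL_EXTENSION = (72, 420)


def total_hold_seconds(initial, steps, cycles, final):
    """Sum the programmed hold times, ignoring temperature ramps."""
    per_cycle = sum(seconds for _, seconds in steps)
    return initial[1] + cycles * per_cycle + final[1]


total = total_hold_seconds(INITIAL_DENATURATION, CYCLE_STEPS, CYCLES, FINAL_EXTENSION)
print(f"{total} s = {total / 60:.0f} min")
```

Under these assumptions the programmed hold time is 4800 s, that is 80 min, before any ramping overhead.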
The hybridization signals were evaluated with a PhosphorImager. The Gaussian curves of proviral distribution were obtained using the program IgorPro (WaveMetrics Inc., USA).
To verify RT-PCR products, Southern blot hybridization (with the same probes amplified from genomic DNA) was performed under stringent conditions. In the case of ev-1, the probe evgag-pol.lP (taacgcaattagtggaaaaagaat) [22] was used.
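As a rough illustration of how a single hybridization peak can be summarized by a Gaussian curve, the sketch below estimates the centre and width of a profile from its signal-weighted moments. This is not the IgorPro fitting procedure used in the study: it is a simpler stand-in that only makes sense for one well-separated peak, the function names are hypothetical, and the example values are made up.

```python
import math


def gaussian_moments(gc_levels, signals):
    """Estimate the centre and width of one hybridization peak.

    gc_levels: GC content (%) of each gradient fraction.
    signals:   hybridization signal measured in each fraction.
    Returns (mean, standard deviation) of the signal-weighted distribution.
    """
    total = sum(signals)
    if total <= 0:
        raise ValueError("signals must sum to a positive value")
    mean = sum(g * s for g, s in zip(gc_levels, signals)) / total
    variance = sum(s * (g - mean) ** 2 for g, s in zip(gc_levels, signals)) / total
    return mean, math.sqrt(variance)


def gaussian(x, mean, sd, height=1.0):
    """Value at x of a Gaussian curve with the given centre, width and height."""
    return height * math.exp(-((x - mean) ** 2) / (2 * sd ** 2))


# Made-up example values, not data from this study.
mean, sd = gaussian_moments([53.0, 55.0, 57.0], [1.0, 2.0, 1.0])
print(mean, sd)
```

Profiles with several peaks would need a proper multi-component fit rather than moments.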

Results and Discussion. Compositional analysis of chicken endogenous retroviruses. Compositionally, retroviruses fall into two classes: a GC-poor class and a GC-rich class [27,28]. As shown in table 2, four chicken endogenous retroviruses (ev loci, ART-CH, EAV-HP and EAV-0) are GC-rich and therefore belong to the latter class. The other three groups, though their whole genomic structure has not been identified, appear to be GC-poor.
Interestingly, the GC levels of the gag, pol and env genes are close to that of the whole genome. In contrast, LTRs are GC-poor even in GC-rich retroviruses (except EAV-HP); this feature is known for avian retroviruses only [27]. Therefore the low GC level of the E-33 LTR does not necessarily indicate that E-33 belongs to the GC-poor class. However, since it shows a high percentage of similarity to E-51 [13], we can assume that E-33 is GC-poor.
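The GC level of a sequence, the quantity compared throughout this section, is the percentage of G and C among its bases. A minimal sketch of that calculation follows; the function name `gc_content` is hypothetical, and ignoring ambiguous symbols such as N is an assumption of this example rather than a procedure stated in the paper.

```python
def gc_content(sequence):
    """Return the GC content of a nucleotide sequence as a percentage.

    Only unambiguous bases (A, C, G, T/U) are counted; other symbols
    such as N or gaps are ignored.
    """
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "ATU")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# The ev-1 probe sequence quoted in Materials and Methods:
# 7 of its 24 bases are G or C.
probe = "taacgcaattagtggaaaaagaat"
print(round(gc_content(probe), 1))
```

The same function applied to a gene, an LTR or a whole provirus gives values comparable to those listed in table 2.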
E-51 and HERV-I-related retroviruses have GC-poor env and pol genes, respectively. The latter was estimated by sequencing a HERV-I-related fragment from the CB (B12/B12) chicken genome [29]. The GC-poorness of this group was confirmed by compositional analysis of HERV-I sequences present in the human endogenous retrovirus database (http://herv.img.cas.cz/) [30] and of HERV-I-related retroviruses sequenced from the genomes of other vertebrates [15]. Their GC content ranges from 42 to 45 %.

BORISENKO L. С., RYNDITCH A. V., BERNARDI G.

NI, not identified; GC content of the sequence between LTRs, which contains part of gag; sequences available from BBSRC EST clones.
The localization of chicken endogenous retroviruses in the host genome. ev loci. Using a PCR assay [31], we found that the genomic DNA of the CB (B12/B12) chicken contains three ALV-related proviruses (ev loci): ev-1, ev-7 and ev-10 [32]. The probe used for their detection in compositional DNA fractions (the RSV genome without the src gene and the 3' LTR) is able to hybridize with all of them. Two different hybridizations gave very similar results: ev loci were centered in GC-rich fractions, with peaks at 55 % GC and 57 % GC (fig. 1), which corresponds to the border between isochore families H3 and H4. This matches the high GC level of ALV-related proviruses very well (table 2) and indicates isopycnic localization.
HERV-I-related retroviruses. These GC-poor proviral sequences were centered at 42 % GC and 49 % GC (fig. 2). Interestingly, the first peak closely matches the GC level of the hybridization probe (42.3 % GC), which represents parts of the pro and pol genes. The peaks correspond to the GC-poor isochore family L2 and to H2. It is possible that the second peak is due to retroviral reintegration.
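The comparison made here, a provirus GC level set against the GC level of each hybridization peak, can be expressed as a simple check. The sketch below is only an illustration: the function name `isopycnic_peaks` is hypothetical, and the default tolerance of 2 percentage points is an arbitrary assumption, not a criterion used in the study.

```python
def isopycnic_peaks(provirus_gc, peak_gc_levels, tolerance=2.0):
    """Return the peaks whose GC level lies within `tolerance` of the provirus.

    tolerance is a hypothetical cut-off in percentage points of GC.
    """
    return [p for p in peak_gc_levels if abs(p - provirus_gc) <= tolerance]


# Values from the text: probe at 42.3 % GC, peaks at 42 % and 49 % GC.
matching = isopycnic_peaks(42.3, [42.0, 49.0])
print(matching)
```

With these values only the 42 % GC peak counts as compositionally matched, which is consistent with reading the 49 % GC peak as a possible result of reintegration.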
E-33. We used two probes of different specificity to detect the E-33 LTR in DNA from liver, because the E-33 genes have not been sequenced yet.
The probes E33LTR and E33.1-E33.2 reveal three peaks: at 39 % GC, 47 % GC and 55-56 % GC (fig. 3). Since the probes are not specific to E-33 and could also detect ART-CH and E-51 (table 1), it is reasonable to assume that the 47 % GC peak corresponds to E-51 and the 55-56 % GC peak to ART-CH. Therefore the GC-poor peak at 39 % GC, which lies at the border between isochore families L1 and L2, is the actual site of E-33 localization.
It should be mentioned that such a shift could be due to the different specificity of the hybridization probes used in the two cases: the probe used to detect EAV-HP sequences in DNA from blood cells could also reveal E-51 sequences.
Expression of chicken endogenous retroviruses. RT-PCR and subsequent hybridization demonstrated expression of five chicken endogenous retroviruses (E-51, E-33, EAV-0, EAV-HP and HERV-I-related) in embryonic fibroblasts and v-src-induced tumors (fig. 5). The sizes of the hybridization bands were identical in both cases. We did not detect expression of ev-1 or ART-CH in these tissues. The other ALV-related retroviruses present in the genome of the CB (B12/B12) chicken, ev-7 and ev-10, were not studied for expression since their sequences are not known.

Fig. 3. Hybridization of E-33 with probes E33LTR (A) and E33.1-E33.2 (B) on the chicken DNA from liver
Expression in other tissues was studied by analysis of EST databases. In total, proviral sequences were found in 21 tissues and cell types (table 3). HERV-I-related proviruses and E-33 do not seem to be expressed in the tissues represented in the databases. However, the expression of E-33 remains unclear, because only the LTR (the only sequenced part) was used for searching EST clones. Retroviral transcripts were not found in muscle tissue or macrophages.
Expression was observed both in liver and in blood cells, which we used for localization of proviruses. In liver, EAV-HP, HERV-I-related proviruses, E-33 and E-51 were not expressed. In blood cells, such as B-cells, T-cells and intestinal lymphocytes, and in lymphoid tissue, EAV-0, E-33 and HERV-I-related sequences were not found.
Besides the tissue specificity of expression, the representation of different retroviral genes in EST databases is non-uniform. ev loci are expressed mainly in pancreas and heart; truncated pol-transcripts are present in minor quantities. ART-CH transcripts include the whole length of the retrotransposon; they were found in adult cerebrum and 16-day embryo brain, though brain expression had not been identified before [19]. The majority of EAV-0 transcripts were env-specific; there were also pol-sequences, which had not been found in other retroviruses, and gag-sequences not known before. EAV-HP is expressed mainly in pancreas as env-specific transcripts, which were observed more frequently than gag-specific transcripts.

Fig. 4. Hybridization of provirus EAV-HP with probes HPLTR (A), HP.up-HP.down (B) and H3-H4 (C) on the chicken DNA from liver (A, B) and blood (C)
The localization of seven groups of chicken endogenous retroviruses in compositional fractions, as well as the correlation between their isochore localization and their transcription, has been studied.
The proviral distribution appears to be compartmentalized and isopycnic: GC-rich ev loci, ART-CH and EAV-0 were found mainly in the GC-richest isochore families H2, H3 and H4, approximately matching the viral genome in base composition. The GC-poor HERV-I-related proviruses, E-51 and probably E-33 are localized in the GC-poor isochore families L1 and L2 and in H1, the least GC-rich of the H families. GC-rich EAV-HP, besides the expected peak in GC-rich isochores, also has peaks in GC-poor compartments.
The main conclusion from the study of expression of chicken endogenous retroviruses is that expression is tissue-specific. For instance, EAV-HP, unlike the other proviruses, is highly expressed in pancreas but is not expressed in liver. ART-CH and ev-1 are transcribed in almost all tissues studied except embryonic fibroblasts. In contrast, embryonic fibroblasts are the only cell type in which expression of HERV-I-related proviruses has been found.
Different levels of expression may affect the picture of proviral localization, since after transcription a retrovirus may integrate into new sites of the host genome. Deleted proviruses with an inactive pol gene use the reverse transcriptase of a helper virus [19,33]. Analysis of EST databases showed that a considerable proportion of pol-transcripts belongs to EAV-0, which, along with the structurally complete ev loci, may be the main provider of reverse transcriptase in the chicken genome. Indeed, as demonstrated by Weissnahr et al. [34], EAV-0 is able to produce virus-like particles with an active reverse transcriptase. Taking into account tissue-specific expression and the possibility of reintegration, it is not surprising that the picture of proviral localization may differ between tissues even for a single retrovirus.
An obvious explanation for the compartmentalized, isopycnic integration of retroviral sequences that lack oncogenes is that it results from selection for certain integration sites, the activation of an oncogene providing a replicative advantage to the infected cell [5]. However, since endogenous retroviruses, unlike exogenous ones, do not activate oncogenes and appear to have no function at all (see [13] for review), the host cell cannot gain any advantage from their integration. In this case, the reasons for compartmentalization may be comparable to those for interspersed repeats, another permanent component of the genome. Olofson and Bernardi [35] demonstrated that the base composition of CR1 (48 % GC), an ancient class of non-LTR retrotransposons from the chicken genome, matches that of isochore family H1, which mainly harbors it. As in the case of mammalian Alus and LINEs and chicken CR1, the factors behind endogenous retrovirus isopycnicity are likely to be the compositional match between the integrant and the chromosomal region of the host, and the degree of interference of the integrated sequences with the function of neighboring genes [28].
Another question concerns the reasons for the non-isopycnic localization of EAV-HP. In general, proviral localization outside an isopycnic chromosomal environment, known for exogenous retroviruses that activate oncogenes, suggests that selection for a replicative advantage can override isopycnic integration. Examples of such integration have been found in the case of MMTV activating the Wnt/int oncogene (reviewed in [5]).
The preferential expression of EAV-HP in pancreas resembles the tissue-specific, hormone-dependent and developmentally regulated expression known for MMTV [10]. In addition, GC-poor isochores, where EAV-HP was also found, contain more tissue-specific and developmentally regulated genes than the GC-rich ones [4].
Thus EAV-HP may have an unknown function associated with genes located in GC-poor isochore families. Some human endogenous retroviruses are known to have developmental functions [36].
We can partially confirm the finding known for MMTV and Alus: «the older retroelement, the stron-

Table 3 Expression of chicken endogenous retroviruses in different tissues (on the basis of EST databases analysis)
Tissues 1-15 are from the BBSRC database; 16-19 are from the University of Delaware database; 20 is from GenBank (accession numbers: CD734212; CD737734; CD739800); 21 is from the Heinrich-Pette-Institute (Germany); 2 lymphoid tissue contains a mixture of thymus, bursa, spleen, peripheral blood lymphocytes and bone marrow; 3 pituitary tissue contains a mixture of pituitary gland, hypothalamus and pineal gland; +, retroviral sequences were found.
Thus the results presented here show three variants of chicken endogenous retrovirus localization: 1) GC-rich proviruses (ev loci, ART-CH, EAV-0) localized in GC-rich isochores; 2) GC-poor proviruses (HERV-I-related, E-51, E-33) localized in GC-poor isochores; 3) GC-rich EAV-HP localized in both GC-rich and GC-poor isochores. These findings suggest that integration is stable in a compositionally matching environment or, in the case of EAV-HP, that localization may be influenced by interference with neighboring genes.
Acknowledgements.This research was supported by a NATO grant (LST.CLG9/6685).We thank M. Costantini, G. Bucciarelli, G. Wronka and L. Tsyba for assistance with the preparation of gradients, Dr J. Heinar and V. Stepanets for providing RNA and DNA.

/g tti g/t ti ga t/c aci ggi g/t c
JO ati agi a g/t a/g tc a/g tci ac a/
Probes were designated by the names of the PCR primers used for their preparation.