Bioinformatics analysis of cis-regulatory elements in Mbl1 and Mbl2 genes in Rattus norvegicus

Aim. To identify and characterize, by means of bioinformatics, the transcription factor binding sites in the promoters of the Mbl1 and Mbl2 genes, which encode mannose-binding lectins in Rattus norvegicus. Methods. Bioinformatics, MatInspector software. The position weight matrices of transcription factor binding sites were obtained from the Matrix Family Library Version 9.0. Within the program we selected the binding sites whose cognate transcription factors are specifically expressed in the liver and immune cells and which passed the filter for conservation after comparison with the binding sites in the orthologous genes of Mus musculus. Results. The promoters of both genes share binding sites for members of four common families of transcription factors (HNF, homeodomain transcription factors, GRs and ETS factors). In addition, the promoters of the Mbl1 and Mbl2 genes possess binding sites for members of six (AP1 related factors, Ccaat/Enhancer Binding Proteins, FOX, p53, NFAT, ISGF3) and four (cAMP-responsive element binding proteins, heat shock factors, Nf-κB/c-rel and TATA binding protein) families of transcription factors, respectively. The Mbl1-specific transcription factors are mainly involved in the regulation of differentiation, development, metabolic homeostasis, organogenesis and the cell cycle. In contrast, the Mbl2-specific transcription factors are more likely to mediate a stress response. Conclusion. The variety of transcription factors potentially involved in the regulation of Mbl1 and Mbl2 transcription argues for the involvement of these genes in various cellular processes, with a specific role for each gene. The results provide a basis for targeted wet-lab experiments on their regulation.


Introduction
Mannose-binding lectin (Mbl) is a recognition molecule in the lectin pathway of the complement cascade. It binds directly to carbohydrate patterns on the surface of bacterial, fungal and parasitic cells, certain viruses, and apoptotic, senescent, injured and transformed host cells, while maintaining tolerance to the self-antigens of healthy host cells [1-4]. Bound Mbl recruits Mbl-associated serine proteases (MASPs) and initiates the lectin pathway of the complement system [5,6]. The complement cascade has three pathways: lectin, classical and alternative. Their initial steps differ somewhat, but all three converge on C3 convertase, C4b2b. The subsequent complement cascade results in the opsonisation of particles and their engulfment by phagocytes, lysis of pathogens, attraction of phagocytic cells through chemotaxis, and release of inflammatory peptides and cytokines [4,7-9]. Unlike the other complement pathways, the lectin pathway is crucial for a prompt response to invading pathogens and altered host cells until the acquired (adaptive) immune system gains momentum.
ISSN 0233-7657. Biopolymers and Cell. 2015. Vol. 31. N 1. P. 63-71. doi: http://dx.doi.org/10.7124/bc.0008CE
The largest amount of Mbl is produced in the liver and secreted into the circulation. Underproduction of Mbl increases susceptibility to infection but increases resistance to malaria, whereas overproduction is injurious during sepsis, shock, ischemia/reperfusion, rejection of transplanted solid organs, etc. [8,10-12]. Genetic differences in both the structural and regulatory parts of the human gene determine the altered response of Mbl to stress [13]. Mbl is therefore considered a promising target in critical care and other fields of medicine. Several drugs have already been designed and introduced into clinical practice [14].
In our previous study we predicted the presence of an interferon-stimulated response element (ISRE) in the promoter of the Mbl1 gene in Rattus norvegicus, and hence identified Mbl1 as a previously unknown gene potentially responsive to IFNα. Briefly, the search was conducted in the rat genome with the conservation-aided transcription factor binding site finder (http://biomed.org.ua/COTRASIF/about.html), and the genes obtained were classified as experimentally confirmed or not confirmed IFNα targets [15-17]. In accord with this prediction, Mbl insufficiency is associated with resistance to IFNα treatment in patients with hepatitis C [18]. Both MBL and IFNα are members of the innate immune system, and all this evidence argues for their closer cooperation. The regulation of expression of the Mbl genes has been scarcely investigated.
Before starting wet-lab experiments we decided to analyze the promoter of the Mbl1 gene in more detail and to include the Mbl2 gene in the analysis, as the overall Mbl activity in rats is provided by two genes.
The bioinformatics analysis of cis-regulatory elements in the promoters of Mbl1 and Mbl2 genes has

Materials and Methods
The sequences of 1100 bp (from -1000 to +100 bp), containing the adjacent 5'UTR of the Mbl1 and Mbl2 genes of Rattus norvegicus (Gene IDs: 24548 and 64668) and Mus musculus (Gene IDs: 17194 and 17195), were chosen for the search for transcription factor binding sites (TFBSs) and the putative transcription factors (TFs) regulating expression of the Mbl genes via these TFBSs. The promoter analysis was conducted with the Genomatix software tools (http://www.genomatix.de/index.html). The search for TFBSs was performed with the MatInspector program, which uses position weight matrices (PWMs) representing the complete nucleotide occurrence probabilities and information content for each position in the sequence [19,20]. The selection of TFBSs was initialized at the recommended threshold of > 0.85 for the core similarity and at the «optimized» threshold for the family matrix similarity. Using the special options, we selected those individual matches from each family that had matrix and core similarity scores > 0.8 and were associated with TFs «expressed in liver», «expressed in immune system» and «ubiquitous», given that the liver combines the role of a «biochemical laboratory» with that of a major organ of innate immunity. In a physiologically relevant state, it eliminates antigens derived from the gastrointestinal tract as well as aging and transformed cells, and shows tolerance to their continuous presence [21]. The resulting list (List #1) contained matrix families with individual binding sites, annotated with the cell specificity of the TFs that potentially regulate gene expression via these binding sites.
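The idea of scanning a promoter with a position weight matrix and keeping matches above a similarity threshold can be illustrated with a minimal Python sketch. It is not MatInspector's algorithm: the toy matrix, the function names and the simple rescaled score are assumptions made for illustration, and only the 0.85 threshold is taken from the text above.

```python
from typing import Dict, List, Tuple

# Hypothetical 4-position frequency matrix (one dict per position).
# Illustrative values only; not taken from the Matrix Family Library.
TOY_PWM: List[Dict[str, float]] = [
    {"A": 0.90, "C": 0.05, "G": 0.03, "T": 0.02},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.10, "C": 0.10, "G": 0.10, "T": 0.70},
    {"A": 0.05, "C": 0.80, "G": 0.10, "T": 0.05},
]


def similarity(window: str, pwm: List[Dict[str, float]]) -> float:
    """Rescale the summed per-position frequencies to the range 0..1.

    This is a simplified stand-in for a matrix similarity score.
    """
    raw = sum(col[base] for col, base in zip(pwm, window))
    best = sum(max(col.values()) for col in pwm)
    worst = sum(min(col.values()) for col in pwm)
    return (raw - worst) / (best - worst)


def scan(sequence: str, pwm: List[Dict[str, float]],
         threshold: float = 0.85) -> List[Tuple[int, str, float]]:
    """Return (0-based start, matched window, score) for windows above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = similarity(window, pwm)
        if score > threshold:
            hits.append((start, window, score))
    return hits


# Toy sequence; a real run would use the 1100 bp promoter window.
hits = scan("TTAGTCAA", TOY_PWM)
print(hits)
```

A real analysis would scan both strands and use the curated matrices and per-matrix optimized thresholds supplied by the software; the sketch only shows the sliding-window logic.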
On the assumption that the transcriptional regulation of closely related orthologous genes has been evolutionarily maintained to control specific gene expression patterns, we searched for conserved TFBSs by comparing the Mbl1 and Mbl2 promoters of Rattus norvegicus with those of Mus musculus. The DiAlign TF program of the GEMS Launcher (http://www.genomatix.de/cgi-bin//gems/launch.pl?s=8257ac78f55b0c8449b1858b1db438d6;GEMS=1;TASK=dialign_TF) was used to align the sequences pairwise and to display TFBS matches within the alignment. By default, TFBSs are considered conserved if they are at least 85 % identical and located at the same position within the alignment. This search was made at a core similarity ≥ 0.85 and at the optimized threshold for the matrix similarity. List #2 contained the evolutionarily conserved matches in the promoters of both genes irrespective of their cell/tissue specificity. The intersection of Lists #1 and #2 yields the liver- and immune-specific evolutionarily conserved TFBSs of the Mbl1 and Mbl2 genes in Rattus norvegicus.
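The final filtering step, intersecting List #1 with List #2, amounts to a set intersection over match records. The following Python sketch shows this under stated assumptions: the `Match` record, the function name and the entries labelled `TISSUE_ONLY` and `CONSERVED_ONLY` are hypothetical, and the two real entries reuse the Mbl1 positions reported in the Results purely as sample data.

```python
from typing import List, NamedTuple, Set


class Match(NamedTuple):
    """One TFBS match; the field layout is an assumption for illustration."""
    gene: str
    family: str
    start: int
    end: int


def intersect_lists(list1: Set[Match], list2: Set[Match]) -> List[Match]:
    """Keep matches present in both lists, ordered by gene and position."""
    return sorted(list1 & list2, key=lambda m: (m.gene, m.start))


# List #1: tissue-filtered matches (liver, immune system, ubiquitous).
list1 = {
    Match("Mbl1", "NFAT", 906, 920),
    Match("Mbl1", "C/EBP", 926, 965),
    Match("Mbl1", "TISSUE_ONLY", 300, 315),     # hypothetical, not conserved
}
# List #2: matches conserved between rat and mouse promoters.
list2 = {
    Match("Mbl1", "NFAT", 906, 920),
    Match("Mbl1", "C/EBP", 926, 965),
    Match("Mbl1", "CONSERVED_ONLY", 500, 512),  # hypothetical, wrong tissue
}

final = intersect_lists(list1, list2)
for m in final:
    print(m.gene, m.family, m.start, m.end)
```

Exact tuple equality is used here for simplicity; in practice the two lists would need a shared coordinate system before they can be compared this way.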

Results and Discussion
The specific pattern of transcription factors bound to their cognate cis-acting regulatory DNA sequences interacts with the transcriptional machinery and enables selective gene expression. The in silico search for TFBSs in the promoters of the Mbl1 and Mbl2 genes revealed cognate sites for members of ten and eight transcription factor families, respectively (Tables 1, 2). The set of TFBSs identified in the Mbl genes indicates which transcription factors might regulate the expression of these genes and, accordingly, in which TF-mediated «orchestras» the gene of interest may be a player. Whether it plays or not, and in which orchestra, remains an open question. The answer depends on the context: the kind of stimulus, cell and species specificity, stage of development, etc.
According to the set of TFBSs, the Mbl1 and Mbl2 genes in the liver may be regulated by transcription factors belonging to four families shared by both genes: HNF, homeodomain transcription factors, GRs and ETS factors (Table 3). The members of these families and their arrangement and co-location in the two promoters are characteristic of each gene. They are mainly responsible for the regulation of differentiation, development, organogenesis, hematopoiesis and proliferation of immune cells [22-25] (see Supplement for detailed information about TFs at http://dx.doi.org/10.7124/bc.0008CE).
Six families of TFs are characteristic of the Mbl1 promoter (AP1 related factors, Ccaat/Enhancer Binding Proteins, FOX, p53, NFAT, ISGF3). Among them, two pairs of TFBSs are located in close proximity: NFAT and C/EBP (906-920 bp and 926-965 bp) and ETS and ISGF factors (22-42 bp and 42-46 bp). Such an arrangement of binding sites provides a basis for the composite regulatory elements NFAT-C/EBP and ETS-ISGF. NFAT-C/EBP pairs of TFs are already known as functionally active composite regulators in several genes, e.g. the peroxisome proliferator-activated receptor-gamma 2, insulin-like growth factor 2, angiotensin-converting enzyme homolog and transcription factor POU4F3 genes [26]. Specific transcription factors containing ETS domains readily synergize with interferon regulatory factors to form composite regulators [27]. This supports the plausible responsiveness of Mbl1 to IFNα. Four families of TFs are typical of the Mbl2 gene (cAMP-responsive element binding proteins, heat shock factors, Nf-κB/c-rel and TATA binding protein).
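The observation that two pairs of sites lie in close proximity can be reproduced from the reported coordinates with a short Python sketch. The `max_gap` cutoff of 10 bp and the function name are assumptions chosen for illustration; the text does not state a numerical distance criterion.

```python
from itertools import combinations
from typing import List, Tuple

Site = Tuple[str, int, int]  # (family, start, end) within the promoter window

# Mbl1 coordinates as reported in the text above.
MBL1_SITES: List[Site] = [
    ("NFAT", 906, 920),
    ("C/EBP", 926, 965),
    ("ETS", 22, 42),
    ("ISGF", 42, 46),
]


def adjacent_pairs(sites: List[Site], max_gap: int = 10) -> List[Tuple[str, str, int]]:
    """Return (family_a, family_b, gap) for site pairs separated by at most max_gap bp.

    max_gap is a hypothetical cutoff; touching or overlapping sites get gap 0.
    """
    ordered = sorted(sites, key=lambda s: s[1])
    pairs = []
    for (fam_a, _, end_a), (fam_b, start_b, _) in combinations(ordered, 2):
        gap = max(0, start_b - end_a)
        if fam_a != fam_b and gap <= max_gap:
            pairs.append((fam_a, fam_b, gap))
    return pairs


pairs = adjacent_pairs(MBL1_SITES)
print(pairs)
```

Proximity alone only flags candidates; whether a pair acts as a composite element is a question for the experiments the paper proposes.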
A comparison of the Mbl1- and Mbl2-specific sets of TFBSs reveals a general functional difference between their transcription factors. In the Mbl1 gene they are mainly responsible for the regulation of differentiation, organogenesis and metabolic homeostasis (AP-1, C/EBP, FOX) [28-30], T-cell differentiation and self-tolerance (NFAT) [31], regulation of the cell cycle (p53) [32] and responsiveness to IFNα (ISGF3). In contrast, the Mbl2-specific transcription factors are more closely associated with the stress response. Heat shock factors are essential for all organisms to survive exposure to acute stress [33]. The cAMP-response element binding proteins possess kinase-inducible element(s) in their transactivation domain, which makes them susceptible to modification by phosphorylation in response to a diverse array of stimuli [34]. Nf-κB is found in almost all animal cell types and is activated in cellular responses to stimuli such as stress, cytokines, free radicals, ultraviolet irradiation, oxidized low-density lipoproteins, and bacterial or viral antigens [35,36]. The TATA box places the Mbl2 gene in the general category of TATA-containing genes with their specific characteristics. The TBP protein binds the TATA box in the core promoter of genes. Together with RNA polymerase II and the general transcription factors it forms the pre-initiation complex. TATA-containing genes depend more strongly on the SAGA complex (Spt-Ada-Gcn5-Acetyltransferase), whereas TATA-less genes depend on TFIID complex-dominated TBP binding [37]. TATA-containing genes are characterized by a propensity for being subtelomeric, expressed at extremely high or low levels, stress-induced, and under evolutionary selective pressure [38,65]. The TATA box usually arises by a gain process as a result of gene duplication, and duplicated genes are enriched in TATA-containing genes [38]. The TATA box occurs in approximately 10.7 % and 11.2 % of the protein-coding genes in the mouse and human genomes, respectively [39].
The Mbl1 and Mbl2 genes are paralogues arising from gene duplication. According to our calculations, the non-self BlastP hit has an E-value below the cutoff of 1.0 × 10^-20, which supports this notion. Like a typical TATA-containing gene, Mbl2 is located in the subtelomeric region of chromosome 1 in the rat genome (see NCBI Map Viewer, annotation release 105). The Mbl2 gene has also been under evolutionary selective pressure: Old World monkeys still have both genes, whereas chimpanzees, like humans, have only one functionally active gene, MBL2, homologous to the rat Mbl2 gene. The Mbl1 isoform in rodents and some primates is homologous to the human MBL1 pseudogene, which shows low-level expression of a truncated protein [40]. Lynch and Conery [41] suggested that gene duplication is a relatively frequent event in evolution and that the duplicated genes are typically lost because of the lack of selective pressure to maintain both copies.
Therefore, on the basis of this analysis we suggest that the Mbl1 and Mbl2 genes may respond differentially to intra- and extracellular factors, with Mbl2 more prone to a stress-induced response. Our preliminary data have revealed that the Mbl2 gene is induced to a substantially greater extent than the Mbl1 gene in the liver of rats treated with IFNα. The presence of an Nf-κB binding site in the promoter of the Mbl2 gene may determine the responsiveness of the Mbl2 gene to IFNα via the PI3 kinase pathway [42].
Complement has long been appreciated as a rapid and local immune surveillance system. However, recent research has ascribed to complement many new functions that extend far beyond host defense and inflammatory processes [reviewed in 8, 43 and 44]. The variety of TFBSs in the promoters of both genes partly explains this versatility and provides a basis for new associations, possibly yet to be discovered, between the lectin complement system and other systems of the organism.
Table fragment. Mbl2-specific TF families: cAMP-responsive element binding proteins, heat shock factors, Nf-κB/c-rel, TATA-binding protein.

The lectin pathway of the complement cascade is an evolutionarily ancient form of the immune system. It emerged in the early Cambrian in Tunicates [4,8] and developed in parallel with transcription factors, some of which predate the appearance of complement, some of which developed simultaneously with it, and some of which emerged later in evolution. We screened the literature to check whether the two genes differ in the «age» of their TFs, particularly of their DNA binding motifs. The «youngest» motifs belong to the TFs common to both genes. The ETS domain was detected in Porifera [45]; the homeodomain, in gastropod mollusks dating to the late Cambrian [46]; and the glucocorticoid receptor, in Chondrichthyes (cartilaginous fishes), which appeared about 395 million years ago, during the middle Devonian [47].
Most of the DNA binding motifs of the TFs that might regulate expression of the Mbl1 gene are more ancient, with the exception of NFAT and the IFN system. The bZip motif inherent to the AP1 related factors and Ccaat/Enhancer Binding Proteins was detected in the period prior to the divergence of metazoa and fungi, that is, long before the emergence of the lectin pathway [48]. The p53 superfamily predates animal evolution and first appears in unicellular Flagellates. In the invertebrate models amenable to genetic analysis, the p53 superfamily members mainly act in the regulation of apoptosis in response to genotoxic agents and do not have overt developmental functions [49]. The winged-helix FOX motif appeared in the same period [50]. The NFAT domain was detected much later, in Cephalochordata and in jawless (agnathan) fishes, the earliest known members of Vertebrata, which appeared during the Ordovician Period (510-439 Mya) [51]. The IFN system originated in early vertebrates (ca. 385 million years ago, in the Devonian Period) and is conserved in all tetrapods as well as in fishes, but not in Tunicates [52,53].
The DNA binding motifs of the TFs that might regulate expression of the Mbl2 gene also arose early in evolution. The bZip motif of the cAMP-responsive element binding proteins is discussed above. Heat shock factors and chaperones originated approximately 3.5 billion years ago, as they are present in archaebacteria as well as in bacteria [54]. One of the major destructive stresses faced by all cells was reactive oxygen, following the rise of atmospheric oxygen to high levels approximately 2.2 billion years ago. The nuclear transcription factor Nf-κB is induced by oxidative stress and functions to protect diverse cells from apoptotic events. Defense systems containing homologous elements are, in fact, widely distributed among plants, protozoans, echinoderms, protostomes, lower vertebrates and mammals [54].
Therefore, the Mbl1 and Mbl2 genes in Rattus norvegicus co-opted the ancient DNA binding motifs and acquired the «younger» ones to perform their versatile functions.