Document Type : Original Article
Authors
Department of Biochemistry, Faculty of Pharmacy, Minia University, Minia 61519, Egypt
Introduction
To maintain a robust immune system capable of combating almost any infectious agent, vertebrates have evolved the V(D)J recombination process to produce highly diverse immunoglobulins (Igs) and T-cell receptors (TCRs). This occurs through cleavage and rejoining of several coding gene segments. The process starts with binding of the RAG1/RAG2 complex to the recombination signal sequence (RSS) to induce DNA breaks adjacent to the coding segments. There are two types of RSS, 12-RSS and 23-RSS, named according to the length of the spacer separating two conserved elements: the heptamer and the nonamer. The cleaved coding ends are rejoined by non-homologous end joining (NHEJ).
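The RSS layout described above (heptamer, then a 12 or 23 bp spacer, then nonamer) can be illustrated with a minimal sketch. The function name `split_rss` and the example sequence are hypothetical and serve only as an illustration; the element lengths of 7 and 9 bp are those implied by the names heptamer and nonamer.

```python
HEPTAMER_LEN = 7  # length implied by the name "heptamer"
NONAMER_LEN = 9   # length implied by the name "nonamer"


def split_rss(rss):
    """Split an RSS written heptamer-first into (heptamer, spacer, nonamer, type)."""
    seq = rss.upper()
    spacer_len = len(seq) - HEPTAMER_LEN - NONAMER_LEN
    if spacer_len not in (12, 23):
        raise ValueError(f"spacer of {spacer_len} bp is neither 12 nor 23")
    heptamer = seq[:HEPTAMER_LEN]
    spacer = seq[HEPTAMER_LEN:HEPTAMER_LEN + spacer_len]
    nonamer = seq[HEPTAMER_LEN + spacer_len:]
    return heptamer, spacer, nonamer, f"{spacer_len}-RSS"


# Illustrative sequence only: consensus heptamer and nonamer
# around an arbitrary 12 bp spacer (not taken from the article).
example = "CACAGTG" + "ATACAGACCTTA" + "ACAAAAACC"
print(split_rss(example))
```

Only the spacer length decides the RSS type in this sketch; it does not check how closely the heptamer or nonamer matches any consensus.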
The main components of both RAGs are the indispensable core and the helper non-core regions [1]. The basic modules of the RAG proteins are well characterized, especially since their high-resolution structures became available. The RAG1 protein comprises 1040 amino acids, which can be reduced to core RAG1 (cRAG1), extending from aa 384 to aa 1008. Many studies have used cRAG1 instead of full-length RAG1 (flRAG1) because it is sufficient for V(D)J recombination and easier to purify than the less soluble flRAG1 [2].
The non-core region of RAG1 contains an important domain: the C3HC4 RING (really interesting new gene) finger, located at aa 265-383 [3]. The RING domain has E3 ubiquitin ligase activity, which is intimately linked to overt V(D)J recombination [4, 5]. The key residue in this region is cysteine 325, which, when mutated, leads to inhibition of the RAG1-mediated ubiquitin ligase activity and blockage of both the autoubiquitylation of a lysine residue (K233) in RAG1 [6] and histone H3 ubiquitylation by RAG1 [7, 8]. This role highlights the regulatory importance of the non-core region.
Seven main modules have been defined in the core region. The first is the nonamer-binding domain (NBD), which interacts with the RSS nonamer. Next is the dimerization and DNA-binding domain (DDBD), which is linked to the NBD through a flexible linker; together, the two modules form the main RSS-binding and RAG1 dimerization sites that ultimately assemble the RAG tetrameric complex. The additional five modules (PreR, pre-RNase H; RNH, RNase H; ZnC2; ZnH2; and CTD, carboxy-terminal domain) constitute the active catalytic site, which contains three acidic amino acids (D600, D708 and E962) known as the conserved DDE motif [9]. The DDE motif coordinates the divalent metal cofactor, particularly magnesium ions, in the RAG1 active site [10-12]. The cysteine and histidine residues found in ZnC2 and ZnH2 together form a Zn-binding site essential for RAG1 dimerization [13] and for DNA cleavage [14]. The seventh module, the CTD, folds back on itself to associate with the DDBD in DNA binding and RAG1 dimerization [9, 15, 16].
RAG2 acts as a regulator of V(D)J recombination, modulating the binding and cleavage activities of RAG1 by enhancing its specificity for RSSs [17]. Like RAG1, the full-length RAG2 (flRAG2), containing 527 amino acids, can be truncated to cRAG2, which extends from the N-terminus to aa 382 and consists of antiparallel β-sheets forming a six-bladed β-propeller-like structure. Each blade is formed by a Kelch-like motif [18]. The “doughnut”-shaped core with its flat surface is suitable for facilitating protein-protein interactions [19]. Although RAG2 has no direct contact with DNA in the absence of RAG1, it binds RAG1 close to its active site, contributes to stabilization of RAG1/RSS heptamer interactions and controls DNA cleavage [20].
The regulatory non-core RAG2 extends from aa 383 to the C-terminus. The most prominent domain in the non-core region is a plant homeodomain (PHD) finger (aa 414 to 487), found between an acidic hinge and a C-terminal extension [18]. The PHD finger binds trimethylated lysine 4 on histone H3 (H3K4me3) through the aromatic channel formed by three residues (Y415, M443 and W453) in this region. PHD/H3K4me3 binding therefore marks sites of active chromatin and directs the RAG complex to accessible RSSs [21], ultimately augmenting the binding and cleavage activities of the RAG1/RAG2 complex [22-24]. The acidic hinge, found between the core region and the PHD finger, has a role in the allosteric inhibition of RAG1/RSS binding. Binding of the PHD finger to H3K4me3 induces conformational changes in both RAG1 and RAG2 and counteracts the autoinhibitory effect of the acidic hinge [24]. The last part of non-core RAG2 is the C-terminal extension, whose primary function is to regulate RAG2 degradation during the cell cycle [25].
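The residue ranges quoted above can be collected into a small lookup, which is convenient when deciding where a given variant falls. This is a minimal sketch: the names `REGIONS` and `region_of` are hypothetical, the boundaries are only those stated in the text, and finer modules (NBD, DDBD, acidic hinge and so on) are omitted because their exact boundaries are not given here.

```python
# (start, end, label) tuples, inclusive, using only the ranges quoted in the text.
# More specific regions are listed first; the last entry is a catch-all.
REGIONS = {
    "RAG1": [
        (265, 383, "non-core, RING finger"),
        (384, 1008, "core"),
        (1, 1040, "non-core"),
    ],
    "RAG2": [
        (414, 487, "non-core, PHD finger"),
        (1, 382, "core"),
        (383, 527, "non-core"),
    ],
}


def region_of(protein, position):
    """Return the first region of `protein` that contains residue `position`."""
    for start, end, label in REGIONS[protein]:
        if start <= position <= end:
            return label
    raise ValueError(f"{protein} has no residue {position}")


print(region_of("RAG1", 325))  # cysteine 325
print(region_of("RAG1", 600))  # D600 of the DDE motif
print(region_of("RAG2", 443))  # M443 of the PHD finger
```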
Recently, several high-resolution structures of the RAG/RSS complex have been resolved by cryo-EM and X-ray crystallography, giving a more in-depth picture of the molecular mechanism of the recombination process. The RAG proteins form a Y-shaped structure within which two molecules of RAG1 intertwine with the heptamer and the nonamer of the RSSs and represent the ‘Y’ stem, while RAG2 sits at the tips of the ‘Y’ branches, where it associates with the two coding ends.
Mutations in RAG1/RAG2 are associated with a wide spectrum of clinical and immunological phenotypes [26]. According to the severity of T-cell impairment, the inherited disorders of cellular and humoral immunity are classified by the International Union of Immunological Societies (IUIS) into “severe combined immunodeficiencies” (SCIDs) and other combined immunodeficiencies (CIDs) with a less severe immunological disorder [27]. Relevant bodies have developed criteria for the diagnosis of typical and atypical SCID (the latter having less impaired T-cell number/function) and of other forms of CID [28, 29]. Different distributions of genotypes have been described for SCID and CID [27, 30, 31].
Homozygous (biallelic) RAG mutations were initially related to SCID with absent T and B cells (T-B-SCID) [32]. The manifestations of RAG deficiency later widened to include Omenn syndrome (OS), an autosomal recessive disease in which residual RAG activity permits the generation of oligoclonal T cells [33, 34]. Infants diagnosed with OS often have severe infections, erythroderma, lymphadenopathy, hepatosplenomegaly, eosinophilia and a severe decrease in immunoglobulin levels, except for IgE, which is mostly increased. In these cases, interventions such as haematopoietic stem cell transplantation (HSCT) can be lifesaving [35]. Typical OS shows absent B cells and elevated autologous autoreactive T cells that infiltrate different organs. Surprisingly, the T cells invading different organs carry different T-cell receptor specificities, implying expansion of T-cell clonotypes driven by tissue-specific self-antigens [36].
Hypomorphic RAG mutations may also result in atypical SCID (AS) [37], in which diverse but reduced number of B and T cells have been observed. Patients with AS may manifest severe infections and autoimmunity.
These manifestations may be exacerbated in patients with RAG deficiency, who can develop autoimmune hemolytic anemia after cytomegalovirus (CMV) infection [38, 39].
More recently, RAG mutations have been found in patients with delayed-onset disease presenting with a granuloma and/or autoimmunity (CID-G/AI) phenotype [26, 40-46]. In this case, T and B cells are usually present and severe early-onset infections are uncommon. In contrast, late-onset infections are life-threatening [26, 40]. These include, in particular, recurrent respiratory tract infections caused by herpesviruses and human papillomavirus [26, 40].
Material and methods
Programs used.
PyMOL
To study the effect of some of these clinical mutations on the structure of the RAG complex, the protein modelling program PyMOL was used [47]. The cryo-EM structures are available in the Protein Data Bank (PDB). The PDB file for a given RAG complex structure can be obtained in two ways: downloaded directly from https://www.rcsb.org, or fetched from the PyMOL main window (by choosing ‘Get PDB’ from the ‘File’ menu) and typing the 4-letter PDB code (e.g. 6cg0, 6v0v…) in the search tab. The RCSB PDB website contains all data concerning the cryo-EM structures under investigation. The PDB file was then opened from the ‘File’ menu.
From the ‘Display’ menu, ‘Sequence’ was chosen to display the exact sequence of each chain of the complex.
From the same menu, ‘Sequence Mode’ and then ‘Residue Names’ were selected to display the amino acid sequence, with each amino acid shown as a three-letter code. The 6cg0 cryo-EM structure of the RAG complex is of the mouse RAGs. Within core RAG1, the residue numbering of the human and mouse proteins differs by 3 amino acids. For example, R474 in human RAG1 corresponds to R471 in mouse RAG1. From the ‘Wizard’ menu in the menu bar, ‘Mutagenesis’ and then ‘Protein’ were selected. PyMOL uses a measure called MatchAlign to express the degree of alignment between the original and the mutated structure. When comparing the effects of two independent mutations, the higher the MatchAlign score, the smaller the effect of the mutation relative to the original structure.
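The human-to-mouse renumbering used when locating a residue in the mouse structure can be sketched as follows. This assumes that the offset of 3 illustrated by R474/R471 holds across the whole core region, which the text shows for only one residue; the helper name `human_to_mouse` is hypothetical, and the sketch does not check that the wild-type residue is identical in both species.

```python
import re

# Human residue number minus mouse residue number.
# Assumption: constant across core RAG1, as in the R474 -> R471 example.
CORE_OFFSET = 3


def human_to_mouse(label):
    """Renumber a human residue or variant label (e.g. 'R474', 'R474C') to mouse."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z]?)", label)
    if match is None:
        raise ValueError(f"unrecognised residue label: {label!r}")
    wild_type, number, mutant = match.groups()
    return f"{wild_type}{int(number) - CORE_OFFSET}{mutant}"


print(human_to_mouse("R474"))
```

The residue identity at the mapped position should still be confirmed in the displayed mouse sequence before applying a mutation.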
Patients and Methods
Fathmm (belongs to University of Bristol, Integrative epidemiology unit)
An advanced online tool for predicting the magnitude of the effect of different alleles. The more negative the score, the more damaging the mutation (fathmm - Analyze dbSNP/Protein Missense Variants (biocompute.org.uk)).
Polyphen-2 (belongs to Harvard University)
It is an online bioinformatics tool designed by Harvard University to analyze the 3D structures of proteins. It gives a score for each mutation by computing the difference between the original and the new allele. For interpretation of PolyPhen results, a score of [2.00 or higher] is probably damaging, [1.40-1.90] is possibly damaging, [1-1.5] is potentially damaging and [0-0.9] is benign [48]. PolyPhen-2: prediction of functional effects of human nsSNPs (harvard.edu).
Support vector machine [SVM]-based tool I-Mutant.
I-Mutant v2.0c predicted protein stability after each mutation. It displayed details such as the reliability index, temperature and pH [49]. A negative score indicates decreased protein stability, while a positive score indicates increased protein stability.
PROVEAN (J. Craig Venter Institute, California)
An online tool that assesses the effect of a mutation on protein function. The more negative the score, the more damaging the mutation [50]. provean-software [ILRI Research Computing] (cgiar.org)
Sorting intolerant from tolerant [SIFT] (belongs to Institute of Bioinformatics, Singapore)
SIFT bioinformatics software was used to determine whether a mutation is tolerable or deleterious according to its effect on protein function. Scores of [0-0.05] indicate a deleterious mutation, while scores of [0.05-1], approaching one, suggest a tolerable mutation [51]. (http://sift.jcvi.org/)
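The SIFT cut-off can be written as a one-line rule. This is a minimal sketch; because the two quoted ranges share the value 0.05, assigning exactly 0.05 to the deleterious class is an assumption made here, and the function name `sift_call` is hypothetical.

```python
SIFT_CUTOFF = 0.05  # boundary between the two score ranges quoted in the text


def sift_call(score):
    """Classify a SIFT score in [0, 1] as 'deleterious' or 'tolerable'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"SIFT scores lie in [0, 1], got {score}")
    # Assumption: a score of exactly 0.05 is treated as deleterious.
    return "deleterious" if score <= SIFT_CUTOFF else "tolerable"


for value in (0.0, 0.03, 0.4, 1.0):
    print(value, sift_call(value))
```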
CADD (Combined annotation dependent depletion)
It is software for predicting the effect of a mutation on the protein structure. Higher CADD scores indicate that the variant is more likely to be deleterious than neutral. A score of 10 implies that the mutation is among the top 10% most pathogenic alleles, and a score of 20 or higher indicates that the mutation is among the top 1% most pathogenic alleles [52]. CADD - Combined Annotation Dependent Depletion (washington.edu)
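The two reference points quoted above (10 for the top 10%, 20 for the top 1%) are consistent with a PHRED-like logarithmic scale, in which a score s corresponds to the top 10^(-s/10) fraction of variants. The sketch below assumes that the scores used are these scaled scores rather than raw CADD scores; the function name `cadd_top_fraction` is hypothetical.

```python
def cadd_top_fraction(scaled_score):
    """Fraction of variants ranked at or above a PHRED-like scaled score."""
    return 10 ** (-scaled_score / 10)


for score in (10, 20, 30):
    print(f"score {score}: top {cadd_top_fraction(score):.1%}")
```

On this scale, each additional 10 points narrows the ranking by a further factor of ten, so a score of 30 would correspond to the top 0.1%.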
The Ensembl website uses SIFT, Polyphen-2 and CADD to predict the effect of mutations.
Results
Results are presented in the form of tables.
Results of PyMOL
Discussion and conclusion
RAG1/2 mutations are greatly involved in the development of immunodeficiencies. This highlights the vital role of these proteins and of V(D)J recombination in maintaining a powerful immune system. Advances in crystallography and microscopy enable us to study the spatial structure of the RAG1/RAG2 complex. Protein modelling programs such as PyMOL allowed us to make changes in this structure and to study how these changes affect its stability and its intra- and intermolecular interactions. Kim et al. resolved a cryo-EM structure of the RAG/RSS complex [53]. They found that M435V affects the structural integrity of the NBD. The R507W mutation replaces a basic arginine with a neutral tryptophan, which may affect solubility and DNA binding. W522C affects the structural integrity of PreR. H612R lies in a disordered region near RAG2. R737H possibly affects DNA binding. R778Q affects the structural integrity of RAG1/2. R841Q is located near the heptamer and so affects DNA binding. Both F974L and R975W affect the structural integrity of the CTD. M1006V is in the CTD-DDBD domain interface [53]. Regarding the PyMOL analysis, the MatchAlign score represents how well the wild-type (wt) protein matches the mutated protein after alignment. A lower score indicates a larger effect of a mutation. In the group of mutations we analysed, H612R has the highest score while R841Q has the lowest. Regarding Fathmm, R841Q has the most negative score and so is the most damaging, while F974L is the most tolerable. Polyphen-2 showed that R507W, R975W, W522C and R737H have the highest score (one), followed by R778Q, R841W, A857V and F974L (score >0.9), so they are probably damaging, while M435V (score = zero) is benign and M1006V and H612R are moderate. SVM-1 showed that all mutations decrease protein stability (F974L has the largest effect) except R507W, which increases protein stability. Provean showed that all mutations are deleterious (W522C being the most deleterious) except H612R and M1006V, which are neutral.
CADD scores showed that all these variants, except M435V, belong to the top 1% pathogenic mutations, so M435V is the least deleterious, while the highest score is for R841W. Trying to correlate these findings with those published in the online repositories, we found that M435V tends to be benign/tolerable according to PyMOL and the other online tools (except SVM-1), but it is described as pathogenic/likely pathogenic in NCBI, which gives a description of the clinical significance of all mutations except A857V and F974L, which are found in HGMD but not in NCBI. A857V was found in a Chinese patient with Omenn syndrome [54], while F974L was detected together with R841Q in a patient with early-onset autoimmunity [43]. The analyzed mutations listed in NCBI are described as pathogenic, except M1006V (submitted three times as likely benign and as of uncertain significance) and H612R (submitted once as of uncertain significance). According to PyMOL, the MatchAlign score of M1006V lies in the middle of the range for the mutations studied, while H612R has the highest MatchAlign score, meaning the closest match to the wild-type protein; these findings do not conflict with the NCBI description. The other online tools showed that M1006V is tolerated/benign/neutral, while H612R has conflicting results that, in our view, tend more towards tolerable. In conclusion, the analysis of mutations performed in this study concurs with the clinical description given by NCBI, except in the case of M435V.
References
interim analysis. J Allergy Clin Immunol, 2017. 139(4): p. 1302-1310 e4 DOI: 10.1016/j.jaci.2016.07.040.