Cite this as: Das P, Bhadra MP, Bhadra U (2017) GAGA Factor Expedites Development in Drosophila. Peertechz J Biol Res Dev 2(1): 004-011. DOI: 10.17352/ojbs.000009
Development in Drosophila is a concerted process that depends on the interplay of a constellation of genes and factors operating in intricate synchrony. These factors, produced at precise points in the developmental cycle, act through activation, by binding to the various transcription factors. The GAGA factor (GAF) is one such factor, the product of the trithorax-like gene, Trl, and binds to a consensus DNA sequence to modulate homeotic gene function. In addition, the factor has a role in chromatin remodelling through binding to the Polycomb responsive element (PRE). The protein has a distinctive structure with a zinc-finger DNA-binding domain, a BTB/POZ domain and a polyglutamine-rich Q domain. It also acts as an anti-repressor of the gene Kruppel, relieving the repression imposed on it by other DNA-binding proteins. This report describes the interplay in which the GAGA factor is involved during Drosophila embryonic development.
The GAGA factor (GAF), produced by the trithorax-like gene, Trl, in Drosophila, operates by binding to its DNA recognition sequence, which has the consensus GAGAG. As a transcription factor, it has been found at the promoters and enhancers of various genes, where it modulates expression; these include homeotic and developmental genes such as Ultrabithorax (Ubx), engrailed (en), fushi-tarazu (ftz) and even-skipped (eve), as well as hsp70, hsp26, H3/H4, Adh, E74, actin-5C, and α1-tubulin [1,3-7]. Besides these roles, the factor is also involved in chromatin remodelling, in functions pertaining to the Polycomb responsive element (PRE), and in insulator or boundary elements [5,9–14]. The factor is highly pleiotropic owing to its localization at the promoter regions of a multitude of genes [8,9]. The Trl product is presumably supplied maternally; hence, its target genes are better understood by studying the phenotypes of combinations of hypomorphic alleles with other hypomorphic or null Trl alleles. Trl is expressed at all stages and in all tissues of the fly, although mRNA levels change significantly between stages. To date, most studies elucidating the roles of the GAGA factor have been carried out in embryonic stages. Studies on larval and imago stages, made by knocking down or over-expressing different genes, are scant. Such experiments have nevertheless shown that the GAGA factor is an active player in wing disc and salivary gland development. Immunofluorescence studies of salivary gland polytene chromosomes in Drosophila showed that the factor binds to a large number of euchromatic genes, which points to a role for the GAGA factor in maintaining open, transcriptionally active chromatin regions.
The GAGA factor was first identified as an in vitro activator of the promoter of the Ultrabithorax gene in Drosophila. It was later found to bind GAGA elements (stretches of GAGA or CTCT) in the hsp70 heat shock promoter and the H3/H4 histone gene promoter regions [3,12-14]. Further reports showed that it has crucial roles in gene activation and in regulating chromatin structure [3,10,14-16]. The GAGA factor also interacts with promoter sequences upstream of the genes E74, his3, hid, hsp26, and hsp70 in Drosophila. However, activation of transcription has been found only for the genes Kr and Ubx. Exceptionally, the Kr gene is actively transcribed when the GAGA factor binds to an anchoring site located downstream of the gene.
GAF has a distinctive structure comprising a single zinc finger DNA-binding domain, a BTB/POZ domain, which is a known protein–protein interaction motif, and a polyglutamine-rich Q domain [16,19]. The gene encodes two alternatively spliced isoforms of GAF, GAGA-519 and GAGA-581, which differ in length and in the sequence of the glutamine-rich C-terminal domain. Each isoform has three distinct structural domains: a zinc finger DNA-binding domain (DBD); a broad complex, tramtrack, bric-a-brac/poxvirus and zinc finger domain (BTB/POZ), involved in protein–protein interaction; and the glutamine-rich C-terminal Q domain, involved in transcription activation [20-22] (Figure 1). Contrary to the initial notion that the Q domain acts as a transcriptional activator [18,23,24], GAF in fact operates as an anti-repressor. It consists of the following components:
The DNA-binding domain (DBD): The C-terminus of GAF possesses a single classical C2-H2 zinc finger DNA-binding domain of 82 amino acid residues (310–391), located after a basic helix flanked by three short tracts of basic amino acid residues: BR1, BR2, and BR3 [20,21]. The consensus binding sequence contains a minimal pentanucleotide, GAGAG, of which the trinucleotide GAG is essential for recognition and binding. Once the zinc finger binds the major groove and recognizes the GAG sequence, the complex is stabilized through interactions with a stretch of basic amino acids at the N-terminal side [20,21]. The BR1 tract also wraps around the minor groove, contacting the adenine at the fourth position of the GAGAG sequence.
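The sequence-level part of this recognition can be illustrated with a short script. The following is only a minimal sketch, assuming that a candidate GAF site is any exact match to the pentanucleotide GAGAG on either DNA strand; the function names and the example sequence are hypothetical, and real binding also depends on flanking sequence, chromatin context and multimerization, none of which are modelled here.

```python
# Minimal sketch: locate exact GAGAG matches on both strands of a DNA string.
# Function names and the example sequence are illustrative assumptions.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif_sites(seq: str, motif: str = "GAGAG"):
    """Return (position, strand) for every match, overlapping matches included.

    Positions are 0-based and refer to the top strand. A '-' hit means the
    motif lies on the bottom strand, i.e. the top strand reads CTCTC.
    """
    seq = seq.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc_motif:
            hits.append((i, "-"))
    return hits


example = "TTGAGAGAGCCCTCTCAA"  # made-up sequence for illustration
print(find_motif_sites(example))  # [(2, '+'), (4, '+'), (11, '-')]
```

Scanning both strands matters because GAGA elements are described as stretches of GAGA or CTCT, which are the same site read from opposite strands.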
BTB/POZ domain: The BTB/POZ domain, a domain type that occurs in all groups of organisms, comprises 122 amino acids in the N-terminal region of GAF. It provides an interface for protein–protein interactions during transcriptional activation and transcription factor repression [18,26-32]. Oligomerization depends on three specific residues, D35, G93, and L112, and three α-helices adjacent to these residues help to cap and stabilize the dimerization interface. The BTB/POZ domain of GAF forms higher-order functional multimeric complexes by self-association [22,33,34].
Q domain: The C-terminus bears a Q domain that has been implicated in promoter distortion, single-strand binding and multimerization. Because of its affinity for single-stranded DNA, GAF is able to interact with triple-stranded DNA. In Drosophila S2 cells, the Q domain was found to act as a trans-activation domain. Combinatorial studies indicate that the Q domain is primarily involved in the formation of larger GAF complexes but is not needed for the heat shock response of the hsp83 gene [19,35].
When the GAGA factor binds DNA, with the zinc finger of the DBD in the major groove and the basic region BR1 additionally contacting the minor groove, the DNA becomes distorted through interactions of the Q domain at promoter regions, which melt the duplex, especially when GAF binds single- and triple-stranded DNA over stretches of (GA)n [35,36]. DNA binding over multiple GA or CA stretches is due to multimer formation mediated by the Q domain and the BTB/POZ domain. GAF binding sites are more frequent in introns, which suggests a possible regulatory role in transcriptional elongation. The protein may also bind long GA-rich repeats, as is evident for satellite DNA [38,39]. By forming arrays at such high-density binding sites, GAF can help maintain heterochromatic regions in a transcriptionally repressed state.
Being a multifunctional protein, GAF can interact with a significant number of partners and carries out functions beyond binding single-, double-, and triple-stranded DNA, such as providing multiple binding sites and support for cofactors to form a functional complex. GAF has been shown to self-associate as well as to associate with other BTB/POZ proteins such as Tramtrack (Ttk), Pipsqueak (Psq), and batman (ban) [6,40,41].
Tramtrack (ttk) is a developmental regulatory gene that codes for a protein important in oocyte development. The Ttk protein may be expressed as two proteins, p69 and p88, that bind to the regulatory regions of several segmentation genes, such as ftz, to produce novel expression patterns in embryos. During the transition from the mitotic cycle to the endocycle in follicle cells, ttk promotes the activity of the JNK pathway, bypassing the Notch signalling pathway. ttk also codes for proteins that repress the expression of engrailed with the help of runt. Thus ttk represses several Drosophila genes, and this repression is counteracted in vivo by GAF, which functions as an anti-repressor.
Psq regulates development through recognition of GAGA sequences. However, it requires a longer GA stretch than GAF does. Psq has been found to co-localize with GAF at numerous loci on polytene chromosomes, where Psq and GAF, both members of the Polycomb group of complexes, bind to the Polycomb response element of the bxd gene [45,46]. The interaction between them occurs through their BTB/POZ domains.
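The dependence on GA stretch length can be made concrete by listing uninterrupted (GA)n tracts in a sequence and filtering them by a minimum repeat count. This is a minimal sketch under stated assumptions: the function name, the example sequence and the thresholds (3 and 5 repeats) are hypothetical values chosen for illustration, since the text gives no numerical length requirement for either protein, and only the top strand is scanned.

```python
# Minimal sketch: report uninterrupted (GA)n tracts on the top strand.
# The repeat thresholds used below are illustrative assumptions, not
# measured binding requirements for GAF or Psq.
import re


def ga_tracts(seq: str, min_repeats: int = 3):
    """Return (start, end, repeats) for each run of at least min_repeats GA units.

    start is 0-based and end is exclusive, as in Python slicing.
    """
    pattern = re.compile(r"(?:GA){%d,}" % min_repeats)
    tracts = []
    for match in pattern.finditer(seq.upper()):
        repeats = (match.end() - match.start()) // 2
        tracts.append((match.start(), match.end(), repeats))
    return tracts


example = "CCGAGAGATTGAGAGAGAGAGACC"  # made-up sequence for illustration
print(ga_tracts(example, min_repeats=3))  # [(2, 8, 3), (10, 22, 6)]
print(ga_tracts(example, min_repeats=5))  # [(10, 22, 6)]
```

Raising the threshold keeps only the longer tract, which mirrors the idea that a factor needing longer GA stretches would recognize a subset of the sites available to one that accepts shorter stretches.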
batman (ban) co-localizes with GAF and is involved in both the activation and repression of homeotic genes. Besides ban, the proteins Corto and the Sin3-associated polypeptide SAP18 form complexes with the BTB/POZ domain of GAF [47,48] and cause histone deacetylation. GAF also interacts with NURF301, the large subunit of NURF [49,50], and with the dSSRP1 subunit of FACT [51,52].
The developmental activities of an organism are under stringent epigenetic maintenance of genomic expression. In Drosophila, the complex called “facilitates chromatin transcription” (FACT) interacts with the GAGA factor to modulate chromatin structure and expression. Gene expression is epigenetically maintained through post-translational modifications of the four core histones of the nucleosome. Two such epigenetic phenomena have been studied extensively: the expression of Hox genes, which is governed by Polycomb and trithorax group genes [54,55], and position effect variegation (PEV) [56,57]. Both phenomena involve silencing and the counteracting maintenance of the active state. Methylation of histone H3 at K27 and/or K9 causes silencing by leading to the binding of Polycomb group or heterochromatin proteins that recognize these silencing marks [55,57-60].
GAF and FACT direct the replacement of histone H3 by H3.3 through the association of HIRA with d1, which maintains expression of the white gene in a heterochromatic environment. The active state is sustained by this replacement: a nucleosome at the DNase-hypersensitive site of d1 is removed and replaced by an H3.3-containing nucleosome through stepwise disassembly and assembly. Heterochromatin is marked by K9-methylated histone H3 and its binding protein HP1, and tends to spread into neighbouring regions [57,58]. The repeated reassembly of nucleosomes following histone replacement keeps removing K9-methylated histone H3 at d1 and prevents the spreading of heterochromatin (Figure 2).
Gonad development depends on the migration of primordial germ cells (PGCs). The germ cell progenitors are formed at the posterior pole of the embryo, in the region where the maternal germplasm resides. Once specified, and after several mitotic divisions, these cells separate from the embryonic syncytium and move inside the embryo [61,62]. PGC migration has two stages, passive and active. Passive migration occurs during the morphogenetic events of gastrulation and shifts the germ cells to the midgut pocket through invagination of the prospective endoderm, whereas active migration takes place within the embryo through the autonomous movement of the PGCs. PGCs cross the primary gut wall by forming pseudopodia-like structures and then join the mesodermal cells that form the somatic progenitor cells of the gonads, localized on either side of the ventral groove in the fifth abdominal embryonic segment [61-63].
The GAGA factor encoding gene, Trl, is expressed at various stages of Drosophila development and regulates several developmental processes such as embryogenesis, oogenesis, eye and wing development, and formation of dorsal processes [2,10,64-67]. In developing male germ cells in Drosophila, loss of function of Trl has been found to lead to partial germ cell loss. A recent study indicated that a Trl mutant shows early activation of migration and movement towards the interior of the embryo. Such premature movement leads to loss of orientation and failure of normal gonad formation. Cell migration is controlled by JAK-STAT signalling pathway factors that activate the GPCR encoded by the gene tre1 (trapped in endoderm 1) [69-71], whose product is normally limited to the germ cells. In some cases, the somatic environment can modulate the spatiotemporal regulation of germ cell migration, as in mutants of the hopscotch gene, which encodes a Janus kinase, where primordial cells migrate prematurely. This suggests that the GAGA factor influences the migration of primordial cells through their contact with the somatic cells.
Over-expression of GAF has been found to alter the expression of many genes in the wing disc. Depletion of GAGA, caused by the deletion at 69B, was found to consistently reduce wing size by about 10%. With NubbGAL4 together with Dicer2 the reduction was 55%, and with ptcGAL4 it was 35%. The reduction in wing size was presumably due to a defect in cell proliferation. In males depleted of the GAGA factor, abdominal segment A6 was transformed towards A5. It is well known that defective Abd-B is responsible for homeotic transformation of the abdominal segments. These findings thus suggest that loss of the GAGA factor can bring about a defective transformation in the segmentation of the embryo. A heterozygous null Trl male (Trl67/Df(3R)Sbd26) was found to show a similar transformation owing to the deficiency. A loss-of-function phenotype obtained with pannier GAL4 (pnrGAL4) showed a cleft with loss of bristles on the dorsal side of the notum, whereas over-expression completely disrupted dorsal closure, with embryonic death before the first instar stage.
Kruppel is a Drosophila gap gene that plays a crucial role in early embryonic development by forming antero-posterior boundaries. At the syncytial blastoderm stage, Kr is expressed as a circular band girdling the egg at around 45 to 55% of its length. Kruppel transcripts are formed about 2 to 5 hr after hatching, and the protein subsequently encoded bears zinc fingers [75-77] and in turn regulates the expression of other genes, both spatially and temporally.
Analysis of the proximal promoter of the Drosophila Kruppel (Kr) gene identified a 44-base-pair fragment that contains the RNA start sites and has prominent promoter activity. This promoter is flanked both upstream and downstream by binding sites for the GAGA factor. GAF interacts with the region downstream of the Kr promoter in a sequence-specific fashion, and the purified protein activates in vitro transcription of Kr and Ubx. GAF acts as an anti-repressor: in the presence of its binding site it prevents the inactivation of Kr by undoing repression by DNA-binding factors. In the transcriptional anti-repression model, DNA-binding factors bind the promoter and enhancer sequences of the gene and repress its function; the GAGA factor nullifies the repressive effect of such factors and so allows transcription of Kr.
The development of the anterior and posterior poles, the terminal domains of the Drosophila embryo, is specified by the maternal terminal system. One of the genes involved is tailless (tll), which is crucial for the development of terminal structures such as the telson and the posterior gut, as well as head structures and the brain [79-82]. At the syncytial blastoderm stage, the embryo expresses this transcript at the poles following indirect activation by the maternally produced Torso receptor tyrosine kinase pathway at the embryonic termini. The pathway partly relieves the repression caused by the HMG transcriptional repressor Capicua and the co-repressor Groucho [83,84]. The other repressor that it counteracts is the BTB domain zinc finger protein Tramtrack69. Functional tailless ensures the normal expression of other gap genes such as Kruppel and knirps, and later of genes such as hunchback, brachyenteron, and forkhead [79-82]. A reduction in repressor concentration causes the loss of well-defined edges of expression domains [86,87]. If the binding affinity of the GAGA factor for the torso response element (tor-RE) is low, and multiple tor-REs are present in the tll cis-regulatory region [85,88], the boundary of tll expression becomes poorly defined.
The DNA-binding protein Zelda (Vielfaltig) in Drosophila is a transcriptional activator of the zygotic genome and produces an open chromatin state. Once the chromatin is open, the recruitment of transcription factors is facilitated. This leads to remodelling of the genome and expression of the target genes. In the absence of Zelda, however, transcriptional activation can still occur at sites bearing the GAGA factor binding motif and bound by the GAGA factor in the embryo. The 14th nuclear cycle marks the initiation of zygotic transcription, with Zelda as a key activator of the zygotic genome during the maternal-to-zygotic transition [89,90]. Zelda binding sites have been found to be critical for regulating DNA binding by the transcription factors Dorsal (Dl), Twist (Twi), and Bicoid (Bcd) [91-93]. Zelda also potentiates transcription factor binding by determining sites of open chromatin [94,95]. The density of histone H3 increases when Zelda decreases in wild-type embryos. Zelda thus dictates the expression of the initial set of zygotic genes transcribed after fertilization, and it also binds to the loci of genes that are activated later, so that a precise sequence of gene activation ensues during gastrulation.
Although the essential and conserved role of PcG/trxG homologues was clearly established in Drosophila melanogaster, the vertebrate homologue of the Drosophila GAGA factor was unknown until recent studies. Recognition sites for the GAGA factor, called GAGA boxes, had been found in many vertebrate genes, including the Hox complexes, but a putative GAGA factor remained undiscovered [16,97-101]. Very recently, a vertebrate GAGA factor homologue was identified that has properties similar to the Drosophila factor in terms of domain structure and capacity for DNA recognition and binding. Analyses including structural modelling, phylogenetic analysis and cross-reactivity studies have shown that this GAGA factor can bind GAGA-rich DNA sequences in the Hox complex. In mouse and human, cKrox (Kruppel-related zinc finger protein cKrox), also called Th-POK (T-helper inducing POZ/Kruppel-like factor), was deemed the homologue of the Drosophila GAGA factor. It is encoded by the zbtb7b (Zfp67) gene. Th-POK mainly regulates commitment to the CD4 and CD8 lineages. Mice carrying an arginine-to-glycine mutation at the X position of the second zinc finger of Th-POK are immunocompromised. This indicates that the arginine is responsible for recognizing the invariant G of the target, and that any mutation at the third position of the consensus sequence GAGAG alters target binding specificity. This highlights the role of G in binding and of arginine in specific DNA binding, in both fly and mammals. The binding sites of c-Krox/Th-POK are purine-rich and contain the GAGA sequence. Some studies have also shown that Th-POK/c-Krox binds to the collagen promoter region to cause transcriptional activation [102-104], and that deletion of the C-terminal region reduces transcriptional activity in mice.
The Evx2 and Hoxd13 genes have GA-rich tracts in mouse, human, and zebrafish, and these tracts function in enhancer blocking in both transgenic flies and cultured human cells. Mutating the GAGA binding sequence prevents the region from functioning as an insulator. In the murine Hox clusters, histone-free regions are associated with GAF recognition sites and regulate the binding of Th-POK. Thus, the mammalian GAGA factor acts in nucleosome reorganization at the Hox clusters, providing a platform for the binding of regulatory proteins and organizing chromatin regulatory activities, including the formation of boundaries.
The evidence cited in this report clearly indicates that the GAGA factor is involved at several levels of gene expression regulation. Hence, it would be improper to consider it a simple transcription factor or anti-repressor. Its role as a structural protein in chromatin conformation, from primary to tertiary structure, points to a potential role as both transcriptional activator and repressor. The actual role of GAF in maintaining secondary and tertiary chromatin structure remains more speculative than quantitative, and even the functional significance of GAF multimers remains cryptic. The multimers may affect the topology of the regulatory regions where they bind, changing the rotational phase of the nucleosomes to enable proteins to interact. Work with purified GAF and a fully characterized chromatin system may provide the full structural data. Studies of chromatin folding and of cis and trans interactions between regulatory sequences may also give additional information about the effect of the GAGA factor on chromatin folding.
The diverse roles of the GAGA factor in sequence-specific binding to DNA, in binding to chromatin, transcription factors and metal ions, and in protein homo-dimerization as well as hetero-dimerization are critical for the normal execution of biological processes such as cell division, chromatin assembly, chromatin modification, chromatin organization, dosage compensation, imaginal disc-derived wing morphogenesis, mitotic nuclear division, negative regulation of transcription, nuclear division, oogenesis, positive regulation of chromatin silencing, positive regulation of transcription, positive regulation of transcription from RNA polymerase II promoter, protein oligomerization, sensory perception of pain, spermatogenesis and syncytial blastoderm mitotic cell cycle. Such a multifaceted protein requires further detailed analysis of the functions that are not yet completely understood. The numerous processes in which this factor plays a part point to the possibility of exploiting it to abate diseases caused by aberrant cell cycle progression, such as various cancers and disorders of cell proliferation, tumor development and pattern formation during embryogenesis. Thus, focussed and multidisciplinary efforts are still required to dissect the as yet unknown transcriptional regulatory mechanisms that regulate Drosophila development.
The work was supported by an HFSP Young Investigator Grant (RGY020), CSIR Network Project (BSC 0108, 0121) awarded to UB, and a Wellcome Trust International Fellowship awarded to MPB (GAP0065).
The initial write-up was done by PD. Figures were drawn by PD. The final write-up and the concept were contributed by UB and MPB.