Gene transcriptions/Core promoters

A core promoter is that portion of the proximal promoter that contains the transcription start sites.

The diagram shows the RNA polymerase II holoenzyme attached to the DNA template strand. Credit: ArneLH.

Biochemical definition: the minimal stretch of DNA sequence that is sufficient to direct accurate initiation of transcription. A core promoter is typically 60 to 120 base pairs long.

Genomics definition: short sequences surrounding the transcription start sites (TSSs).

The core promoter contains a binding site for an RNA polymerase holoenzyme (RNA polymerase I, RNA polymerase II, or RNA polymerase III).

A vast network of regulatory factors that contribute to the initiation of transcription by RNA polymerase ultimately targets any specific gene’s core promoter.

The core promoter includes the transcription start site(s) (TSS).

That portion of the core promoter that is upstream of the TSS is also part of the proximal promoter.

The core promoter extends to approximately 34 bp upstream of the TSS (position -34). "Several factors have been identified that bind to core promoters (reviewed in Smale, 1997)"[1][2].



Genetics involves the expression, transmission, and variation of inherited characteristics.

Gene transcriptions


DNA is a double helix of interlinked nucleotides surrounded by an epigenome. On the basis of biochemical signals, an enzyme, specifically a ribonucleic acid (RNA) polymerase, is chemically bonded to one of the strands (the template strand) of this double helix. The polymerase, once phosphorylated, begins to catalyze the formation of RNA using the template strand. Although the catalysis may have more than one beginning nucleotide (a start site) and more than one ending nucleotide (a stop site) along the DNA, each nucleotide sequence catalyzed that ultimately produces approximately the same RNA is part of a gene. The catalysis of each RNA representation from the template DNA is a transcription, specifically a gene transcription. The overall process is also referred to as gene transcription.



Def. a "section of DNA that controls the initiation of RNA transcription as a product of a gene"[3] is called a promoter.

Proximal promoters


Def. a section of promoter DNA that includes the transcription start sites and the sequence neighboring them is called a proximal promoter.



Def. a central or most important part of something is called a core.

Theoretical core promoters


Def. "the factors, including RNA polymerase II itself, that are minimally essential for transcription in vitro from an isolated core promoter" is called the basal machinery, or basal transcription machinery.[4]

Def. one or more sequence motifs that contain the transcription start sites (TSSs), are juxtaposed to the motif containing the TSSs, or lie in the proximal promoter, and that are found only in this core of motifs, is called a core promoter.

Metal responsive elements


A metal responsive element (MRE), or TGC box, may occur in the core promoter of some human genes.

"The metallothionein (MT) genes provide a good example of eucaryotic promoter architecture. MT genes specify the synthesis of low-molecular-weight metal-binding proteins. They are transcriptionally regulated by the metal ions cadmium and zinc (11), glucocorticoid hormones (18), interferon (14), interleukin-1 (22), and tumor promoters (2). The metal ion regulation of MTs is conferred by a short sequence element called the metal-responsive element (MRE [21]) or TGC box (31, 34), which functions as a metal ion-dependent enhancer."[5]

GC boxes


Def. a "sequence of contiguous guanine, guanine, guanine, cytosine, and guanine, in that order, along a DNA strand"[6] is called a GC box.

"[A] GC box is a distinct pattern of nucleotides found in the promoter region of some eukaryotic genes upstream of the TATA box and approximately 110 bases upstream from the transcription initiation site. It has a consensus sequence GGGCGG which is position dependent and orientation independent. The GC elements are bound by transcription factors and have similar functions to enhancers.[7]"[8]
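Because the GGGCGG consensus is described as orientation independent, a search for GC boxes would normally examine both orientations. The following is a minimal sketch in Python, assuming the promoter is given as a plain string of A, C, G, and T; the function names and the example sequence are hypothetical and are not taken from the cited sources.

```python
# Sketch: find GC boxes (GGGCGG) in either orientation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_gc_boxes(seq: str, motif: str = "GGGCGG"):
    """Return a list of (0-based start, orientation) for each occurrence.

    '+' means the motif appears as written; '-' means its reverse
    complement appears, i.e. the motif lies on the opposite strand.
    """
    seq = seq.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits

# Hypothetical sequence, made up for illustration only.
example = "TTGGGCGGAACCGCCCTT"
print(find_gc_boxes(example))  # [(2, '+'), (10, '-')]
```

A position-dependent check would additionally compare each reported start with the expected distance from the transcription start site.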

"A large subclass of polymerase II promoters lacks both TATAA and CCAAT sequence motifs but contains multiple GC boxes. This promoter class includes several housekeeping genes (e.g., the genes encoding dihydrofolate reductase [DHFR] ..., hydroxymethylglutaryl coenzyme A reductase [39], hypoxanthine guanine phosphoribosyltransferase [33], and adenosine deaminase [46]) [and] nonhousekeeping genes (e.g., the transforming growth factor alpha [9, 23], rat malic enzyme [36], human c-Ha-ras [21], epidermal growth factor receptor [22], and nerve growth factor receptor [42] genes)."[9]

"[A] GC box-binding factor is required for transcription and ... a truncated promoter containing one GC box is transcriptionally inactive (44). ... the DNA-protein interactions occurring at the GC boxes in the DHFR promoter are functionally distinct and that factors binding to the GC boxes must interact in a position-dependent manner."[9]

"In promoters containing multiple GC boxes but lacking the TATAA box, transcription start sites may be single and specific, as observed in the nerve growth factor receptor gene (42) and the cellular retinol-binding protein gene (37), or there may be multiple heterogeneous start sites, such as those found in the c-myb (4), insulin receptor (45), and Ha-ras (21) genes. ... GC boxes are responsible for directing transcription from the major and the minor start sites. ... All TATAA-less promoters have at least two GC boxes"[9].

"A GC box sequence, one of the most common regulatory DNA elements of eukaryotic genes, is recognized by the Sp1 transcription factor; its consensus sequence is represented as 5'-G/T G/A GGCG G/T G/A G/A C/T-3' [or 5′-KRGGCGKRRY-3′] (Briggs et al., 1986)."[10]

HY boxes


One core responsive element is the hypertrophy box (HY box), located between nucleotides (nts) -89 and -60 relative to the transcription start site.[11]

CAAT boxes


"[A] CCAAT box (also sometimes abbreviated a CAAT box or CAT box) is a distinct pattern of nucleotides with GGCCAATCT consensus sequence that occur upstream by 75-80 bases to the initial transcription site. The CAAT box signals the binding site for the RNA transcription factor, and is typically accompanied by a conserved consensus sequence. It is an invariant DNA sequence at about minus 70 base pairs from the origin of transcription in many eukaryotic promoters. Genes that have this element seem to require it for the gene to be transcribed in sufficient quantities. It is frequently absent from genes that encode proteins used in virtually all cells. This box along with the GC box is known for binding general transcription factors. CAAT and GC are primarily located in the region from 100-150bp upstream from the TATA box. Both of these consensus sequences belong to the regulatory promoter. Full gene expression occurs when transcription activator proteins bind to each module within the regulatory promoter. Protein specific binding is required for the CCAAT box activation. These proteins are known as CCAAT box binding proteins/CCAAT box binding factors. A CCAAT box is a feature frequently found before eukaryote coding regions".[12]

B recognition elements


"The B recognition element (BRE) is a DNA sequence found in the promoter region of most genes in eukaryotes and Archaea.[13][14] The BRE is a cis-regulatory element that is found immediately upstream of the TATA box, and consists of 7 nucleotides."[15]

"The Transcription Factor IIB (TFIIB) recognizes this sequence in the DNA, and binds to it. The fourth and fifth alpha helices of TFIIB intercalate with the major groove of the DNA at the BRE. TFIIB is one part of the preinitiation complex that helps RNA Polymerase II bind to the DNA."[15]

The consensus sequence is 5'-G/C G/C G/A C G C C-3'.[16]

The general consensus sequence using degenerate nucleotides is 5'-SSRCGCC-3', where S = G or C and R = A or G.[17]
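A degenerate consensus such as SSRCGCC can be turned into a search pattern by replacing each degenerate letter with the set of bases it stands for. The sketch below assumes the standard IUPAC nucleotide codes and uses Python's `re` module; the helper names and the example sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes (only those needed for this article).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "D": "AGT", "N": "ACGT",
}

def consensus_to_regex(consensus: str) -> str:
    """Translate a degenerate consensus into a regular expression."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

def find_consensus(seq: str, consensus: str):
    """Return (0-based start, matched text) for every match.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile(f"(?=({consensus_to_regex(consensus)}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

BRE = "SSRCGCC"
print(consensus_to_regex(BRE))  # [CG][CG][AG]CGCC

# Hypothetical sequence: a BRE-like site placed just before a TATA-like run.
example = "TTGGACGCCTATAAAAG"
print(find_consensus(example, BRE))  # [(2, 'GGACGCC')]
```

The same helper accepts the other degenerate consensus sequences quoted in this resource, for example KRGGCGKRRY for the GC box.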

"The position in nucleotides (nt) relative to the transcription start site (TSS, +1)" is -35 for the BRE. Of human promoters, some "22-25% [are] BRE containing promoters ... the functional consensus sequences for BRE ... motif [is] still poorly defined."[17]

EIF4E basal elements


The eIF4E basal element (4EBE) is a basal promoter element of the gene for eukaryotic translation initiation factor 4E (eIF4E). "Interactions between 4EBE and upstream activator sites are position, distance, and sequence dependent."[18]

TATA boxes


Def. a "DNA sequence (cis-regulatory element) found in the promoter region of genes in archaea and eukaryotes"[19] is called a TATA box.

The TATA box can be an AT-rich sequence "located at a fixed distance upstream of the transcription start site"[4].

TBP-like factors


Notation: let the symbol TLF designate a TATA binding protein-like factor.

The human gene TBPL1 (TBP-like 1, also TLF and TRF2[4]), GeneID: 9519, encodes a protein that "does not bind to the TATA box and initiates transcription from TATA-less promoters."[20]

Downstream TFIIB recognition


The downstream TFIIB recognition element (dBRE) has a consensus sequence in the transcription direction on the template strand of 3'-RTDKKKK-5', using degenerate nucleotides, or 3'-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5'.[21]
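The two notations used above (single degenerate letters and slash-separated alternatives) carry the same information. As a hedged illustration, the sketch below converts between them, assuming the standard IUPAC codes; the function names are hypothetical.

```python
# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases -> single-letter code.
REVERSE = {frozenset(bases): code for code, bases in IUPAC.items()}

def to_slash_notation(consensus: str) -> str:
    """'RTDKKKK' -> 'A/G-T-A/G/T-G/T-G/T-G/T-G/T'."""
    return "-".join("/".join(IUPAC[c]) for c in consensus.upper())

def from_slash_notation(text: str) -> str:
    """'A/G-T-A/G/T-G/T-G/T-G/T-G/T' -> 'RTDKKKK'."""
    return "".join(
        REVERSE[frozenset(token.split("/"))] for token in text.upper().split("-")
    )

print(to_slash_notation("RTDKKKK"))
print(from_slash_notation("A/G-T-A/G/T-G/T-G/T-G/T-G/T"))
```

The conversion only rewrites the letters; it does not change the strand or the 5'/3' labels attached to a sequence.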

The dBRE lies just downstream of the TATA box, between the TATA box and the Inr or transcription start site (TSS), and so upstream of the TSS.[21]

Initiator elements


For RNA polymerase II holoenzyme to transcribe a gene, the gene's promoter must be located. After the promoter is located, the transcription start site (TSS) is pinpointed by using nucleotide sequences that include the TSS or perhaps allow distance measurement to the TSS. Within the promoter, most human genes lack a TATA box and have an initiator element (Inr) or downstream promoter element instead.

"RNA pol II itself recognizes features of the Inr which might assist the correct positioning of the polymerase on the promoter (Carcamo et al., 1991; Weis and Reinberg, 1997)."[1][22][23]

Transcription start sites


The transcription start site (TSS) is the location on the DNA template strand where transcription begins at the 3'-end of a gene.[24] This location corresponds to the 5'-end of the mRNA which by convention is used to designate DNA locations.[24] For example, the 5'-TATA-box-3' designation refers to the directionality of the mRNA and corresponds to the 3'-TATA-box-5' designation for nucleotides on the template strand.[24] The template strand is the DNA strand being transcribed by RNA polymerase.[24]
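Element positions in this resource are given relative to the TSS at +1. The sketch below shows one way to convert such positions to indices in a sequence string. It assumes the common convention, not stated in the text above, that there is no position 0, so that -1 immediately precedes +1. The function names and the `tss_index` parameter are hypothetical.

```python
def tss_position_to_index(position: int, tss_index: int) -> int:
    """Convert a TSS-relative position to a 0-based string index.

    `tss_index` is the 0-based index of the +1 nucleotide in the string.
    Assumes there is no position 0: ..., -2, -1, +1, +2, ...
    """
    if position == 0:
        raise ValueError("position 0 is not used; the TSS is +1")
    if position > 0:
        return tss_index + position - 1
    return tss_index + position

def index_to_tss_position(index: int, tss_index: int) -> int:
    """Inverse of tss_position_to_index."""
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset

# With the +1 nucleotide stored at index 40 of a hypothetical sequence:
print(tss_position_to_index(+1, 40))   # 40
print(tss_position_to_index(-1, 40))   # 39
print(tss_position_to_index(-35, 40))  # 5
print(tss_position_to_index(+28, 40))  # 67
```

Under this convention a window from position a to position b on the same side of the TSS contains b - a + 1 nucleotides.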

Downstream core elements


"[N]onredundant human promoter sequences 600 bp long (−499 to +100 bp around the TSS) [are available] from [the] Eukaryotic Promoter Database (EPD) release 75 (4, 68), and ... promoters sequences 1,200 bp long (−1,000 to +200 bp) [are available] from the Database of Transcriptional Start Sites (DBTSS) (59, 74, 75)"[25].

The downstream core element (DCE) is a transcription core promoter sequence that is within the transcribed portion of a gene.

The consensus sequence for the DCE is CTTC...CTGT...AGC.[25] These three consensus elements are referred to as subelements: "SI is CTTC, SII is CTGT, and SIII is AGC."[25]

The number of nucleotides between each subelement can apparently vary down to none.

A core promoter that contains all three subelements may be much less common than one containing only one or two.[25] "SI resides approximately from +6 to +11, SII from +16 to +21, and SIII from +30 to +34."[25]

SI as 3'-CTTC-5' can occur as 3 of 4 (CTT, TTC) or 4 of 4 (CTTC). SII as 3'-CTGT-5' can also occur as 3 of 4 (CTG, TGT) or 4 of 4 (CTGT). SIII as AGC is not known to vary.
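The subelement windows and the 3-of-4 variants described above can be checked mechanically. The sketch below is only an illustration under stated assumptions: the input string starts at position +1, it is written in the same orientation as the consensus letters, and a subelement counts as present if it occurs anywhere inside its quoted window. The names `DCE_SUBELEMENTS` and `classify_dce` and the example sequence are hypothetical.

```python
# Windows (1-based, inclusive, relative to +1) and motifs from the text above.
DCE_SUBELEMENTS = {
    "SI":   {"window": (6, 11),  "full": "CTTC", "partial": ("CTT", "TTC")},
    "SII":  {"window": (16, 21), "full": "CTGT", "partial": ("CTG", "TGT")},
    "SIII": {"window": (30, 34), "full": "AGC",  "partial": ()},
}

def classify_dce(seq_from_plus1: str) -> dict:
    """Report 'full', 'partial', or 'absent' for each DCE subelement.

    `seq_from_plus1[0]` is assumed to be the +1 nucleotide.
    """
    seq = seq_from_plus1.upper()
    result = {}
    for name, spec in DCE_SUBELEMENTS.items():
        start, end = spec["window"]
        region = seq[start - 1:end]  # +start..+end as a Python slice
        if spec["full"] in region:
            result[name] = "full"
        elif any(p in region for p in spec["partial"]):
            result[name] = "partial"
        else:
            result[name] = "absent"
    return result

# Hypothetical 34-nt sequence assembled window by window.
example = "AAAAA" + "ACTTCA" + "AAAA" + "ATGTAA" + "AAAAAAAA" + "AAGCA"
print(classify_dce(example))
# {'SI': 'full', 'SII': 'partial', 'SIII': 'full'}
```

A stricter check could require each subelement at an exact offset; the window-based test is a deliberate simplification.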

DCE SIII can function independently of SI and SII.[25]

Transcription factor II D (TFIID), a transcription factor that is part of the RNA polymerase II holoenzyme, interacts with promoters containing only SIII of the DCE, suggesting a critical spacing parameter between SIII and the TATA box, initiator element, or some combination of the two.[25] TFIID probably serves as a core promoter recognition complex.[25]

TAF1 interacts with the DCE in a sequence-dependent manner.[25]

The differences between core promoters with downstream elements may be explained by

  1. "TATA- and DPE-dependent promoters are specific for particular enhancers"[25],
  2. "preferences of activators for specific core promoter architectures"[25], and
  3. "the presence of a DCE or [downstream core promoter element (DPE)] might be indicative of an architecture designed for specific regulatory networks, such as the regulation of housekeeping promoters versus tissue-specific promoters (or other highly regulated promoters) or the regulation of subsets of viral promoters."[25]

Motif ten elements


The motif ten element (MTE) is a downstream core promoter element that "promotes transcription by RNA polymerase II when it is located precisely at positions +18 to +27 relative to A+1 in the initiator (Inr) element."[26]

The motif 10 consensus sequence is CSARCSSAACGS [5'-C-C/G-A-A/G-C-C/G-C/G-A-A-C-G-C/G-3'].[26] By convention, the consensus sequence 5'-C-C/G-A-A/G-C-C/G-C/G-A-A-C-G-C/G-3' is stated as it would be transcribed into mRNA. In the direction of transcription on the template strand this consensus sequence becomes 3'-C-C/G-A-A/G-C-C/G-C/G-A-A-C-G-C/G-5'.
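To see how many concrete sequences the degenerate motif 10 consensus covers, the degenerate positions can be expanded. The sketch below assumes the standard meanings S = C or G and R = A or G; the `expand` helper is hypothetical. With four S positions and one R position, the consensus covers 2^5 = 32 sequences.

```python
from itertools import product

# Only the codes that occur in the motif 10 consensus are needed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "CG", "R": "AG"}

def expand(consensus: str):
    """List every concrete sequence matched by a degenerate consensus."""
    choices = (IUPAC[code] for code in consensus.upper())
    return ["".join(bases) for bases in product(*choices)]

MTE = "CSARCSSAACGS"
variants = expand(MTE)
print(len(variants))   # 32
print(variants[0])     # CCAACCCAACGC
```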

Downstream promoter elements


"The downstream promoter element (DPE) is a core promoter element ... present in other species including humans and excluding Saccharomyces cerevisiae.[27] Like all core promoters, the DPE plays an important role in the initiation of gene transcription by RNA polymerase II."[28]

The core sequence of the DPE is located precisely +28 to +32 nts relative to the A+1 nt in the Inr.[16]


  1. Each portion of a DNA that becomes active has a core promoter.
  2. The "minimal portion of the promoter required to properly initiate transcription".[29]

See also



  1. Gillian E. Chalkley; C. Peter Verrijzer (September 1, 1999). "DNA binding site selection by RNA polymerase II TAFs: a TAFII250-TAFII150 complex recognizes the Initiator". The EMBO Journal 18 (17): 4835-45. PMID 10469661. Retrieved 2012-04-26.
  2. S. T. Smale (1997). "Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes". Biochim. Biophys. Acta. 1351: 73-88.
  3. Ceyockey (28 January 2005). promoter. San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2012-09-29.
  4. Stephen T. Smale; James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter". Annual Review of Biochemistry 72 (1): 449-79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
  5. Robert D. Andersen; Susan J. Taplitz; Sandy Wong; Greg Bristol; Bill Larkin; Harvey R. Herschman (October 1987). "Metal-Dependent Binding of a Factor In Vivo to the Metal-Responsive Elements of the Metallothionein 1 Gene Promoter". Molecular and Cellular Biology 7 (10): 3574-81. doi:10.1128/MCB.7.10.3574. Retrieved 2013-04-15.
  6. Msh210 (23 February 2010). "GC box". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2013-01-27.
  7. Klug WS; Cummings MR; Spencer CA; Palladina MA (2009). Concepts of Genetics: Ninth Edition. San Francisco: Pearson Benjamin Cummings. pp. 463–464. ISBN 978-0-321-54098-0.
  8. "GC box, In: Wikipedia". San Francisco, California: Wikimedia Foundation, Inc. June 23, 2012. Retrieved 2013-01-27.
  9. Michael C. Blake; Robert C. Jambou; Andrew G. Swick; Jeanne W. Kahn; Jane Clifford Azizkhan (December 1990). "Transcriptional Initiation Is Controlled by Upstream GC-Box Interactions in a TATAA-Less Promoter". Molecular and Cellular Biology 10 (12): 6632-41. doi:10.1128/MCB.10.12.6632. PMID 2247077. Retrieved 2013-01-27.
  10. H Imataka; K Sogawa; KI Yasumoto; Y Kikuchi; K Sasano; A Kobayashi; M Hayami; Y Fujii-Kuriyama (October 1992). "Two regulatory proteins that bind to the basic transcription element (BTE), a GC box sequence in the promoter region of the rat P-4501A1 gene". The EMBO Journal 11 (10): 3663-71. PMID 1356762. Retrieved 2013-01-27.
  11. Akiro Higashikawa; Taku Saito; Toshiyuki Ikeda; Satoru Kamekura; Naohiro Kawamura; Akinori Kan; Yasushi Oshima; Shinsuke Ohba et al. (January 2009). "Identification of the core element responsive to runt-related transcription factor 2 in the promoter of human type x collagen gene". Arthritis & Rheumatism 60 (1): 166-78. doi:10.1002/art.24243. PMID 19116917. Retrieved 2013-06-18.
  12. "CAAT box". San Francisco, California: Wikimedia Foundation, Inc. April 8, 2013. Retrieved 2013-04-14.
  13. Lagrange T; Kapanidis AN; Tang H; Reinberg D; Ebright RH (1998). "New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor IIB". Genes & Development 12 (1): 34–44. doi:10.1101/gad.12.1.34. PMID 9420329. PMC 316406.
  14. Littlefield O; Korkhin Y; Sigler PB (1999). "The structural basis for the oriented assembly of a TBP/TFB/promoter complex". Proceedings of the National Academy of Sciences of the USA 96 (24): 13668–73. doi:10.1073/pnas.96.24.13668. PMID 10570130. PMC 24122.
  15. "B recognition element, In: Wikipedia". San Francisco, California: Wikimedia Foundation, Inc. January 30, 2013. Retrieved 2013-01-30.
  16. Alan K. Kutach; James T. Kadonaga (July 2000). "The Downstream Promoter Element DPE Appears To Be as Widely Used as the TATA Box in Drosophila Core Promoters". Molecular and Cellular Biology 20 (13): 4754-64. PMID 10848601. Retrieved 2012-07-15.
  17. Chuhu Yang; Eugene Bolotin; Tao Jiang; Frances M. Sladek; Ernest Martinez (March 7, 2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene 389 (1): 52-65. doi:10.1016/j.gene.2006.09.029. PMID 17123746.
  18. Mary Lynch; Li Chen; Michael J. Ravitz; Sapna Mehtani; Kevin Korenblat; Michael J. Pazin; Emmett V. Schmidt (August 2005). "hnRNP K Binds a Core Polypyrimidine Element in the Eukaryotic Translation Initiation Factor 4E (eIF4E) Promoter, and Its Regulation of eIF4E Contributes to Neoplastic Transformation". Molecular and Cellular Biology 25 (15): 6436-53. doi:10.1128/MCB.25.15.6436-6453.2005. Retrieved 2013-03-17.
  19. "TATA box, In: Wiktionary". San Francisco, California: Wikimedia Foundation, Inc. June 17, 2013. Retrieved 2014-05-07.
  20. National Center for Biotechnology Information (April 28, 2012). "TBPL1 TBP-like 1 [ Homo sapiens ]". Bethesda, MD, USA: U.S. National Library of Medicine. Retrieved 2012-04-30.
  21. Wensheng Deng; Stefan G.E. Roberts (October 15, 2005). "A core promoter element downstream of the TATA box that is recognized by TFIIB". Genes & Development 19 (20): 2418–23. doi:10.1101/gad.342405. PMID 16230532.
  22. J. Carcamo; L. Buckbinder; D. Reinberg (1991). "The initiator directs the assembly of a transcription factor IID-dependent transcription complex". Proc. Natl. Acad. Sci. USA 88: 8052-6.
  23. L. Weis; D. Reinberg (1997). "Accurate positioning of RNA polymerase II on a natural TATA-less promoter is independent of TATA-binding protein associated factors and initiator-binding proteins". Mol. Cell. Biol. 17: 2973-84.
  24. Marketa J. Zvelebil; Jeremy O. Baum (2008). Dom Holdsworth. ed. Understanding bioinformatics. New York: Garland Science. pp. 772. ISBN 978-0815340249.
  25. Dong-Hoon Lee; Naum Gershenzon; Malavika Gupta; Ilya P. Ioshikhes; Danny Reinberg; Brian A. Lewis (November 2005). "Functional Characterization of Core Promoter Elements: the Downstream Core Element Is Recognized by TAF1". Molecular and Cellular Biology 25 (21): 9674-86. doi:10.1128/MCB.25.21.9674-9686.2005. PMID 16227614. Retrieved 2010-10-23.
  26. Chin Yan Lim; Buyung Santoso; Thomas Boulay; Emily Dong; Uwe Ohler; James T. Kadonaga (July 1, 2004). "The MTE, a new core promoter element for transcription by RNA polymerase II". Genes & Development 18 (13): 1606-17. doi:10.1101/gad.1193404. PMID 15231738. Retrieved 2013-02-10.
  27. Tamar Juven-Gershon; James T. Kadonaga (March 15, 2010). "Regulation of Gene Expression via the Core Promoter and the Basal Transcriptional Machinery". Developmental Biology 339 (2): 225–9. doi:10.1016/j.ydbio.2009.08.009. PMID 19682982. PMC 2830304.
  28. "Downstream promoter element". San Francisco, California: Wikimedia Foundation, Inc. May 6, 2012. Retrieved 2012-05-20.
  29. Cquan (2 October 2006). "Promoter (genetics)". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2016-01-09.
