For any human protein to be synthesized, the gene containing the nucleotide sequence that is translated into that protein must first be transcribed. Before the RNA polymerase II holoenzyme can transcribe the gene, the gene's promoter must be located. Once the promoter is located, the transcription start site (TSS) is pinpointed using nucleotide sequences that include the TSS. Within the promoter, most human genes lack a TATA box and have an initiator element (Inr) or downstream promoter element instead.

Based on the available descriptions, candidate Inrs can be located in a promoter to test whether they coincide with the known TSS.
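A minimal Python sketch of such a test follows. It assumes the seven-nucleotide consensus YYANWYY with the A as the +1 nucleotide; the function names, the toy sequence and the choice of index 11 as the "known" TSS are hypothetical and serve only as an illustration.

```python
import re

# IUPAC degenerate codes needed for the assumed Inr consensus YYANWYY.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "W": "[AT]", "N": "[ACGT]"}


def find_inr_candidates(promoter, consensus="YYANWYY", plus_one_offset=2):
    """Return the 0-based index of the putative +1 nucleotide of every match."""
    pattern = "".join(IUPAC[code] for code in consensus)
    # The lookahead makes overlapping matches visible as well.
    regex = re.compile("(?=(" + pattern + "))")
    return [m.start() + plus_one_offset
            for m in regex.finditer(promoter.upper())]


def inr_locates_tss(promoter, known_tss_index):
    """True if some Inr match places its +1 nucleotide on the known TSS."""
    return known_tss_index in find_inr_candidates(promoter)


# Hypothetical toy promoter; the A at index 11 plays the role of the known TSS.
toy_promoter = "GGGCGGGGCTCAGTCTGGGCGG"
print(find_inr_candidates(toy_promoter))
print(inr_locates_tss(toy_promoter, 11))
```

Only the strand given is scanned; a real search would also examine the complementary strand and would need genomic coordinates for the TSS.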

Notations

Notation: let the symbol Inr denote an initiator element.

Notation: let the symbol +1 designate the nucleotide that is the transcription start site (TSS).

Genetics

The Inr in humans was first described and sequenced in 1989.[1]

The Inr element for core promoters was found to be more prevalent than the TATA box in eukaryotic promoter domains.[2] In a study of 1800+ distinct human promoter sequences it was found that 49% contain the Inr element while 21.8% contain the TATA box.[2]

Gene transcriptions

Two subunits of TFIID, TAF1 and TAF2, recognize the Inr sequence and bring the complex together.[3]

The interaction between TFIID and the Inr is believed to be the most important for initiating transcription because the Inr sequence overlaps the start site.[4]

The Inr element is also believed to interact with the activator specificity protein 1 (Sp1) transcription factor, which is then able to regulate the activation and initiation of transcription.[5]

Promoters with a functional Inr are more likely to lack a TATA box or to possess a degenerate TATA sequence because a gene with an active Inr is less dependent on a functional TATA box or additional promoters.[6] Although the Inr element varies between promoters, the sequence is highly conserved between humans and yeast.[6] An analysis of 7670 transcription start sites showed that roughly 40% had an exact match to the BBCA+1BW Inr sequence, while 16% contained only one mismatch.[7]

TFIID and its subunits are very sensitive to the Inr sequence, and nucleotide changes have been shown to drastically change the binding affinity; the +1 and -3 positions have been identified as the most critical for transcription efficiency and Inr function.[6] A replacement of the adenine (A) at the +1 position with G or T changes transcription activity by 10%, and a replacement of the thymine (T) at the +3 position changes transcription activity levels by 22%.[8]
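Counting exact and one-mismatch agreement with BBCA+1BW can be sketched as follows. This is an illustrative Python sketch, not the procedure of the cited analysis; it assumes the standard IUPAC meanings B = C, G or T and W = A or T, and the function name and example sequences are hypothetical.

```python
# Allowed bases for each IUPAC code that occurs in the consensus BBCABW.
IUPAC_SETS = {"A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
              "B": {"C", "G", "T"}, "W": {"A", "T"}}


def inr_mismatches(sequence, tss_index, consensus="BBCABW", plus_one_offset=3):
    """Count positions around the TSS that disagree with the consensus.

    tss_index is the 0-based index of the +1 nucleotide; plus_one_offset is
    the position of the +1 A inside the consensus (the fourth letter here).
    """
    start = tss_index - plus_one_offset
    end = start + len(consensus)
    if start < 0 or end > len(sequence):
        raise ValueError("window around the TSS falls outside the sequence")
    window = sequence[start:end].upper()
    return sum(base not in IUPAC_SETS[code]
               for base, code in zip(window, consensus))


# Hypothetical sequences, each with the +1 nucleotide at index 5.
print(inr_mismatches("GGTTCATTGG", 5))  # window TTCATT
print(inr_mismatches("GGTTCGTTGG", 5))  # window TTCGTT, G at +1
```

Applying the function to a set of annotated start sites and tallying how many return 0 or 1 would give proportions of the kind reported above.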

Theoretical initiator elements

Here's a theoretical definition:

Def. a series of nucleotides including a transcription start site on one DNA strand whose presence in a gene promoter eventually leads to a chain reaction or polymerization such as transcription is called an initiator element.

RNA polymerase IIs

"RNA pol II itself recognizes features of the Inr which might assist the correct positioning of the polymerase on the promoter (Carcamo et al., 1991; Weis and Reinberg, 1997)."[9][10][11]

RNA polymerase II may form a stable complex on TATA-less promoters that contain Inr elements and possess a weak, intrinsic preference for Inr-like sequences.[10]

RNA polymerase II holoenzyme complexes

Gene ID: 672 is BRCA1 BRCA1, DNA repair associated. "This gene encodes a nuclear phosphoprotein that plays a role in maintaining genomic stability, and it also acts as a tumor suppressor. The encoded protein combines with other tumor suppressors, DNA damage sensors, and signal transducers to form a large multi-subunit protein complex known as the BRCA1-associated genome surveillance complex (BASC). This gene product associates with RNA polymerase II, and through the C-terminal domain, also interacts with histone deacetylase complexes. This protein thus plays a role in transcription, DNA repair of double-stranded breaks, and recombination. Mutations in this gene are responsible for approximately 40% of inherited breast cancers and more than 80% of inherited breast and ovarian cancers. Alternative splicing plays a role in modulating the subcellular localization and physiological function of this gene. Many alternatively spliced transcript variants, some of which are disease-associated mutations, have been described for this gene, but the full-length natures of only some of these variants has been described. A related pseudogene, which is also located on chromosome 17, has been identified."[12]

Gene ID: 1660 is DHX9 DExH-box helicase 9 (aka LKP; RHA; DDX9; NDH2; NDHII). "This gene encodes a member of the DEAH-containing family of RNA helicases. The encoded protein is an enzyme that catalyzes the ATP-dependent unwinding of double-stranded RNA and DNA-RNA complexes. This protein localizes to both the nucleus and the cytoplasm and functions as a transcriptional regulator. This protein may also be involved in the expression and nuclear export of retroviral RNAs. Alternate splicing results in multiple transcript variants. Pseudogenes of this gene are found on chromosomes 11 and 13."[13]

BRCA1 has been shown to interact with DHX9: overexpression of a protein fragment of RNA helicase A causes inhibition of endogenous BRCA1 function and defects in ploidy and cytokinesis in mammary epithelial cells,[14] and the BRCA1 protein is linked to the RNA polymerase II holoenzyme complex via RNA helicase A.[15]

ATP-dependent RNA helicase A (RHA; also known as DHX9, LKP, and NDHII) is an enzyme that in humans is encoded by the DHX9 gene.[16][17][13]

RNA polymerase II subunit A C-terminal domain phosphatase is an enzyme that in humans is encoded by the CTDP1 gene.[18][19][20]

Gene ID: 9150 is CTDP1 CTD phosphatase subunit 1. "This gene encodes a protein which interacts with the carboxy-terminus of the RAP74 subunit of transcription initiation factor TFIIF, and functions as a phosphatase that processively dephosphorylates the C-terminus of POLR2A (a subunit of RNA polymerase II), making it available for initiation of gene expression. Mutations in this gene are associated with congenital cataracts, facial dysmorphism and neuropathy syndrome (CCFDN). Alternatively spliced transcript variants encoding different isoforms have been described for this gene."[21]

"This gene encodes a protein which interacts with the carboxy-terminus of transcription initiation factor TFIIF, a transcription factor which regulates elongation as well as initiation by RNA polymerase II. The protein may also represent a component of an RNA polymerase II holoenzyme complex. Alternative splicing of this gene results in two transcript variants encoding 2 different isoforms."[20]

CTDP1 has been shown to interact with WD repeat-containing protein 77,[22] GTF2F1[19] and POLR2A.[23]

Gene ID: 168400 is DDX53 DEAD-box helicase 53. "This intronless gene encodes a protein which contains several domains found in members of the DEAD-box helicase protein family. Other members of this protein family participate in ATP-dependent RNA unwinding."[24]

"DEAD/DEAH box helicases are proteins, and are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein with RNA helicase activity. It may participate in melting of DNA:RNA hybrids, such as those that occur during transcription, and may play a role in X-linked gene expression. It contains 2 copies of a double-stranded RNA-binding domain, a DEXH core domain and an RGG box. The RNA-binding domains and RGG box influence and regulate RNA helicase activity."[24]

Consensus sequences

As in other metazoans, in genes lacking a TATA box the Inr is functionally analogous to it and directs transcription initiation, with the consensus sequence 5'-YYA+1NWYY-3'.[25] Spelling out the degenerate nucleotide code, the consensus sequence is 5'-C/T-C/T-A-A/C/G/T-A/T-C/T-C/T-3' on the coding strand, or, as the complementary sequence on the template strand: 3'-A/G-A/G-T-A/C/G/T-A/T-A/G-A/G-5'.
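The relationship between the compact consensus and its spelled-out forms can be checked with a short Python sketch. It assumes the standard IUPAC codes (Y = C/T, R = A/G, W = A/T, N = any base) and assumes the template strand is the base-by-base complement of the coding strand; the function names are hypothetical.

```python
# Spelled-out alternatives for the IUPAC codes used here.
EXPAND = {"A": "A", "C": "C", "G": "G", "T": "T",
          "Y": "C/T", "R": "A/G", "W": "A/T", "N": "A/C/G/T"}

# Complement of each code: Y (C/T) pairs with R (G/A); W and N are unchanged.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "Y": "R", "R": "Y", "W": "W", "N": "N"}


def spell_out(consensus):
    """Write a degenerate consensus with every alternative made explicit."""
    return "-".join(EXPAND[code] for code in consensus)


def template_strand(consensus):
    """Complement position by position; the result reads 3'->5' when the
    input reads 5'->3' on the coding strand."""
    return "".join(COMPLEMENT[code] for code in consensus)


coding = "YYANWYY"
template = template_strand(coding)
print("5'-" + spell_out(coding) + "-3'")
print("3'-" + spell_out(template) + "-5'")
```

Because the complement is taken position by position without reversing, the second line pairs directly beneath the first.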

"TATA-less core promoters that lack AT-rich sequences in the -30 region and do not stably bind TBP are likely to assemble PICs via alternative pathways and to be regulated by distinct mechanisms (Smale and Kadonaga, 2003). However, the number of such bona fide TATA-less genes remains unclear in eukaryotic genomes."[26]

In Entamoeba histolytica, the consensus sequence is AAAAATTCA.[27]

The Inr has the consensus sequence YYANWYY.[28] Similarly to the TATA box, the Inr element facilitates the binding of transcription factor II D (TFIID), which consists of the TATA-binding protein and its associated factors (TAFs).[28]

Enhancers

An Inr for mammalian RNA polymerase II can be defined as a DNA sequence element that overlaps a TSS and is sufficient for

  1. determining the start site location in a promoter that lacks a TATA box and
  2. enhancing the strength of a promoter that contains a TATA box.[29]

TATA binding protein associated factors

"Although any isolated TAF may not exhibit sequence-specific interactions at the Inr element in the absence of a TATA-box, a combination of TAFs may bind sequence specifically to the Inr element regardless of the TATA-box and/or DPE (Chalkley and Verrijzer, 1999)."[30]

TAF1 "binds to core promoter sequences encompassing the transcription start site. It also binds to activators and other transcriptional regulators, and these interactions affect the rate of transcription initiation."[31]

Before transcription, a complex consisting of TAF1 and TAF2 binds stably to the Inr.[9]

TATA box-likes

The Inr is the only element in metazoan protein-encoding genes known to be a functional analog of the TATA box, in that it is sufficient for directing accurate transcription initiation in genes that lack TATA boxes.[32]

General transcription factor II As

General transcription factor II A is critical for the cooperative binding of TFIID to the Inr.[33]

General transcription factor II Ds

The general transcription factor II D (TFIID) is one of several general transcription factors that make up the RNA polymerase II preinitiation complex.[34] Before the start of transcription, the TFIID complex binds to the core promoter of the gene.[34]

TFIID is the first protein to bind to DNA during the formation of the pre-initiation transcription complex of RNA polymerase II (RNA Pol II).[34]

General transcription factor II Is

General transcription factor II I, or TFII-I, is a factor capable of binding the Inr element.[35][36]

Transcription start sites

Usually the Inr contains the TSS.

"[T]he initiator (INR) element located at, or immediately adjacent to, the TSS, ... is recognized by the TBP-associated factors TAF1 and TAF2 of the TFIID complex".[26]

"[T]ranscription does not need to begin at the +1 nucleotide for the Inr to function. RNA polymerase II has been redirected to alternative start sites by reducing ATP concentrations within a nuclear extract, by altering the spacing between the TATA and Inr in a promoter containing both elements, and by dinucleotide initiation strategies".[37]

Hypotheses

  1. A1BG is not transcribed by an initiator element.
  2. A1BG is not transcribed by a TATA box.

References

  1. Smale, Stephen T.; Baltimore, David (1989-04-07). "The "initiator" as a transcription control element". Cell 57 (1): 103–113. doi:10.1016/0092-8674(89)90176-1. ISSN 0092-8674. PMID 2467742. http://www.cell.com/cell/abstract/0092-8674(89)90176-1. 
  2. Gershenzon, Naum I.; Ioshikhes, Ilya P. (2005-04-15). "Synergy of human Pol II core promoter elements revealed by statistical sequence analysis". Bioinformatics 21 (8): 1295–1300. doi:10.1093/bioinformatics/bti172. ISSN 1367-4803. https://academic.oup.com/bioinformatics/article/21/8/1295/249581/Synergy-of-human-Pol-II-core-promoter-elements.
  3. Lim, Chin Yan; Santoso, Buyung; Boulay, Thomas; Dong, Emily; Ohler, Uwe; Kadonaga, James T. (2004-07-01). "The MTE, a new core promoter element for transcription by RNA polymerase II". Genes & Development 18 (13): 1606–1617. doi:10.1101/gad.1193404. ISSN 0890-9369. PMID 15231738. PMC 443522. //www.ncbi.nlm.nih.gov/pmc/articles/PMC443522/. 
  4. Kaufmann, J.; Smale, S. T. (1994-04-01). "Direct recognition of initiator elements by a component of the transcription factor IID complex.". Genes & Development 8 (7): 821–829. doi:10.1101/gad.8.7.821. ISSN 0890-9369. PMID 7926770. http://genesdev.cshlp.org/content/8/7/821. 
  5. O'Shea-Greenfield, A.; Smale, S. T. (1992-01-15). "Roles of TATA and initiator elements in determining the start site location and direction of RNA polymerase II transcription". The Journal of Biological Chemistry 267 (2): 1391–1402. ISSN 0021-9258. PMID 1730658. 
  6. Yang, Chuhu; Bolotin, Eugene; Jiang, Tao; Sladek, Frances M.; Martinez, Ernest (2007-03-01). "Prevalence of the Initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. ISSN 0378-1119. PMID 17123746. PMC 1955227. //www.ncbi.nlm.nih.gov/pmc/articles/PMC1955227/.
  7. Ngoc, Long Vo; Cassidy, California Jack; Huang, Cassidy Yunjing; Duttke, Sascha H. C.; Kadonaga, James T. (2017-01-20). "The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters". Genes & Development. doi:10.1101/gad.293837.116. ISSN 0890-9369. PMID 28108474. PMC 5287114. http://genesdev.cshlp.org/content/early/2017/01/20/gad.293837.116. 
  8. Javahery, R; Khachi, A; Lo, K; Zenzie-Gregory, B; Smale, S T (1994-01-01). "DNA sequence requirements for transcriptional initiator activity in mammalian cells.". Molecular and Cellular Biology 14 (1): 116–127. doi:10.1128/mcb.14.1.116. ISSN 0270-7306. PMID 8264580. PMC 358362. //www.ncbi.nlm.nih.gov/pmc/articles/PMC358362/. 
  9. Gillian E. Chalkley and C. Peter Verrijzer (September 1, 1999). "DNA binding site selection by RNA polymerase II TAFs: a TAFII250-TAFII150 complex recognizes the Initiator". The EMBO Journal 18 (17): 4835-45. PMID 10469661. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1171555/pdf/004835.pdf. Retrieved 2012-04-26.
  10. J. Carcamo, L. Buckbinder, D. Reinberg (1991). Proceedings of the National Academy of Sciences USA 88: 8052-6.
  11. L. Weis and D. Reinberg (1997). "Accurate positioning of RNA polymerase II on a natural TATA-less promoter is independent of TATA-binding protein associated factors and initiator-binding proteins". Mol. Cell. Biol. 17: 2973-84. 
  12. RefSeq (May 2009). BRCA1 BRCA1, DNA repair associated ( Homo sapiens (human) ). Bethesda, MD, USA: National Center for Biotechnology Information, U.S. National Library of Medicine. https://www.ncbi.nlm.nih.gov/gene/672. Retrieved 22 December 2018. 
  13. RefSeq (February 2010). DHX9 DExH-box helicase 9 ( Homo sapiens (human) ). Bethesda, MD, USA: National Center for Biotechnology Information, U.S. National Library of Medicine. https://www.ncbi.nlm.nih.gov/gene/1660. Retrieved 22 December 2018.
  14. "Overexpression of a protein fragment of RNA helicase A causes inhibition of endogenous BRCA1 function and defects in ploidy and cytokinesis in mammary epithelial cells". Oncogene 22 (7): 983–91. February 2003. doi:10.1038/sj.onc.1206195. PMID 12592385. 
  15. "BRCA1 protein is linked to the RNA polymerase II holoenzyme complex via RNA helicase A". Nat. Genet. 19 (3): 254–6. July 1998. doi:10.1038/930. PMID 9662397. 
  16. "Human RNA helicase A is homologous to the maleless protein of Drosophila". The Journal of Biological Chemistry 268 (22): 16822–30. Aug 1993. PMID 8344961. 
  17. "Domain structure of human nuclear DNA helicase II (RNA helicase A)". The Journal of Biological Chemistry 272 (17): 11487–94. Apr 1997. doi:10.1074/jbc.272.17.11487. PMID 9111062. 
  18. "An essential component of a C-terminal domain phosphatase that interacts with transcription factor IIF in Saccharomyces cerevisiae". Proc Natl Acad Sci U S A 94 (26): 14300–5. Feb 1998. doi:10.1073/pnas.94.26.14300. PMID 9405607. PMC 24951. //www.ncbi.nlm.nih.gov/pmc/articles/PMC24951/. 
  19. "FCP1, the RAP74-interacting subunit of a human protein phosphatase that dephosphorylates the carboxyl-terminal domain of RNA polymerase IIO". J Biol Chem 273 (42): 27593–601. Nov 1998. doi:10.1074/jbc.273.42.27593. PMID 9765293.
  20. Entrez Gene: CTDP1 CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) phosphatase, subunit 1. https://www.ncbi.nlm.nih.gov/sites/entrez?Db=gene&Cmd=ShowDetailView&TermToSearch=9150.
  21. RefSeq (February 2011). CTDP1 CTD phosphatase subunit 1 ( Homo sapiens (human) ). Bethesda, MD, USA: National Center for Biotechnology Information, U.S. National Library of Medicine. https://www.ncbi.nlm.nih.gov/gene/9150. Retrieved 22 December 2018. 
  22. Licciardo, Paolo; Amente Stefano; Ruggiero Luca; Monti Maria; Pucci Piero; Lania Luigi; Majello Barbara (Feb 2003). "The FCP1 phosphatase interacts with RNA polymerase II and with MEP50 a component of the methylosome complex involved in the assembly of snRNP". Nucleic Acids Res. (England) 31 (3): 999–1005. doi:10.1093/nar/gkg197. PMID 12560496. PMC 149217. //www.ncbi.nlm.nih.gov/pmc/articles/PMC149217/. 
  23. Scully, R; Anderson S F; Chao D M; Wei W; Ye L; Young R A; Livingston D M; Parvin J D (May 1997). "BRCA1 is a component of the RNA polymerase II holoenzyme". Proc. Natl. Acad. Sci. U.S.A. (UNITED STATES) 94 (11): 5605–10. doi:10.1073/pnas.94.11.5605. ISSN 0027-8424. PMID 9159119. PMC 20825. //www.ncbi.nlm.nih.gov/pmc/articles/PMC20825/. 
  24. RefSeq (September 2011). DDX53 DEAD-box helicase 53 ( Homo sapiens (human) ). Bethesda, MD, USA: National Center for Biotechnology Information, U.S. National Library of Medicine. https://www.ncbi.nlm.nih.gov/gene/168400. Retrieved 22 December 2018.
  25. DR Liston, PJ Johnson (March 1999). "Analysis of a Ubiquitous Promoter Element in a Primitive Eukaryote: Early Evolution of the Initiator Element". Molecular and Cellular Biology 19 (3): 2380-8. PMID 10022924. 
  26. C Yang, E Bolotin, T Jiang, FM Sladek, E Martinez (March 2007). "Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMID 17123746. PMC 1955227. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1955227/?tool=pubmed.
  27. JE Purdy, BJ Mann, LT Pho, WA Petri Jr (July 19, 1994). "Transient transfection of the enteric parasite Entamoeba histolytica and expression of firefly luciferase". Proceedings of the National Academy of Science USA 91 (15): 7099-103. PMID 8041752. http://www.pnas.org/cgi/pmidlookup?view=long&pmid=8041752. Retrieved 2012-06-10. 
  28. Hualin Xi, Yong Yu, Yutao Fu, Jonathan Foley, Anason Halees, and Zhiping Weng (June 2007). "Analysis of overrepresented motifs in human core promoters reveals dual regulatory roles of YY1". Genome Research 17 (6): 798–806. doi:10.1101/gr.5754707. PMID 17567998. PMC 1891339. //www.ncbi.nlm.nih.gov/pmc/articles/PMC1891339/.
  29. R. Javahery, A. Khachi, K. Lo, B. Zenzie-Gregory, S. T. Smale (January 1994). "DNA Sequence Requirements for Transcriptional Initiator Activity in Mammalian Cells". Molecular and Cellular Biology 14 (1): 116-27. PMID 8264580. 
  30. Ananda L. Roy (August 2001). "Biochemistry and biology of the inducible multifunctional transcription factor TFII-I". Gene 274 (1-2): 1-13. doi:10.1016/S0378-1119(01)00625-4. http://bioinformaticaupf.crg.cat/2006/projectes06/3.14/article2.pdf. Retrieved 2012-04-06. 
  31. HGNC:11535 (March 24, 2012). "TAF1 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 250kDa". Bethesda, Maryland: NCBI. Retrieved 2012-04-09.
  32. ST Smale (March 1997). "Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes". Biochimica & Biophysica Acta 1351 (1-2): 73-88. doi:10.1016/S0167-4781(96)00206-0. PMID 9116046. 
  33. KH Emami, A Jain, ST Smale (1997). Genes Development 11: 3007-19. 
  34. Benjamin Lewin (2004). Genes VIII. Upper Saddle River, NJ: Pearson Prentice Hall. pp. 636–637. ISBN 0-13-144946-X.
  35. AL Roy, M Meisterernst, P. Pognonec, RG Roeder (1991). Nature 354: 245-8. 
  36. AL Roy, S. Malik, M. Meisterernst, RG Roeder (1993). Nature 365: 355-9.
  37. Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter". Annual Review of Biochemistry 72 (1): 449-79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. http://www.lps.ens.fr/~monasson/Houches/Kadonaga/CorePromoterAnnuRev2003.pdf. Retrieved 2012-05-07. 
