Every protein your body has ever made — from haemoglobin carrying oxygen in your red cells to the enzyme catalysing the reaction you are studying right now — followed the same three-step production line. DNA stores the blueprint, transcription copies it into RNA, and translation reads that copy to build a protein. This flow of biological information is so fundamental that Francis Crick named it the central dogma of molecular biology in 1958, and understanding it in mechanistic depth is indispensable for any medical exam.
1 Overview: Why a Two-Step Intermediary?
Think of DNA as the master reference volume locked in the archive room (the nucleus). You would never take the original out — it is too precious, and too long. Instead, a photocopier (**RNA polymerase**) produces a working extract: a messenger RNA (mRNA) molecule containing only the chapter you need. This mRNA then travels out to the factory floor (the cytoplasm), where ribosomes decode it into a functional protein.
The dogma has a strict directionality: information flows from DNA to RNA to protein. Under normal circumstances it cannot run in reverse — a protein cannot dictate the sequence of a new RNA or DNA. The key word, however, is “normally.” Retroviruses such as HIV carry an enzyme called reverse transcriptase — an RNA-directed DNA polymerase — that allows the virus to write its RNA genome back into DNA for integration into the host chromosome. This is the most clinically important exception to the classical dogma, and it is precisely the target of antiretroviral drugs such as zidovudine (AZT).
| Process | Template | Product | Key Enzyme | Location (Eukaryote) |
|---|---|---|---|---|
| Replication | DNA (both strands) | New DNA double helix | DNA polymerase α / δ / ε | Nucleus |
| Transcription | DNA antisense strand | RNA (mRNA, tRNA, rRNA) | RNA polymerase I / II / III | Nucleus (+ processing) |
| Translation | mRNA | Polypeptide chain | Ribosome (peptidyl transferase) | Cytoplasm / RER |
| Reverse Transcription | RNA (viral) | cDNA → dsDNA | Reverse transcriptase | Cytoplasm (retroviruses) |
2 Step 1 — Transcription: Copying the Blueprint
Transcription is the process by which the nucleotide sequence of a gene is copied into a complementary RNA strand. RNA polymerase reads the antisense (template, −) strand of DNA in the 3′→5′ direction and synthesises an RNA chain in the 5′→3′ direction. The resulting RNA has the same sequence as the non-template (sense, coding, +) strand, with one difference: thymine (T) is replaced by uracil (U). Unlike DNA polymerase, RNA polymerase requires no primer.
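The template/coding relationship described above can be sketched in a few lines of Python. This is a toy model only: the function names and the example sequence are made up for illustration, and the template strand is assumed to be supplied already written 3′→5′.

```python
# Base-pairing rules: DNA template base -> RNA base added by the polymerase.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base-pairing rules: DNA template base -> base on the coding (sense) strand.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """Return the RNA (written 5'->3') copied from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def coding_strand(template_3to5):
    """Return the coding strand (written 5'->3') that pairs with the template."""
    return "".join(DNA_TO_DNA[base] for base in template_3to5.upper())


template = "TACAGCTTA"            # hypothetical template strand, 3'->5'
mrna = transcribe(template)       # "AUGUCGAAU"
sense = coding_strand(template)   # "ATGTCGAAT"

# The RNA matches the coding strand except that U replaces T.
print(mrna, sense, mrna == sense.replace("T", "U"))
```

Because the template is read 3′→5′ and the RNA grows 5′→3′, no string reversal is needed here; the two strands are simply antiparallel.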
Prokaryotes vs. Eukaryotes: In bacteria (e.g. E. coli), a single RNA polymerase handles all transcription. Its core enzyme (subunit composition α₂ββ′ω) can elongate RNA but cannot recognise a promoter on its own. It needs a sigma (σ) factor to locate and bind the promoter, forming the holoenzyme. Different sigma factors direct the polymerase to different gene sets — an elegant regulatory mechanism. Eukaryotes, by contrast, employ three distinct nuclear RNA polymerases, each with a specialised role.
“I Make Really Good Proteins”
I (Pol I): rRNA (28S, 18S and 5.8S, all cut from one large precursor transcript). II (Pol II): mRNA (protein-coding) + most non-coding small RNAs. III (Pol III): tRNA, 5S rRNA, small structural RNAs.
Trick: Pol II makes the most medically important product — the mRNA. It is also the one α-amanitin blocks.
Phases of Transcription (using a eukaryotic protein-coding gene as the model):
Initiation — Promoter Recognition & Open Complex Formation
Promoter DNA + General TFs + RNA Pol II → Transcription Initiation Complex
RNA Pol II cannot bind a promoter unaided. In prokaryotes, the σ factor guides the holoenzyme to two conserved sequences: the −35 sequence (5′-TTGACA-3′) and the Pribnow box (5′-TATAAT-3′) at −10. In eukaryotes, general transcription factors (TFIIs) assemble on the TATA box (Hogness box), located ~25 bp upstream, and the CAAT box at ~−75. Assembly is ordered and sequential: TFIID anchors to the TATA box, acting as the platform on which the rest of the complex assembles; TFIIF escorts RNA Pol II to this pre-assembled scaffold; finally, TFIIH unwinds a short stretch of DNA using its helicase domain, converting the closed complex into an open one, and phosphorylates the carboxy-terminal domain (CTD) of Pol II — the chemical signal that launches the polymerase into productive elongation.
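As a rough illustration of prokaryotic promoter recognition, the sketch below looks for exact copies of the two consensus sequences quoted above and reports the spacer between them. It is a deliberate simplification: real promoters rarely match the consensus perfectly, and the function name and toy sequence are assumptions made for this example.

```python
MINUS_35 = "TTGACA"   # -35 consensus, as given in the text
PRIBNOW = "TATAAT"    # -10 consensus (Pribnow box), as given in the text


def find_promoter(coding_5to3):
    """Return (pos_35, pos_10, spacer_length) for the first -35 box that is
    followed by a -10 box, or None if either exact match is missing."""
    seq = coding_5to3.upper()
    pos_35 = seq.find(MINUS_35)
    if pos_35 == -1:
        return None
    after_35 = pos_35 + len(MINUS_35)
    pos_10 = seq.find(PRIBNOW, after_35)
    if pos_10 == -1:
        return None
    return pos_35, pos_10, pos_10 - after_35


# Hypothetical sequence with an arbitrary 17-base spacer between the boxes.
toy = "AA" + MINUS_35 + "GCGCGCGCGCGCGCGCG" + PRIBNOW + "GCGCGCA"
print(find_promoter(toy))   # (2, 25, 17)
```

A more realistic scanner would score partial matches (for example with a position weight matrix) instead of demanding exact strings.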
Elongation — RNA Chain Growth
NTP + growing RNA → RNA(n+1) + PPi [5′→3′ synthesis]
In prokaryotes, the σ factor dissociates and the core enzyme takes over, elongating the chain using ribonucleoside 5′-triphosphates as substrates. The DNA double helix unwinds ahead of the polymerase (forming a moving transcription bubble) and re-winds behind it. In eukaryotes, after CTD phosphorylation, Pol II enters productive elongation. Transcription and mRNA processing are tightly coupled — capping, for example, begins on the nascent transcript when it is barely 30 nucleotides long.
Termination — Release of the Transcript
mRNA + RNA Pol II → released transcript + dissociated polymerase
In prokaryotes, Rho-independent termination involves formation of a GC-rich hairpin in the transcript followed by a run of U residues, which destabilises the RNA-DNA hybrid and releases the transcript. Rho-dependent termination requires the Rho protein, which tracks along the mRNA and triggers release. In eukaryotes, termination by Pol II is coupled to polyadenylation: after the polyadenylation signal sequence (5′-AAUAAA-3′) is synthesised, the transcript is cleaved downstream and a poly(A) tail of ~200 adenine residues is added by poly(A) polymerase.
3 Step 2 — mRNA Processing in Eukaryotes
Prokaryotic mRNA is essentially ready to translate as it emerges from the polymerase — in fact, ribosomes often begin translation before transcription is even finished (coupled transcription-translation). Eukaryotic cells, however, transcribe a longer precursor called pre-mRNA that must be extensively modified in the nucleus before export to the cytoplasm.
Three major processing events transform a pre-mRNA into a mature, export-ready mRNA:
5′ Capping — Protection and Translation Signal
5′ end of pre-mRNA + GTP → 7-methylguanosine cap (m⁷G) via 5′–5′ triphosphate bond
Capping is the very first processing event: the cap is added co-transcriptionally when the nascent RNA is barely 30 nucleotides long. A guanosine residue is linked to the 5′ end via an unusual 5′–5′ triphosphate bridge (not the usual 3′–5′ phosphodiester), then methylated at the N-7 position by guanine-7-methyltransferase using S-adenosylmethionine as methyl donor. The cap serves two purposes: it protects the 5′ end from ribonuclease degradation, and it is the recognition signal for eukaryotic initiation factors during translation. Prokaryotic mRNA has no cap; bacterial ribosomes instead locate the start codon through the Shine-Dalgarno sequence. Note that the selectivity of antibacterial translation inhibitors comes from structural differences between 70S and 80S ribosomes, not from the absence of a cap.
RNA Splicing — Removal of Introns
pre-mRNA (exons + introns) → mature mRNA (exons only) via spliceosome
Most eukaryotic genes are discontinuous: coding sequences (exons) are interrupted by non-coding sequences (introns). The pre-mRNA must be spliced to join the exons and excise the introns. Splice site consensus sequences mark the boundaries: introns almost invariably begin with GU (5′ splice site) and end with AG (3′ splice site), with a conserved branchpoint adenosine ~20–50 nucleotides upstream of the 3′ site. The spliceosome — a massive RNA-protein complex assembled from snRNPs (U1, U2, U4, U5, U6) — catalyses two transesterification reactions that release the intron as a branched “lariat” structure and ligate the flanking exons. Alternative splicing of the same pre-mRNA can generate multiple distinct proteins — the human genome (~20,000 genes) produces well over 100,000 distinct proteins largely through this mechanism.
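The GU...AG rule and exon joining can be mimicked with a small Python sketch. It assumes the intron coordinates are already known (the spliceosome, by contrast, has to find them) and it ignores the branchpoint; the function name and sequences are hypothetical.

```python
def splice(pre_mrna, introns):
    """Remove introns from pre_mrna.

    introns is a list of (start, end) index pairs, end exclusive.
    Each intron must begin with GU and end with AG.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} violates the GU...AG rule")
        exons.append(pre_mrna[cursor:start])   # keep the exon before the intron
        cursor = end                           # skip over the intron
    exons.append(pre_mrna[cursor:])            # keep the final exon
    return "".join(exons)


exon1, intron1, exon2 = "AUGUCG", "GUAAGUCUAACUUUCAG", "AAUUAA"
pre = exon1 + intron1 + exon2
mature = splice(pre, [(len(exon1), len(exon1) + len(intron1))])
print(mature)   # AUGUCGAAUUAA
```

Alternative splicing could be modelled in the same spirit by passing different intron lists for the same pre-mRNA.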
3′ Polyadenylation — Stability and Export
pre-mRNA 3′ end + ~200 ATP → poly(A) tail (added by poly(A) polymerase)
The pre-mRNA is cleaved downstream of a conserved polyadenylation signal (5′-AAUAAA-3′), and poly(A) polymerase threads on a homopolymeric run of roughly 200 adenine residues to the freshly exposed 3′ end (the precise length varies by transcript and species, but falls in the 150–250 range). The poly(A) tail is not encoded in the DNA — it is added post-transcriptionally. It protects the mRNA from 3′ exonuclease degradation, facilitates export from the nucleus through nuclear pores, and enhances translational efficiency. Histones are a notable exception: their mRNAs lack a poly(A) tail but instead form a stem-loop structure for stability.
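The cleavage-and-tailing step can be sketched as a string operation. The text says only that cleavage happens downstream of AAUAAA, so the `cleavage_offset` default of 20 nucleotides below is an assumption chosen for illustration, as are the function name and the toy transcript.

```python
POLY_A_SIGNAL = "AAUAAA"   # polyadenylation signal from the text


def polyadenylate(pre_mrna, cleavage_offset=20, tail_length=200):
    """Cleave downstream of the first AAUAAA and append a poly(A) tail.

    cleavage_offset is a hypothetical distance (in nucleotides) between the
    end of the signal and the cut site; tail_length follows the ~200 in the text.
    """
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no AAUAAA signal found")
    cut = min(signal_start + len(POLY_A_SIGNAL) + cleavage_offset, len(pre_mrna))
    return pre_mrna[:cut] + "A" * tail_length


pre = "AUGGCC" + POLY_A_SIGNAL + "C" * 30 + "GGGG"   # toy pre-mRNA 3' region
tailed = polyadenylate(pre)
print(len(tailed))   # 232: 12 + 20 retained nucleotides + 200 A residues
```

The tail comes from the `"A" * tail_length` term, not from the input sequence, which mirrors the point that the poly(A) tail is not encoded in the DNA.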
RNA splicing is catalysed by the spliceosome, which contains small nuclear ribonucleoprotein particles (snRNPs, or “snurps”). In systemic lupus erythematosus (SLE), patients produce autoantibodies against their own nuclear components, including snRNPs. Anti-Smith (anti-Sm) antibodies, directed at core snRNP proteins, are highly specific for SLE and feature in the American College of Rheumatology classification criteria. They are best regarded as a diagnostic marker: their presence shows that the machinery of gene expression can itself become an autoimmune target.
“Cap, Clip, Clip” — or “Cap the 5′, Clip the introns, Clip and poly-A the 3′”
Cap: 7-methylguanosine added to the 5′ end. Clip (introns): RNA splicing by the spliceosome removes introns, joins exons. Clip & Poly-A: 3′ end cleavage downstream of AAUAAA signal + poly(A) tail addition. All three must occur before export to the cytoplasm.
4 The Genetic Code: Translating Nucleotides into Amino Acids
Before examining translation mechanistically, you need a firm grasp of the genetic code — the dictionary that maps nucleotide triplets to amino acids. Each three-nucleotide word is a codon, always written 5′→3′. With four bases available, there are 4³ = 64 possible codons. These encode only 20 standard amino acids plus stop signals, which immediately tells you the code must be degenerate — most amino acids are specified by more than one codon.
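The 4³ = 64 count is easy to confirm by enumeration. The short sketch below builds every triplet from the four RNA bases and separates the three stop codons; the variable names are illustrative only.

```python
from itertools import product

BASES = "UCAG"

# Every ordered triplet of the four bases: 4 * 4 * 4 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

STOP_CODONS = {"UAA", "UAG", "UGA"}
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons), len(sense_codons))   # 64 61
```

With 61 sense codons for 20 amino acids, at least some amino acids must have more than one codon, which is the degeneracy argument made above.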
| Property | Definition | Example / Notes |
|---|---|---|
| Specific (Unambiguous) | Each codon always specifies the same amino acid | AUG always = Met; never anything else |
| Degenerate (Redundant) | Each amino acid can be encoded by several different codons | Arginine has 6 synonymous codons; only Met & Trp are single-codon amino acids |
| Nonoverlapping & Commaless | Read continuously in triplets from a fixed start; no punctuation between codons | AUGUCGAAU = AUG/UCG/AAU (Met-Ser-Asn) |
| Universal | Code is virtually identical across all life forms | Exceptions: mitochondria (UGA = Trp, not Stop); some protozoa |
| Start codon | AUG — codes for Met (eukaryotes) or fMet (prokaryotes) | Also sets the reading frame for the entire message |
| Stop codons | UAA, UAG, UGA — no cognate tRNA; recognised by release factors | “U Are Away”, “U Are Gone”, “U Go Away” |
Degeneracy is not random — synonymous codons for the same amino acid often differ only in the third (3′) “wobble” position. The wobble hypothesis explains why there need not be 61 separate tRNA species: the first base of the anticodon (which pairs with the third base of the codon) has positional flexibility and can form non-Watson-Crick base pairs. A single tRNA can therefore service several codons at once, as long as the first two positions match — a highly economical solution that evolution has conserved across all life forms.
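Degeneracy and the third-position pattern can be checked against the standard codon table. The sketch below builds the table from the conventional compact string (bases ordered U, C, A, G; `*` marks a stop) and then counts synonymous codons. It covers the standard nuclear code only, not the mitochondrial exceptions, and the variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" = stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group codons by the amino acid they encode.
synonyms = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    synonyms[aa].append(codon)

single_codon = sorted(aa for aa, cs in synonyms.items() if len(cs) == 1)

# "Boxes": first-two-base prefixes where the third base makes no difference.
full_boxes = [a + b for a in BASES for b in BASES
              if len({CODON_TABLE[a + b + x] for x in BASES}) == 1]

print(len(synonyms["R"]), single_codon, len(full_boxes))   # 6 ['M', 'W'] 8
```

The eight fully degenerate boxes are the cases where any third base gives the same amino acid, which is where wobble pairing lets a single tRNA read several codons.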
“Silent Sam Makes No Sound, Mistakes Make Patients Suffer, Nonsense Never Finishes”
Silent: altered codon → same amino acid (due to degeneracy; usually 3rd base change). Missense: altered codon → wrong amino acid (e.g. sickle cell disease: GAG→GUG, Glu→Val). Nonsense: altered codon → premature stop codon → truncated, usually non-functional protein.
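The three categories above can be expressed as a tiny classifier over the standard codon table. This is a sketch for single-codon substitutions only; it assumes the reference codon is a sense codon, and the function name is made up for this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" = stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change as 'silent', 'missense' or 'nonsense'.

    Assumes ref_codon encodes an amino acid (is not itself a stop codon).
    """
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


print(classify_substitution("GAG", "GUG"))   # missense (Glu -> Val, as in HbS)
print(classify_substitution("GAG", "GAA"))   # silent   (Glu -> Glu)
print(classify_substitution("GAG", "UAG"))   # nonsense (Glu -> stop)
```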
5 Step 3 — Translation: Building the Protein
Translation is the decoding of the mRNA nucleotide sequence into a polypeptide chain. It requires three major molecular players — the ribosome (the factory), transfer RNAs (tRNAs) (the adaptors), and mRNA (the instruction tape) — plus a suite of protein factors and GTP as the energy currency.
The Ribosome: Ribosomes are large RNA–protein complexes consisting of a large and a small subunit. In prokaryotes the complete ribosome is 70S (30S small + 50S large); in eukaryotes it is 80S (40S small + 60S large). Each assembled ribosome has three tRNA-binding sites spanning both subunits: the A site (accepts incoming aminoacyl-tRNA), the P site (holds the growing peptidyl-tRNA), and the E site (the exit site for discharged tRNA).
Transfer RNA (tRNA) — the Adaptor Molecule: Each tRNA has two critical features: an anticodon loop that reads the mRNA codon by antiparallel complementary base pairing, and a 3′-CCA terminus that carries the cognate amino acid. Before a tRNA can participate in translation, its amino acid must be covalently attached by an aminoacyl-tRNA synthetase (one enzyme per amino acid, 20 in total). This “charging” reaction consumes the equivalent of two ATP molecules (Amino acid + ATP + tRNA → aminoacyl-tRNA + AMP + PPi), and the PPi is immediately hydrolysed to drive the reaction to completion. This step is where the fidelity of the genetic code is ultimately enforced — if the wrong amino acid is attached to a tRNA, that error will be propagated into the protein.
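Antiparallel pairing means the anticodon is the reverse complement of the codon when both are written 5′→3′. The sketch below shows this for strict Watson-Crick pairing only; it ignores wobble and modified bases, and the function name is hypothetical.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the Watson-Crick anticodon (5'->3') for a codon written 5'->3'."""
    # Reverse first (antiparallel strands), then complement each base.
    return "".join(RNA_PAIR[base] for base in reversed(codon.upper()))


print(anticodon_for("AUG"))   # CAU
print(anticodon_for("GAG"))   # CUC
```

Reading the result 5′→3′, its first base pairs with the third base of the codon, which is the position where wobble flexibility applies.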
Initiation: Assembling the Ribosome on the mRNA (IFs required)
mRNA + small subunit + initiator tRNA → initiation complex → 70S / 80S ribosome
In prokaryotes, the small (30S) subunit, guided by initiation factors (IF-1, IF-2, IF-3), docks onto the mRNA via the Shine-Dalgarno sequence — a short purine-rich stretch sitting roughly 5–10 nucleotides upstream of the AUG start codon. This sequence is complementary to the 3′ tail of the 16S rRNA, anchoring the ribosome in precisely the right position so that AUG falls in the P site, ready to accept the initiator tRNA carrying N-formylmethionine (fMet). The 50S subunit then joins, completing the 70S initiation complex. In eukaryotes, the 40S subunit (with initiation factors including eIF4E) recognises the 5′ m⁷G cap and scans in the 3′ direction until it reaches the AUG start codon in a favourable Kozak consensus sequence context. The initiating amino acid is plain methionine (not formylmethionine). The 60S subunit then joins to form the 80S ribosome.
Elongation: Amino Acid Addition Cycle (EF-Tu / EF-G in prokaryotes)
A site aminoacyl-tRNA + P site peptidyl-tRNA → P site peptidyl-tRNA (extended by 1) + E site discharged tRNA
The elongation cycle repeats for every amino acid added and has three sub-steps. (1) Aminoacyl-tRNA binding: The next aminoacyl-tRNA enters the A site as a ternary complex with EF-Tu·GTP. Correct codon–anticodon base pairing triggers GTP hydrolysis and EF-Tu·GDP release, confirming the correct tRNA. (2) Peptide bond formation: Peptidyl transferase activity (intrinsic to the 23S rRNA in prokaryotes / 28S rRNA in eukaryotes — a ribozyme!) catalyses transfer of the growing polypeptide from the P-site tRNA to the amino group of the A-site amino acid, extending the chain by one residue. (3) Translocation: EF-G·GTP (EF-2·GTP in eukaryotes) drives the ribosome three nucleotides in the 3′ direction: the newly deacylated tRNA shifts P→E, the peptidyl-tRNA shifts A→P, and the next codon enters the A site. Each elongation cycle consumes 4 high-energy bonds (2 ATP for tRNA charging + 2 GTP for EF-Tu and translocation).
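The energy bookkeeping above is simple arithmetic, sketched here for a chain of arbitrary length. It applies the figure of 4 high-energy bonds to every residue and ignores the extra GTP used in initiation and termination, so the total is an approximation; the function and constant names are illustrative.

```python
CHARGING_BONDS = 2      # ATP -> AMP + PPi, then PPi hydrolysis (tRNA charging)
EF_TU_GTP = 1           # delivery of aminoacyl-tRNA to the A site
TRANSLOCATION_GTP = 1   # EF-G (EF-2 in eukaryotes) driven translocation


def elongation_cost(n_residues):
    """Approximate high-energy bonds spent adding n_residues amino acids."""
    atp_equivalents = CHARGING_BONDS * n_residues
    gtp = (EF_TU_GTP + TRANSLOCATION_GTP) * n_residues
    return {
        "atp_equivalents": atp_equivalents,
        "gtp": gtp,
        "total": atp_equivalents + gtp,
    }


print(elongation_cost(100))
# {'atp_equivalents': 200, 'gtp': 200, 'total': 400}
```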
Termination: Stop Codon Recognition & Polypeptide Release (release factors)
Stop codon in A site + RF → polypeptide release + ribosome dissociation
When a stop codon (UAA, UAG, or UGA) enters the A site, no aminoacyl-tRNA can bind — instead, release factors (RF-1 recognises UAA/UAG; RF-2 recognises UAA/UGA in prokaryotes; eRF-1 recognises all three stops in eukaryotes) bind and trigger peptidyl transferase to transfer the polypeptide to a water molecule, hydrolysing it from the tRNA. The completed polypeptide, the tRNA, and the mRNA are all released, and the ribosomal subunits dissociate, ready for another round. In active cells, many ribosomes translate a single mRNA simultaneously, forming a polysome — maximising the rate of protein production from a single mRNA molecule.
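Putting initiation, elongation and termination together, translation can be sketched as reading triplets from the first AUG until a stop codon appears. This toy version starts at the first AUG it finds (it does not model Shine-Dalgarno or Kozak context), uses the standard code, and returns one-letter amino acid symbols; the function name and test sequence are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" = stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                       # no start codon, no protein
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":           # stop codon: release the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGUCGAAUUAAGC"))   # MSN (Met-Ser-Asn, then UAA stop)
```

The example reuses the AUG/UCG/AAU reading frame from the genetic code table, with two leading bases and a UAA stop added to show frame setting and termination.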
6 Post-Translational Modifications and Protein Targeting
A newly synthesised polypeptide chain emerging from the ribosome is rarely the finished article. Several modifications are common and clinically important.
Proteins destined for export from the cell, or for membrane insertion, are synthesised on ribosomes associated with the rough endoplasmic reticulum (RER). They carry an N-terminal hydrophobic signal sequence that is recognised by the signal recognition particle (SRP), which docks the ribosome to the RER membrane. The signal sequence is then cleaved by signal peptidase in the ER lumen, usually while the rest of the chain is still being synthesised. Proteins for the cytosol, nucleus, mitochondria, and peroxisomes are synthesised on free ribosomes and are directed to their destinations by their own targeting sequences.
α₁-Antitrypsin (A1AT) is a serine protease inhibitor normally secreted by hepatocytes to protect lung tissue from neutrophil elastase. In the most common disease-causing variant (PiZZ genotype), a single amino acid substitution (Glu342Lys) causes the protein to misfold and aggregate in the ER of hepatocytes rather than being secreted. The liver accumulates toxic aggregates (causing cirrhosis), while the lungs are left unprotected from elastase — producing panacinar emphysema. This disease illustrates how a single missense mutation can derail the entire secretory pathway at the post-translational level.
7 Clinically Important Inhibitors of Transcription and Translation
Many antibiotics and toxins act by targeting the molecular machinery of gene expression. Understanding their mechanisms is testable and has direct clinical relevance for antibiotic choice and toxicology.
| Agent | Target | Mechanism | Clinical Note |
|---|---|---|---|
| α-Amanitin | RNA Pol II (eukaryote) | Tightly binds and blocks mRNA synthesis | Toxin of Amanita phalloides (death cap mushroom) — causes fulminant hepatic failure |
| Rifampicin | Prokaryotic RNA Pol (β subunit) | Blocks initiation of RNA synthesis | First-line anti-TB drug; does not affect eukaryotic RNA pol |
| Tetracyclines | 30S ribosomal subunit | Block aminoacyl-tRNA from binding the A site | Broad-spectrum; useful for chlamydia, mycoplasma, Lyme disease |
| Chloramphenicol | 50S ribosomal subunit (peptidyl transferase) | Inhibits peptide bond formation | Risk of aplastic anaemia limits use; active against 70S only |
| Erythromycin / Clindamycin | 50S ribosomal subunit | Block translocation | Macrolides useful for atypical pneumonias; clindamycin for anaerobes |
| Aminoglycosides | 30S ribosomal subunit (16S rRNA) | Cause misreading of codons → mistranslation | Gentamicin, tobramycin; nephrotoxic and ototoxic |
| Diphtheria toxin | EF-2 (eukaryotic elongation factor) | ADP-ribosylation inactivates EF-2 → elongation halts | Single molecule of toxin can kill a cell; vaccine-preventable |
| Puromycin | A site of ribosome (both 70S & 80S) | Mimics aminoacyl-tRNA; incorporated into chain, causing premature release | Not clinically used (too toxic); invaluable research tool |
| AZT / NRTIs | HIV reverse transcriptase | Chain terminator — lacks 3′-OH so no further extension of viral DNA | Cornerstone of antiretroviral therapy for HIV |
8 High-Yield Exam Summary
The flow: DNA → (Transcription) → mRNA → (Translation) → Protein. The exception is retroviruses (HIV): RNA → (Reverse Transcriptase) → DNA.
RNA polymerases: Prokaryotes = one RNA pol + σ factor. Eukaryotes: Pol I = rRNA (large); Pol II = mRNA (α-amanitin sensitive); Pol III = tRNA + 5S rRNA.
Promoter elements: Prokaryotes = −35 sequence + Pribnow box (−10). Eukaryotes = TATA box (−25), CAAT box (−75), GC box. TFIID binds TATA; TFIIH melts DNA + phosphorylates Pol II CTD.
mRNA processing (eukaryotes): 5′ m⁷G cap (protects + initiates translation) + splicing (spliceosome, GU-AG rule, lariat intermediate) + 3′ poly(A) tail (~200 A’s added after AAUAAA cleavage).
Genetic code: 64 codons total; 61 sense + 3 stop (UAA/UAG/UGA). Degenerate, unambiguous, nonoverlapping, virtually universal. Wobble occurs at the 3rd base of the codon (pairs with 1st base of anticodon).
Translation initiation: Prokaryotes = Shine-Dalgarno sequence binds 16S rRNA; fMet starts. Eukaryotes = cap-dependent scanning to AUG Kozak; Met starts.
Energy cost of translation: Each amino acid addition requires 4 high-energy bonds total — 2 ATP (for aminoacyl-tRNA synthetase) + 2 GTP (EF-Tu binding + translocation).
Key clinical links: A1AT deficiency (missense → RER misfolding → emphysema/cirrhosis); SLE (anti-snRNP antibodies = anti-Smith); Death cap mushroom (α-amanitin → Pol II inhibition → liver failure); HIV (NRTIs target reverse transcriptase); Warfarin (blocks Vit K-dependent γ-carboxylation of clotting factors).
9 Mnemonic Summary Wall
“I Make Really Good Proteins”
I (Pol I): rRNA (28S, 18S, 5.8S). II (Pol II): mRNA + most ncRNAs. Inhibited by α-amanitin. III (Pol III): tRNA, 5S rRNA, U6 snRNA.
“Cap, Clip, Clip” — Cap the 5′, Clip the introns, Clip & Poly-A the 3′
Cap: m⁷G 5′ cap added co-transcriptionally. Clip (introns): Spliceosome splices out introns (GU…AG rule; lariat intermediate). Clip & Poly-A: AAUAAA cleavage + ~200 A residues added.
“Silent Sam Makes No Sound, Mistakes Make Patients Suffer, Nonsense Never Finishes”
Silent: same amino acid (degeneracy; usually 3rd base). Missense: wrong amino acid (e.g. HbS: Glu→Val). Nonsense: premature stop codon → truncated protein.
“U Are Away, U Are Gone, U Go Away”
UAA: “U Are Away.” UAG: “U Are Gone.” UGA: “U Go Away.” All three recognised by release factors; none have a cognate aminoacyl-tRNA. Exception: UGA = Trp in mitochondria.
