What Is Biochemistry? A Medical Student’s Guide

Biochemistry is the molecular language of life — it explains why a single enzyme defect can kill a newborn, how a statin lowers your patient’s cholesterol, and why fasting shifts your body from glucose to ketone metabolism. Before you can master any of those pathways, you need a clear map of what biochemistry is and how its major themes connect.

This post gives you that map: a structured overview of biochemistry’s scope, the four classes of biomolecules, the logic of metabolism, and the clinical mindset you need to carry into every topic that follows.

🧠 Master Mnemonic — The Four Biomolecule Classes
Pretty Clever Life Now
P = Proteins: enzymes, structural, transport, immune
C = Carbohydrates: fuel, signalling, glycoproteins
L = Lipids: membranes, hormones, energy storage
N = Nucleic acids: DNA (genetic blueprint) & RNA (executor)

1 What Is Biochemistry?

Biochemistry is the study of the chemical processes that occur within — and sustain — living organisms. Rather than treating biology and chemistry as separate disciplines, biochemistry asks: what specific molecules are present in a cell, how do they interact, and how do those interactions produce the phenomenon we call life?

Every organism on Earth — from a bacterium to a human — is built from the same fundamental chemical toolkit. This unity is not a coincidence. As Hames & Hooper note, “all organisms on Earth evolved from a common ancestor,” and in the prebiotic era simple organic molecules spontaneously polymerised into macromolecules, eventually enclosing themselves within phospholipid membranes to form the first cell. The conserved biochemistry across all life is the molecular fossil of that shared ancestry — which is precisely why a yeast enzyme can teach you about your own metabolism.

For medical students, biochemistry is not abstract chemistry. It is the mechanistic foundation beneath every disease, drug, and diagnostic test. Phenylketonuria is an enzyme deficiency. Diabetes is dysregulated fuel metabolism. Statins are enzyme inhibitors. Understanding the mechanism means you understand the disease — and its treatment — in a way that no amount of memorisation can match.

2 The Four Classes of Biomolecules

Living systems are built from four major classes of macromolecules. Each has a defined monomer–polymer relationship, a characteristic set of chemical bonds, and a repertoire of biological functions. Learning these four categories deeply is the single most important investment you can make before diving into individual pathways.

2.1 Proteins

Proteins are the most abundant and functionally diverse molecules in living systems. They are linear polymers of amino acids linked by peptide bonds: covalent amide bonds between the α-carboxyl group of one amino acid and the α-amino group of the next. Although more than 300 amino acids exist in nature, only 20 standard amino acids are encoded by DNA and incorporated into proteins. Each carries a distinctive side chain (R-group) that determines its chemical character: charged, polar, nonpolar, or aromatic.

Protein structure is hierarchical. The primary structure (amino acid sequence) is genetically determined and dictates everything above it. Secondary structure (α-helices, β-sheets) arises from hydrogen bonding along the backbone. Tertiary structure is the full three-dimensional fold, stabilised by hydrophobic interactions, electrostatic forces, hydrogen bonds, van der Waals forces, and (where present) disulfide bonds. Quaternary structure describes multi-subunit assemblies such as haemoglobin.

Enzymes: Biological catalysts that lower activation energy without being consumed. Each has an active site with geometric and chemical complementarity to its substrate.
Transport Proteins: Haemoglobin carries O₂ and CO₂; albumin shuttles fatty acids, bilirubin, and drugs through the blood.
Structural Proteins: Collagen forms the scaffolding for bone and connective tissue; actin and myosin generate contractile force in muscle.
Immunoglobulins: Antibodies are Y-shaped proteins with variable regions that bind specific antigens with high affinity and specificity.
Signalling Proteins: Receptors, G-proteins, kinases, and polypeptide hormones (e.g. insulin) transduce extracellular signals into intracellular responses.
Regulatory Proteins: Transcription factors and repressors control gene expression by binding specific DNA sequences or other regulatory proteins.

⚕ Clinical Pearl — Protein Misfolding Disease

The primary structure of a protein dictates its fold. When this fold is disrupted — by mutation, oxidative stress, or chaperone failure — misfolded proteins can aggregate into insoluble fibrils. Alzheimer’s disease involves aggregation of amyloid-β peptide into plaques. Sickle cell disease results from a single amino acid substitution (Glu→Val at position 6 of the β-globin chain), which changes the surface chemistry of haemoglobin just enough to cause polymerisation under low-oxygen conditions. One amino acid. Profound consequences.

2.2 Carbohydrates

Carbohydrates serve as the primary fuel currency for most cells, as structural components (e.g. glycosaminoglycans in connective tissue), and as information-carrying molecules on cell surfaces (glycoproteins, glycolipids). The monomer is the monosaccharide — a polyhydroxy aldehyde or ketone. Glucose is the archetypal monosaccharide: a six-carbon aldose (C₆H₁₂O₆) that exists predominantly as a cyclic pyranose ring in solution.

Monosaccharides link through glycosidic bonds (covalent bonds between the anomeric carbon of one sugar and a hydroxyl group of another) to form disaccharides, oligosaccharides, and polysaccharides. The configuration at the anomeric carbon (α or β) determines whether the bond is digestible: humans have α-amylase to cleave α-glycosidic bonds in starch, but lack the enzyme to cleave the β(1→4) bonds of cellulose. Lactose contains a β(1→4) glycosidic bond between galactose and glucose, which is cleaved by the brush-border enzyme lactase; lactase deficiency is the most common carbohydrate digestion disorder worldwide.

Sugars can also be attached to proteins (glycoproteins) or lipids (glycolipids). In glycoproteins the linkage is either an N-glycosidic bond (to asparagine) or an O-glycosidic bond (to serine/threonine). These sugar coats are critical for cell recognition, immune function, and the targeting of lysosomal enzymes.

2.3 Lipids

Lipids are a chemically diverse class defined by their insolubility in water and solubility in non-polar solvents. They serve three broad roles: energy storage (triacylglycerols in adipose tissue), structural components of all biological membranes (phospholipids, cholesterol), and signalling molecules (steroid hormones, prostaglandins, phosphoinositides).

The basic building block of most lipids is the fatty acid: a long hydrocarbon chain with a terminal carboxyl group. Saturated fatty acids have no double bonds; unsaturated fatty acids have one (monounsaturated, e.g. palmitoleate) or more (polyunsaturated, e.g. arachidonate). The degree of unsaturation determines membrane fluidity, melting point, and susceptibility to peroxidation. Cholesterol is a sterol: a rigid four-ring structure that modulates membrane fluidity and serves as the precursor for all steroid hormones, bile acids, and vitamin D.
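The saturation classes above can be expressed as a small lookup. The following is a minimal Python sketch, not a standard library or nomenclature tool: the `FattyAcid` class and its fields are hypothetical names introduced for illustration, and palmitate (16:0) is added here as a saturated example that the text itself does not mention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    """Hypothetical record type: name, chain length, number of C=C double bonds."""
    name: str
    carbons: int
    double_bonds: int

    @property
    def shorthand(self) -> str:
        # Conventional "carbons:double bonds" notation, e.g. 16:0
        return f"{self.carbons}:{self.double_bonds}"

    @property
    def saturation_class(self) -> str:
        # Classification follows the definitions given in the text
        if self.double_bonds == 0:
            return "saturated"
        if self.double_bonds == 1:
            return "monounsaturated"
        return "polyunsaturated"


EXAMPLES = [
    FattyAcid("palmitate", 16, 0),      # added example, not from the text
    FattyAcid("palmitoleate", 16, 1),
    FattyAcid("arachidonate", 20, 4),
]

for fatty_acid in EXAMPLES:
    print(f"{fatty_acid.name} ({fatty_acid.shorthand}): {fatty_acid.saturation_class}")
```

The sketch only counts double bonds; it does not capture their position or cis/trans geometry, both of which also affect membrane behaviour.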

⚕ Clinical Pearl — Lipids in Disease

Atherosclerosis begins when oxidised LDL (low-density lipoprotein) accumulates in arterial walls, triggering inflammation and plaque formation. Statins inhibit HMG-CoA reductase, the rate-limiting enzyme of cholesterol synthesis, which drives cells to upregulate LDL receptors and clear LDL from circulation. Understanding the pathway explains the drug. Familial hypercholesterolaemia is most often caused by loss-of-function mutations in the LDL receptor gene itself. Because statins work by upregulating that same receptor, lifestyle interventions are insufficient and statins are less effective as monotherapy.

2.4 Nucleic Acids

Nucleic acids are the information storage and transfer molecules of the cell. DNA (deoxyribonucleic acid) carries the genetic blueprint; RNA (ribonucleic acid) executes it. Both are polymers of nucleotides — each nucleotide consisting of a pentose sugar, a phosphate group, and a nitrogenous base — linked by 3′→5′ phosphodiester bonds.

DNA is a double helix stabilised by complementary base pairing (A–T via two hydrogen bonds; G–C via three) and base stacking. The base sequence is read using the genetic code: a triplet code in which each codon of three nucleotides specifies one amino acid (or a stop signal). RNA comes in several functional forms: mRNA carries the coding message from DNA to the ribosome; tRNA decodes the codons by bringing the matching amino acid; rRNA forms the catalytic core of the ribosome itself. This flow of information, DNA → RNA → Protein, is the Central Dogma of molecular biology.
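To make the triplet code concrete, here is a minimal Python sketch of transcription and translation applied to the sickle cell substitution described in the protein section. Several things are assumptions for illustration: the function names are hypothetical, the codon table is deliberately partial (a full table has 64 entries), and the DNA fragment is a short stretch modelled on the start of the β-globin coding sequence that should be checked against a reference sequence before being relied on. The sickle allele is represented as the well-known GAG → GTG change.

```python
# Partial codon table (mRNA codons). Only the codons needed below are listed.
CODON_TABLE = {
    "AUG": "Met", "GUG": "Val", "CAU": "His", "CUG": "Leu",
    "ACU": "Thr", "CCU": "Pro", "GAG": "Glu",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (same sequence, T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Translate from the first AUG, one codon at a time, until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]  # KeyError if the codon is not in the partial table
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


# Illustrative fragments (assumed, see lead-in); they differ by a single base.
NORMAL_DNA = "ATGGTGCATCTGACTCCTGAGGAG"
SICKLE_DNA = "ATGGTGCATCTGACTCCTGTGGAG"

normal_peptide = translate(transcribe(NORMAL_DNA))
sickle_peptide = translate(transcribe(SICKLE_DNA))

print("-".join(normal_peptide))  # Met-Val-His-Leu-Thr-Pro-Glu-Glu
print("-".join(sickle_peptide))  # Met-Val-His-Leu-Thr-Pro-Val-Glu
```

Index 0 of each list is the initiator methionine, which is removed from the mature chain, so index 6 corresponds to position 6 of mature β-globin: Glu in the normal fragment, Val in the sickle fragment.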

3 Metabolism: The Logic of Living Chemistry

Metabolism is “the sum of all the chemical changes occurring in a cell, a tissue, or the body.” Rather than a chaotic soup of reactions, metabolism is organised into pathways — sequential reaction series in which the product of one reaction is the substrate of the next. Pathways can be classified as catabolic (degradative) or anabolic (synthetic), and they are connected into an integrated network where the output of one pathway feeds into another.

The central currency of metabolism is ATP (adenosine triphosphate). ATP consists of adenosine plus three phosphate groups; hydrolysis of either of the two terminal phosphates has a standard free-energy change (ΔG°′) of approximately −7.3 kcal/mol, making ATP an effective carrier of chemical energy. The cell couples ATP hydrolysis to otherwise unfavourable (endergonic) reactions, including biosynthesis, active transport, and muscle contraction.
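Energy coupling is easiest to see with numbers. The sketch below adds standard free-energy changes for two reactions that share an intermediate. The −7.3 kcal/mol figure comes from the text; the +3.3 kcal/mol value for phosphorylating glucose with inorganic phosphate is a commonly quoted textbook figure brought in here as an assumption, and the function name is hypothetical.

```python
# Standard free-energy changes in kcal/mol
ATP_HYDROLYSIS = -7.3            # ATP + H2O -> ADP + Pi (value from the text)
GLUCOSE_PHOSPHORYLATION = +3.3   # glucose + Pi -> glucose 6-phosphate + H2O (assumed textbook value)


def coupled_delta_g(*steps: float) -> float:
    """Sum the standard free-energy changes of reactions that are coupled together."""
    return round(sum(steps), 1)


def is_favourable(delta_g: float) -> bool:
    """A negative free-energy change means the reaction can proceed as written."""
    return delta_g < 0


alone = GLUCOSE_PHOSPHORYLATION
coupled = coupled_delta_g(GLUCOSE_PHOSPHORYLATION, ATP_HYDROLYSIS)

print(f"Without ATP: {alone:+.1f} kcal/mol, favourable: {is_favourable(alone)}")
print(f"Coupled to ATP hydrolysis: {coupled:+.1f} kcal/mol, favourable: {is_favourable(coupled)}")
```

These are standard-state values. Actual free-energy changes in a cell depend on metabolite concentrations, so the sketch shows the principle of coupling rather than in vivo energetics.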

3.1 The Three Stages of Catabolism

Catabolism is a convergent process: a diverse array of large molecules funnels down into a small number of common intermediates, ultimately driving ATP synthesis. The logic unfolds in three stages.

Stage I — Hydrolysis of complex molecules

Proteins → Amino acids | Polysaccharides → Monosaccharides | Triacylglycerols → Fatty acids + Glycerol

Macromolecules are broken down into their monomeric building blocks, primarily in the digestive system and in lysosomes. Little ATP is generated at this stage. The purpose is to produce substrates small enough to enter cells and downstream pathways.

Stage II — Conversion to common intermediates

Amino acids, monosaccharides, fatty acids → Acetyl-CoA (+ pyruvate, OAA, α-KG…)

Diverse building blocks are further degraded to acetyl CoA and a few other simple molecules (pyruvate, oxaloacetate, α-ketoglutarate). Some ATP and NADH are generated, but the yield is modest compared with Stage III. This convergence is why “you can make fat from carbohydrate” — both funnel into acetyl CoA.

Stage III — Oxidation via TCA cycle + Oxidative Phosphorylation

Acetyl-CoA + 3 NAD⁺ + FAD + GDP (or ADP) + Pᵢ → 2 CO₂ + 3 NADH + FADH₂ + GTP (or ATP) + CoA

The TCA (tricarboxylic acid) cycle oxidises acetyl CoA to CO₂, harvesting electrons onto NADH and FADH₂. These electrons then flow down the electron transport chain, releasing energy that drives ATP synthase to produce the bulk of cellular ATP. This is by far the highest-yield stage, and it is why aerobic metabolism yields far more ATP per fuel molecule than anaerobic pathways.
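The Stage III yield can be estimated with simple arithmetic. This Python sketch counts the ATP equivalents obtained from one acetyl CoA, using the cofactor counts from the equation above. The conversion factors are assumptions: roughly 2.5 ATP per NADH and 1.5 per FADH₂ are widely used current estimates, while many older textbooks use 3 and 2. The function name is hypothetical.

```python
def atp_per_acetyl_coa(atp_per_nadh: float = 2.5, atp_per_fadh2: float = 1.5) -> float:
    """Estimate ATP equivalents from one turn of the TCA cycle plus oxidative phosphorylation."""
    nadh = 3             # NADH produced per acetyl CoA
    fadh2 = 1            # FADH2 produced per acetyl CoA
    substrate_level = 1  # one GTP (or ATP) made directly in the cycle
    return nadh * atp_per_nadh + fadh2 * atp_per_fadh2 + substrate_level


print(atp_per_acetyl_coa())      # 10.0 with the 2.5 / 1.5 estimates
print(atp_per_acetyl_coa(3, 2))  # 12 with the older 3 / 2 convention
```

Either way, nearly all of the yield comes from reoxidising NADH and FADH₂ rather than from the cycle itself, which is the point of separating the TCA cycle from oxidative phosphorylation when you reason about energy.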

3.2 Anabolism — Building Up

Anabolism is the mirror image of catabolism — a divergent process in which a few biosynthetic precursors are built up into a wide variety of complex molecules. Anabolic reactions are endergonic (energy-consuming) and are driven by ATP hydrolysis. Many also require reducing power in the form of NADPH, which is generated primarily by the pentose phosphate pathway. The key principle is metabolic reciprocity: catabolic and anabolic pathways share intermediates but use different enzymes at the committed steps, allowing them to be independently regulated.

💡 Mnemonic — Catabolism vs Anabolism

“CATabolism = CATastrophe (breaking down). ANAbolism = ANAbolic steroids (building up).”

Catabolism is oxidative, convergent, produces ATP and NADH. Anabolism is reductive, divergent, consumes ATP and NADPH. The currency differs: NADH feeds the ETC for ATP; NADPH drives biosynthetic reductions.

4 Metabolic Regulation — The Master Control System

A cell cannot run all pathways at full throttle simultaneously — that would be thermodynamically wasteful and biologically lethal. Metabolic flux is precisely regulated at multiple levels, from milliseconds to days.

Allosteric Regulation

Small molecules bind to regulatory (allosteric) sites on enzymes — distinct from the active site — and change the enzyme’s conformation and activity. This is the fastest regulatory mechanism (milliseconds). Classic example: ATP inhibits phosphofructokinase-1 (PFK-1) in glycolysis; AMP activates it. High ATP signals “enough energy — slow down”; high AMP signals “energy crisis — speed up.”

Covalent Modification (Phosphorylation)

Protein kinases add phosphate groups to serine, threonine, or tyrosine residues; phosphatases remove them. This reversible switch can activate or inhibit an enzyme within seconds. Glycogen phosphorylase is activated by phosphorylation (triggered by glucagon/epinephrine signalling); glycogen synthase is inactivated by the same phosphorylation cascade — ensuring glycogen is broken down, not synthesised, during stress.

Hormonal Regulation

Insulin signals the fed state: it activates glucose uptake, glycogen synthesis, lipid synthesis, and protein synthesis. Glucagon and epinephrine signal fasting/stress: they activate glycogen breakdown, gluconeogenesis, and lipolysis. These hormonal signals act through receptor cascades that ultimately modify key regulatory enzymes — coordinating metabolism across multiple tissues simultaneously.

Transcriptional Regulation

Over hours to days, cells adjust the amount of enzyme present by controlling gene expression. Steroid hormones, for example, bind intracellular receptors, and the hormone-receptor complex then binds DNA to alter transcription. SREBP (sterol regulatory element binding protein) increases transcription of cholesterol synthesis enzymes when cellular cholesterol is low. This is the slowest but most sustained regulatory mechanism.
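The hormonal and covalent-modification tiers described above can be combined into a small decision model. The following Python sketch is a deliberate simplification under stated assumptions: it treats the fed and fasting/stress states as a binary switch, treats each enzyme as simply "active" or "inactive", and uses hypothetical names (`MetabolicState`, `glycogen_enzymes`, `ACTIVE_PATHWAYS`). The pathway lists are taken from the hormonal regulation paragraph.

```python
from enum import Enum


class MetabolicState(Enum):
    FED = "fed (insulin dominant)"
    FASTING = "fasting/stress (glucagon or epinephrine dominant)"


# Pathways switched on in each state, as listed in the text
ACTIVE_PATHWAYS = {
    MetabolicState.FED: [
        "glucose uptake", "glycogen synthesis", "lipid synthesis", "protein synthesis",
    ],
    MetabolicState.FASTING: [
        "glycogen breakdown", "gluconeogenesis", "lipolysis",
    ],
}


def glycogen_enzymes(state: MetabolicState) -> dict[str, str]:
    """Reciprocal control: one phosphorylation cascade has opposite effects on the two enzymes."""
    # Simplification: the glucagon/epinephrine cascade phosphorylates both enzymes
    phosphorylated = state is MetabolicState.FASTING
    return {
        "glycogen phosphorylase": "active" if phosphorylated else "inactive",
        "glycogen synthase": "inactive" if phosphorylated else "active",
    }


for state in MetabolicState:
    print(state.value)
    print("  pathways:", ", ".join(ACTIVE_PATHWAYS[state]))
    for enzyme, status in glycogen_enzymes(state).items():
        print(f"  {enzyme}: {status}")
```

Real enzymes are graded rather than binary, and allosteric effectors can override the phosphorylation state, so treat the output as a first-pass prediction for exam-style reasoning only.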

💡 Mnemonic — Levels of Metabolic Regulation

“All Clever Humans Think”

Allosteric (fastest: milliseconds to seconds). Covalent modification (phosphorylation: seconds to minutes). Hormonal signalling (minutes to hours). Transcriptional control (hours to days, slowest).

5 The Central Dogma and Gene Expression

The Central Dogma describes the directional flow of genetic information: DNA → RNA → Protein. DNA is transcribed into messenger RNA (mRNA) by RNA polymerase; mRNA is then translated into protein at the ribosome. This one-way flow is not merely a textbook convention — it is the fundamental logic of how genotype produces phenotype.

In eukaryotes, there is an important intermediate step: the pre-mRNA is processed before it leaves the nucleus. Introns (non-coding sequences) are spliced out; exons (coding sequences) are joined together; a 5′ cap and poly-A tail are added. The resulting mature mRNA is exported to the cytoplasm for translation. Alternative splicing — in which different combinations of exons are joined — allows a single gene to encode multiple related proteins, enormously expanding the proteome beyond the ~20,000 protein-coding genes in the human genome.
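Splicing and alternative splicing can be sketched as list operations. In the Python example below, a pre-mRNA is modelled as an ordered list of exon and intron segments. The sequences are made up for illustration (the introns simply start with GU and end with AG, as real introns typically do), the function and variable names are hypothetical, and capping and polyadenylation are not modelled.

```python
# Hypothetical pre-mRNA: (segment type, sequence) in 5'-to-3' order
PRE_MRNA = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUUUCAG"),
    ("exon", "GAACCG"),
    ("intron", "GUGAGUCUUAG"),
    ("exon", "UUCUAA"),
]


def splice(pre_mrna, skip_exons=frozenset()):
    """Remove introns and join exons in order, optionally skipping exons by 1-based number."""
    mature = []
    exon_number = 0
    for kind, sequence in pre_mrna:
        if kind != "exon":
            continue  # introns are never included in the mature mRNA
        exon_number += 1
        if exon_number in skip_exons:
            continue  # alternative splicing: this exon is left out
        mature.append(sequence)
    return "".join(mature)


constitutive = splice(PRE_MRNA)
exon2_skipped = splice(PRE_MRNA, skip_exons={2})

print(constitutive)   # AUGGCUGAACCGUUCUAA
print(exon2_skipped)  # AUGGCUUUCUAA
```

Two different mature mRNAs come from one gene here, which is the mechanism by which the proteome can exceed the number of protein-coding genes.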

💡 Mnemonic — mRNA Processing Steps

“5-CAP, SPLICE, POLY-A, EXIT”

5′ cap (7-methylguanosine) added first — protects from degradation and aids ribosome binding. Splice — introns removed, exons ligated by the spliceosome. Poly-A tail (50–250 adenylate residues) added to 3′ end — stabilises mRNA. Exit — mature mRNA exported through nuclear pore complex to cytoplasm.

6 Why Biochemistry Matters for USMLE / PLAB

Biochemistry questions on the USMLE Step 1 are almost never pure recall. They test mechanistic reasoning: given an enzyme deficiency, predict the metabolite that accumulates upstream. Given a hormonal state (fasting, fed, stressed), predict which pathways are active. Given a drug’s mechanism, predict its metabolic consequence or side-effect profile.
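The "enzyme block" style of reasoning can be written down as a tiny algorithm. This Python sketch models a linear pathway as a list of (substrate, enzyme, product) steps and predicts what accumulates and what runs short when one enzyme is deficient. The pathway shown, phenylalanine to dopamine, is used as an example because phenylketonuria is mentioned earlier; the function name and the data layout are hypothetical, and the model ignores alternative routes and dietary supply.

```python
# Linear pathway as (substrate, enzyme, product) steps
PATHWAY = [
    ("phenylalanine", "phenylalanine hydroxylase", "tyrosine"),
    ("tyrosine", "tyrosine hydroxylase", "L-DOPA"),
    ("L-DOPA", "aromatic L-amino acid decarboxylase", "dopamine"),
]


def predict_block(pathway, deficient_enzyme):
    """Return the substrate that accumulates and the products expected to fall."""
    for index, (substrate, enzyme, _product) in enumerate(pathway):
        if enzyme == deficient_enzyme:
            downstream = [step[2] for step in pathway[index:]]
            return {"accumulates": substrate, "depleted": downstream}
    raise ValueError(f"{deficient_enzyme!r} is not an enzyme in this pathway")


pku = predict_block(PATHWAY, "phenylalanine hydroxylase")
print("Accumulates:", pku["accumulates"])
print("Depleted:", ", ".join(pku["depleted"]))
```

For phenylketonuria the sketch predicts phenylalanine accumulation upstream of the block and reduced products downstream. In a real patient, downstream metabolites can still be supplied from other sources such as dietary tyrosine, which is why the prediction is a starting point rather than a full answer.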

The framework that makes all of this tractable is exactly what this post introduces: know your four biomolecule classes, understand catabolic convergence and anabolic divergence, recognise the four tiers of regulation, and keep the Central Dogma as your north star. Every subsequent topic — glycolysis, the TCA cycle, fatty acid oxidation, urea cycle, nucleotide synthesis, and beyond — is an elaboration of these core principles.

Biomolecule | Monomer | Key Bond | Main Roles | High-Yield Pathology
Proteins | Amino acids (20 standard) | Peptide bond | Enzymes, structural, transport, immune, signalling | PKU, sickle cell, Alzheimer’s (misfolding)
Carbohydrates | Monosaccharides | Glycosidic bond (α or β) | Fuel, glycoprotein/glycolipid coats, structural (GAGs) | Lactase deficiency, galactosaemia, lysosomal storage diseases
Lipids | Fatty acids / glycerol / cholesterol | Ester bond (triacylglycerols); none for cholesterol (not a polymer) | Energy storage, membrane structure, hormones | Familial hypercholesterolaemia, atherosclerosis, fatty liver
Nucleic Acids | Nucleotides | 3′→5′ phosphodiester bond | Genetic information (DNA), gene expression (RNA) | Xeroderma pigmentosum (DNA repair defects), HGPRT deficiency (Lesch-Nyhan)

7 High-Yield Exam Summary

📝 Exam High-Yield

The four biomolecule classes are proteins, carbohydrates, lipids, and nucleic acids. Each has a monomer linked by a characteristic bond.

Only 20 standard amino acids are encoded by DNA; only L-amino acids are found in proteins.

Catabolism is convergent (many → acetyl CoA); anabolism is divergent (few precursors → many products). They share intermediates but differ at committed enzymatic steps.

The three stages of catabolism: (I) hydrolysis to monomers → (II) conversion to acetyl CoA → (III) TCA cycle + oxidative phosphorylation = bulk ATP.

ATP hydrolysis (ΔG°′ ≈ −7.3 kcal/mol per terminal phosphate) drives endergonic reactions. NADH fuels the electron transport chain; NADPH powers biosynthetic reductions.

Regulatory hierarchy, fastest to slowest: allosteric → covalent modification → hormonal → transcriptional.

The Central Dogma: DNA → RNA → Protein. In eukaryotes, mRNA processing (5′ cap, splicing, poly-A tail) occurs before translation.

α-glycosidic bonds are hydrolysed by amylase (starch); β-glycosidic bonds in cellulose are not digestible by humans. Lactose has a β(1→4) bond — relevant in lactase deficiency.

8 Mnemonic Summary Wall

💡 Mnemonic — The Four Biomolecule Classes

“Pretty Clever Life Now”

Proteins. Carbohydrates. Lipids. Nucleic acids.

💡 Mnemonic — Catabolism vs Anabolism

“CATabolism = CATastrophe (breaking down). ANAbolism = ANAbolic steroids (building up).”

Catabolism: oxidative, convergent, produces ATP + NADH. Anabolism: reductive, divergent, consumes ATP + NADPH.

💡 Mnemonic — Levels of Metabolic Regulation

“All Clever Humans Think”

Allosteric (milliseconds–seconds). Covalent modification / phosphorylation (seconds–minutes). Hormonal (minutes–hours). Transcriptional (hours–days).

💡 Mnemonic — mRNA Processing

“5-CAP, SPLICE, POLY-A, EXIT”

5′ cap → Splicing (introns out) → Poly-A tail → Exit through nuclear pore.


References

Harvey, R. A., & Ferrier, D. R. (2011). Lippincott’s illustrated reviews: Biochemistry (5th ed.). Lippincott Williams & Wilkins. — Ch. 1 (Amino Acids, pp. 1–10), Ch. 2 (Protein Structure, pp. 13–24), Ch. 7 (Carbohydrates, pp. 83–90), Ch. 8 (Introduction to Metabolism & Glycolysis, pp. 91–94), Ch. 15 (Lipid Structure, pp. 183–192).

Hames, D., & Hooper, N. (2011). BIOS instant notes in biochemistry (4th ed.). Taylor & Francis. — Section A1 (Prokaryotic Cells & Origin of Life), Sections B1–B2 (Amino Acids & Protein Structure), Section E1 (Membrane Lipids), Section F1 (DNA Structure), Sections J3 & J4 (Glycolysis & Gluconeogenesis), Section K1 (Fatty Acid Structure), Sections L1–L2 (TCA Cycle & Oxidative Phosphorylation).

The content on this page is intended for educational purposes only and is not a substitute for professional medical advice, clinical judgement, or the guidance of a qualified healthcare provider. Always refer to current clinical guidelines and consult appropriate sources before applying information in a patient care setting.