Movie Ticket Booking HLD: The Seat Hold Is the Whole System Design
A movie ticket booking system design (BookMyShow-style): the data model, a seat-hold with a TTL, stopping double-booking with an atomic update, and idempotent payment at scale.
"Design BookMyShow." Or Ticketmaster, or the IRCTC seat map — any system where strangers pick from the same grid of seats at the same instant. It's one of the most-asked system-design questions because it hides a perfect trap: it looks like CRUD (list shows, pick seats, pay) but the entire interview lives in one sentence — two people just tapped seat A12 for the same show, a millisecond apart. Who gets it, how do you know, and what happens to the loser's screen?
If you've read the Phase-3 builds, you already know the shape of the answer — you just saw it at class level. The hotel's overlapping reservations, the inventory's plan-then-commit, the car rental's "reserve against a count": this is all of that, now with ten thousand clerks sharing one chart. This article is the bridge from "I can code a reservation" to "I can design one that survives a blockbuster on-sale," and it doubles as the template for every HLD that follows.
Let's start nowhere near a computer
Walk into an old theatre box office. On the wall is one big seating chart, and the clerk has a box of pins. You point at two seats; the clerk pushes a pin into each and says "these are yours — for eight minutes." While the pins are in, nobody else can be sold those seats, not even at the next window. You pay, the pins become ink (sold, permanent). You wander off instead? At minute nine the clerk quietly pulls the pins, and the seats are free again for the next person.
That pin is the entire design. It's not FREE and it's not SOLD — it's a third state, HELD, that exists only briefly and falls out on its own if you don't pay. Miss that the hold is a first-class thing with its own timer, and you'll build a system that either double-sells seats or locks them forever when someone closes their browser.
Where the pin-on-the-chart trick runs
- BookMyShow, Ticketmaster, IRCTC, airline seat maps — every one places a temporary hold the instant you select, long before you pay.
- E-commerce checkout, flash sales, hotel rooms — "we're holding this for you for 10:00" is the same pin; the inventory reservation you built is its class-level twin.
- The concurrency you already know: the LLD reservation race you welded shut with synchronized becomes, at scale, an atomic database write. Same bug, same cure, bigger blast radius.
Step 1 — Functional requirements (sentences first)
Every HLD starts the same way: write what the system must do as plain sentences. These are your functional requirements — the features, scoped out loud (the recipe).
- A user can browse shows for a movie in their city.
- A user can see which seats are free for a show.
- A user can hold specific seats for a few minutes while they pay.
- A user can pay to turn a hold into a confirmed booking.
- An unpaid hold expires on its own and frees the seats.
No reviews, no recommendations, no dynamic pricing yet — and saying that boundary out loud is the first senior move.
Step 2 — Non-functional requirements
Features tell you what to build; the non-functional requirements tell you how well — and they, not the features, are what actually shape the architecture. Say them out loud too:
- Low latency. Browsing and the seat map must feel instant (tens of milliseconds). A hold or a booking can take a beat longer, but still bounded — nobody waits ten seconds to grab a seat.
- Consistency. The one that decides the design. The seat map a user browses can be slightly stale (eventual consistency is fine — the truth is re-checked at hold time), but the hold and the booking must be strongly consistent: a seat sold twice is a refund, an apology, and a churned customer.
- High availability. Booking is revenue, so it must survive the spike; browsing may degrade (a slightly stale map) rather than fail outright.
- Durability. A confirmed booking and its payment can never be lost; a hold is ephemeral by design and may be dropped freely.
- Scalability. Thousands of shows a day, millions of users — and brutal spikes, where a blockbuster goes on sale and 200,000 people hit Buy in the same minute.
Listing requirements is the easy half; the design is only good if it meets them. So here's the contract this design signs — each requirement, and the one mechanism that keeps it (every row is cashed in a step below):
| Requirement | How this design fulfills it |
|---|---|
| Low latency (browse) | seat map + listings served from a cache, then read replicas — Step 10 |
| Strong consistency (hold/book) | the atomic conditional UPDATE + a UNIQUE constraint, in the DB — Step 6 |
| Eventual consistency (browse) | a short cache TTL; the seat map may lag a second, the DB is re-checked at hold time |
| High availability | stateless services behind a load balancer; DB replica failover; browse degrades — Step 11 |
| Durability | bookings + payments in SQL; holds are disposable, expiring on a TTL |
| Scalability / spikes | the incremental ladder + shard by show, a waiting room for the blockbuster — Step 10 |
Every trade-off we make from here on is chosen to keep one of these promises — and we'll point back at this table when we do.
Step 3 — Nouns and the data model
Circle the nouns: Show, Seat, Booking. Now the two that rookies miss, because they're relationships and lifetimes, not things you'd name first:
- show_seat: a seat's availability for one show. Seat A12 is a fixed piece of furniture; "A12 for the 9pm show" is what people fight over. That per-show row is the hot row of the whole system.
- seat_hold: the pin. A temporary claim on a show_seat with an expires_at. It is the hidden noun the entire question is about.
shows         (id, movie_id, screen_id, starts_at)
seats         (id, screen_id, row, number)                  -- physical furniture
show_seats    (id, show_id → shows, seat_id → seats,
               status, price_paise)                         -- per-show availability: the hot row
seat_holds    (id, show_seat_id → show_seats, user_id,
               expires_at)                                  -- the pin (hidden noun)
bookings      (id, user_id, show_id, status,
               amount_paise, created_at)
booking_seats (booking_id → bookings, show_seat_id → show_seats)  -- hidden noun

Two details interviewers reward. First, price_paise is snapshotted onto show_seats (and the booking) so a price change tomorrow can't alter a sale today: the frozen-in-time rule. Second, money is paise as a long, never a float; the Splitwise lesson survives the zoom to HLD.
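To make the paise rule concrete, here is a minimal sketch of totalling a booking in whole paise. The names PricedSeat and BookingTotal are hypothetical, invented for this illustration, and the display helper assumes non-negative amounts.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical per-show seat row: the price is copied here and then frozen. */
record PricedSeat(String label, long pricePaise) {}

/** Hypothetical helper: integer arithmetic only, so no rounding can creep in. */
final class BookingTotal {
    private BookingTotal() {}

    /** Sums seat prices in paise; throws ArithmeticException on overflow instead of wrapping. */
    static long totalPaise(List<PricedSeat> seats) {
        long total = 0;
        for (PricedSeat seat : seats) {
            total = Math.addExact(total, seat.pricePaise());
        }
        return total;
    }

    /** Formats non-negative paise as rupees for display only; storage stays integral. */
    static String display(long paise) {
        return String.format(Locale.ROOT, "%d.%02d", paise / 100, paise % 100);
    }
}
```

Two seats at 25050 paise each total 50100 paise and display as 501.00, with no floating-point step anywhere in between.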
Which datastore — and why it isn't a default. Don't say "a database" and move on; the problem chooses the store, and saying which (and why) is the senior beat. Here the booking core is a textbook fit for a relational SQL database (Postgres / MySQL): the whole double-book fix depends on ACID transactions, the atomic conditional UPDATE, and a UNIQUE constraint — exactly the guarantees a relational engine hands you and a document or key-value store makes you rebuild by hand. So the strong-consistency requirement picks the database, not habit. The two things that don't fit SQL get their own home: the ephemeral holds and the seat-map cache want an in-memory store with native TTLs — Redis. Right tool per job: relational for the money and the seats, in-memory for the timers and the hot reads.
Step 4 — The seat is a three-state machine
FREE → HELD → BOOKED, with one arrow that makes this question hard: HELD → FREE, on a timer nobody clicks. A booked seat is permanent; a held seat is a promise with an expiry — which is why a hold can't just be a boolean on the row; it needs an expires_at.
But the machine above is per seat, and people book seats in groups — so the real subtlety is all-or-nothing across the group. When you hold [A12, A13], holding A12 and failing A13 is the worst outcome of all: you'd pay for one seat next to a stranger, or sit on a half-hold that blocks A12 from the family that wanted both. So the hold either flips every requested seat FREE → HELD or flips none — a tiny transaction. The implementation does this by checking all requested seats are free first and only then writing them, all inside one critical section (one DB transaction at scale); if any seat is taken, nothing moves and the caller re-picks the whole group. Partial success is not a weaker form of success here — it's the failure you most want to avoid.
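The state machine and the all-or-nothing rule can be sketched in a few lines. This is an illustration under assumptions: SeatState and SeatTransitions are hypothetical names, the chart is a plain map, and the sketch is single-threaded (the real version runs inside one critical section or one DB transaction, and also handles expiry).

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** The three seat states from the text. */
enum SeatState { FREE, HELD, BOOKED }

final class SeatTransitions {
    private SeatTransitions() {}

    // Allowed moves: FREE -> HELD, HELD -> BOOKED, HELD -> FREE (expiry). BOOKED is terminal.
    private static final Map<SeatState, Set<SeatState>> ALLOWED = Map.of(
            SeatState.FREE, EnumSet.of(SeatState.HELD),
            SeatState.HELD, EnumSet.of(SeatState.BOOKED, SeatState.FREE),
            SeatState.BOOKED, EnumSet.noneOf(SeatState.class));

    static boolean canMove(SeatState from, SeatState to) {
        return ALLOWED.get(from).contains(to);
    }

    /** All-or-nothing: flips every wanted seat FREE -> HELD, or changes nothing and returns false. */
    static boolean holdAll(Map<String, SeatState> chart, List<String> wanted) {
        for (String seat : wanted) {
            if (chart.get(seat) != SeatState.FREE) {
                return false; // one taken (or unknown) seat rejects the whole group
            }
        }
        for (String seat : wanted) {
            chart.put(seat, SeatState.HELD);
        }
        return true;
    }
}
```

The check loop finishes before the first write, which is what keeps a rejected group from leaving a half-hold behind.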
Step 5 — Verbs become APIs (the API design)
GET  /shows?city=..&movie=..&date=..   browse shows (paginated, cacheable)
GET  /shows/{id}/seats                 the seat map (cacheable, short TTL)
POST /shows/{id}/holds                 place a hold (auth: signed-in user)
     body: { seats: ["A12","A13"] }  →  { hold_id, expires_at }
POST /bookings                         confirm + pay (idempotent, see Step 7)
     body: { hold_id }   header: Idempotency-Key: <uuid>
GET  /bookings/{id}                    your booking (auth: owner)

The split that matters: holding and paying are two calls, not one. The hold reserves the seats before the slow, failure-prone payment step, so you're never charging for seats you might not get, and never holding furniture hostage during a 30-second card timeout.
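The two write calls can be pinned down as small typed payloads. The sketch below is only one way to model them: the record names, the cap of 10 seats per request, and the duplicate check are assumptions made for illustration, not rules stated by the API above.

```java
import java.time.Instant;
import java.util.HashSet;
import java.util.List;

/** Body of POST /shows/{id}/holds. The validation rules here are assumptions. */
record HoldRequest(List<String> seats) {
    static final int MAX_SEATS = 10; // hypothetical per-request cap

    HoldRequest {
        seats = List.copyOf(seats); // defensive copy; throws on a null list or null element
        if (seats.isEmpty() || seats.size() > MAX_SEATS) {
            throw new IllegalArgumentException("seats must contain 1.." + MAX_SEATS + " labels");
        }
        if (new HashSet<>(seats).size() != seats.size()) {
            throw new IllegalArgumentException("duplicate seat in request");
        }
    }
}

/** Response of the hold call: the client can show a countdown to expiresAt. */
record HoldResponse(String holdId, Instant expiresAt) {}

/** Body of POST /bookings; the Idempotency-Key header is modelled here as a field. */
record ConfirmRequest(String holdId, String idempotencyKey) {
    ConfirmRequest {
        if (holdId == null || holdId.isBlank()
                || idempotencyKey == null || idempotencyKey.isBlank()) {
            throw new IllegalArgumentException("holdId and idempotencyKey are required");
        }
    }
}
```

Rejecting a malformed request at the edge keeps bad input away from the contended seat rows entirely.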
Step 6 — The core algorithm: stopping the double-book
Here it is, the millisecond that decides the grade. Two users hold A12 at once.
The wrong answer is read-then-write: "check if A12 is free, then mark it held." That's the check-then-act race you've met in every concurrency article — between the check and the write, the other user slips in, and you've sold one seat twice. At class level you welded it shut with synchronized. At scale, across many servers, there's no shared synchronized block — so you push the check-and-set into the database, as one atomic statement:
UPDATE show_seats
SET status = 'HELD', held_by = :holdId, expires_at = :exp
WHERE id = :showSeatId
AND (status = 'FREE' OR (status = 'HELD' AND expires_at < :now));

The WHERE clause is the check; the SET is the act; the database guarantees they happen together. (The statement assumes the show_seats row also carries held_by and expires_at, copied from the hold, so the whole check fits in one row.) Run it from two servers and the DB serializes them: one returns "1 row updated" (you won the seat), the other "0 rows" (someone beat you, so show them the map again). No application lock, no race: the seat row's own atomicity is the referee. A UNIQUE constraint on the confirmed booking_seats(show_seat_id) is the final backstop: even if two holds somehow both reached payment, the database physically refuses to book the same show-seat twice.
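From application code, the statement above might be driven over JDBC roughly like this. Treat it as a sketch under assumptions: SeatClaimDao is a hypothetical name, ids are assumed to be numeric, held_by is assumed to store the hold id as text, and the caller supplies an open Connection. Sorting the ids before updating is an addition of this sketch, meant to keep two overlapping group holds from locking rows in opposite orders.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.List;

/** Sketch of the conditional UPDATE over JDBC; column names follow the SQL in the text. */
final class SeatClaimDao {
    static final String CLAIM_SQL =
            "UPDATE show_seats SET status = 'HELD', held_by = ?, expires_at = ? "
          + "WHERE id = ? AND (status = 'FREE' OR (status = 'HELD' AND expires_at < ?))";

    private SeatClaimDao() {}

    /** Claims every seat in one transaction, or rolls back and returns false. */
    static boolean holdAll(Connection db, List<Long> showSeatIds, String holdId,
                           Instant now, Instant expiresAt) throws SQLException {
        List<Long> ordered = showSeatIds.stream().sorted().toList(); // consistent lock order
        boolean previousAutoCommit = db.getAutoCommit();
        db.setAutoCommit(false);
        try (PreparedStatement ps = db.prepareStatement(CLAIM_SQL)) {
            for (long id : ordered) {
                ps.setString(1, holdId);
                ps.setTimestamp(2, Timestamp.from(expiresAt));
                ps.setLong(3, id);
                ps.setTimestamp(4, Timestamp.from(now));
                if (ps.executeUpdate() != 1) { // 0 rows: the seat is booked or validly held
                    db.rollback();
                    return false;
                }
            }
            db.commit();
            return true;
        } catch (SQLException e) {
            db.rollback();
            throw e;
        } finally {
            db.setAutoCommit(previousAutoCommit);
        }
    }
}
```

The row count returned by executeUpdate is the whole verdict: 1 means this caller won the seat, 0 means someone else got there first. Running it needs a real database and driver, which this sketch does not set up.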
This is exactly where the functional requirement ("hold specific seats") and the non-functional one (strong consistency — never sell a seat twice) are both cashed — and notice the cost we didn't pay: because the check-and-set is one indexed write, not a lock acquired across the network, we keep correctness without surrendering the low-latency requirement. The cheapest correct mechanism that honours both promises is the one to reach for; that's the whole game.
The seductive wrong turn is a distributed lock (“grab a Redis lock on A12, then check, then write”). It works, but it adds a moving part that can fail, expire mid-write, or get partitioned — and you still need the DB constraint as a backstop anyway. If the database can do the check-and-set atomically (and it can), the lock is a second mechanism earning its keep only at far higher scale. Name it as a deliberate upgrade, not a default.
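To watch the "one winner, everyone else gets zero rows" behaviour without a database, here is a small in-memory simulation. It is a stand-in, not the real mechanism: ConcurrentHashMap.replace(key, expected, newValue) plays the role of the conditional UPDATE, and SeatRace is a hypothetical name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for the seat row: an atomic compare-and-set instead of read-then-write. */
final class SeatRace {
    private SeatRace() {}

    /** Starts `contenders` threads that all try to claim one seat; returns how many won. */
    static int countWinners(int contenders) {
        ConcurrentHashMap<String, String> status = new ConcurrentHashMap<>();
        status.put("show-42|A12", "FREE");
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(contenders);
        try {
            for (int i = 0; i < contenders; i++) {
                pool.submit(() -> {
                    start.await(); // line everyone up so the attempts overlap
                    // Succeeds only if the value is still FREE at the moment of the swap.
                    if (status.replace("show-42|A12", "FREE", "HELD")) {
                        winners.incrementAndGet();
                    }
                    return null;
                });
            }
            start.countDown();
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("contenders did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return winners.get();
    }
}
```

However many contenders you start, countWinners returns 1, because the check and the write are a single atomic step. Replace the swap with a separate get followed by put and the count can exceed 1, which is the double-book.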
Step 7 — Two more places it goes wrong: expiry and double-charge
The hold must expire — lazily. You do not run a cron job deleting expired holds (that's a thundering scan, and there's always a gap). Instead the expires_at < :now clause above means an expired hold is simply treated as free the next time anyone wants the seat — lazy expiry, the same trick the rate limiter and alarm service use. The pin "falls out" not on a schedule, but the instant someone reaches for the seat.
Payment must be idempotent. Networks retry: the user double-taps Pay, the gateway's webhook fires twice, the mobile app resends on a flaky connection. If POST /bookings runs twice, you must charge once and book once. The cure is an idempotency key the client generates and sends; the server records it with the booking, and a repeat of the same key returns the same booking instead of charging again. This is the plan → debit → commit ordering with a receipt stapled on.
And the harder failure hiding behind that, the one the interviewer pushes toward: the charge succeeds but the booking write fails (the gateway returned OK, then your DB hiccupped before you marked the seats BOOKED). Now there's money taken with no ticket: an orphaned payment no retry of the same idempotency key can heal, because the booking row was never written. Two defenses, and you should name both. First, order the steps so the charge comes only after the seats are secured: hold the seats, then charge, then mark booked. A crash before charging just lets the hold expire harmlessly, which leaves only the gap between the charge and the final write. Second, run a reconciliation sweep that joins recent gateway charges against bookings and either completes the booking or refunds the dangling charge. The idempotency key stops double charges; reconciliation stops orphaned ones. Different problems, different fixes.
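The reconciliation sweep is, at heart, an anti-join. A minimal sketch follows, with in-memory collections standing in for the gateway report and the bookings table. GatewayCharge, Reconciler, and the rule for choosing between completing and refunding are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** A charge the gateway reports as captured; field names are illustrative. */
record GatewayCharge(String chargeId, String idempotencyKey, long amountPaise) {}

/** What the sweep does with an orphaned charge. */
enum Resolution { COMPLETE_BOOKING, REFUND }

final class Reconciler {
    private Reconciler() {}

    /** Returns captured charges whose idempotency key has no booking recorded against it. */
    static List<GatewayCharge> findOrphans(List<GatewayCharge> captured, Set<String> bookedKeys) {
        List<GatewayCharge> orphans = new ArrayList<>();
        for (GatewayCharge charge : captured) {
            if (!bookedKeys.contains(charge.idempotencyKey())) {
                orphans.add(charge);
            }
        }
        return orphans;
    }

    /** Assumed policy: finish the booking if the payer's hold still owns the seats, else refund. */
    static Resolution resolve(boolean seatsStillHeldByPayer) {
        return seatsStillHeldByPayer ? Resolution.COMPLETE_BOOKING : Resolution.REFUND;
    }
}
```

In production the same join would run as a query over a recent time window, on a schedule, so that an orphan never stays unresolved for long.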
Step 8 — Trade-offs (each one keeping an NFR)
Notice the last column: every decision is made to keep one of the promises from Step 2. That's what "design with the non-functional requirements in mind" actually looks like — not a list you wrote and forgot, but the thing each choice is accountable to.
| Decision | The tempting alternative | Why ours wins | Keeps |
|---|---|---|---|
| atomic conditional UPDATE | read seat, then write if free | closes the check-then-act race with zero extra moving parts | consistency |
| hold as a row with expires_at | a boolean is_held flag | a boolean can't expire; closed browsers would lock seats forever | availability |
| lazy expiry (expires_at < now) | a cron job sweeping old holds | no scan, no gap; the seat frees exactly when it's next wanted | latency |
| idempotency key on POST | trust the client to call once | retries and double-webhooks stop double-charging and double-booking | consistency |
| hold then pay (two calls) | one call that holds + charges | never charge for seats you might lose, nor hold seats during a charge | availability |
| cache the seat map, short TTL | always read the live DB row | reads scale and stay fast; the brief staleness is re-checked at hold | latency, scale |
The complete implementation
The seat-hold core, in the small: an injected clock (a Supplier<Instant>) makes the TTL testable, and the synchronized methods, which all share one lock, stand in for the database's atomic conditional update (one box now; one UPDATE … WHERE across many boxes later):
// SeatStatus.java
package dev.fiveyear.booking;

import java.time.Instant;
import java.util.List;

public enum SeatStatus {
    FREE,
    HELD,
    BOOKED
}

/** The pin: a temporary claim on a set of show-seats, with an expiry. */
record Hold(String holdId, String showId, List<String> seats, String userId, Instant expiresAt) {
    Hold {
        seats = List.copyOf(seats);
    }
}

/** Ink: the seats a user paid for, the amount frozen in paise. */
record Booking(String bookingId, String userId, String showId, List<String> seats, long amountPaise) {
    Booking {
        seats = List.copyOf(seats);
    }
}

// SeatHoldService.java
package dev.fiveyear.booking;

import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** The box office: holds seats atomically, lets pins expire, confirms idempotently. */
public final class SeatHoldService {

    public static final class HoldRejectedException extends RuntimeException {
        public HoldRejectedException(String message) {
            super(message);
        }
    }

    public static final class ConfirmRejectedException extends RuntimeException {
        public ConfirmRejectedException(String message) {
            super(message);
        }
    }

    /** One row per (show, seat): the contended row the DB would store. */
    private static final class SeatRecord {
        SeatStatus status = SeatStatus.FREE;
        String holdId;
        Instant holdExpiresAt;
    }

    private final Map<String, SeatRecord> seats = new HashMap<>();
    private final Map<String, Hold> holds = new HashMap<>();
    private final Map<String, Booking> byIdempotencyKey = new HashMap<>();
    private final AtomicInteger charges = new AtomicInteger();
    private final Duration ttl;
    private final Supplier<Instant> clock;

    public SeatHoldService(Duration ttl, Supplier<Instant> clock) {
        this.ttl = ttl;
        this.clock = clock;
    }

    private static String key(String show, String seat) {
        return show + "|" + seat;
    }

    private SeatRecord record(String show, String seat) {
        return seats.computeIfAbsent(key(show, seat), k -> new SeatRecord());
    }

    /** Lazy expiry: a lapsed HELD is, in effect, FREE. */
    private boolean effectivelyFree(SeatRecord r, Instant now) {
        if (r.status == SeatStatus.BOOKED) {
            return false;
        }
        return r.status != SeatStatus.HELD || !r.holdExpiresAt.isAfter(now);
    }

    public synchronized SeatStatus statusOf(String show, String seat) {
        SeatRecord r = record(show, seat);
        if (r.status == SeatStatus.HELD && !r.holdExpiresAt.isAfter(clock.get())) {
            return SeatStatus.FREE; // the pin has fallen out
        }
        return r.status;
    }

    /** All-or-nothing hold: the atomic conditional UPDATE, here as one critical section. */
    public synchronized Hold hold(String show, List<String> wanted, String userId) {
        Instant now = clock.get();
        // Check every seat first; nothing is written unless the whole group is free.
        for (String seat : wanted) {
            if (!effectivelyFree(record(show, seat), now)) {
                throw new HoldRejectedException("seat taken: " + seat);
            }
        }
        String holdId = UUID.randomUUID().toString();
        Instant expiresAt = now.plus(ttl);
        for (String seat : wanted) {
            SeatRecord r = record(show, seat);
            r.status = SeatStatus.HELD;
            r.holdId = holdId;
            r.holdExpiresAt = expiresAt;
        }
        Hold hold = new Hold(holdId, show, wanted, userId, expiresAt);
        holds.put(holdId, hold);
        return hold;
    }

    /** Idempotent confirm: a repeated key returns the same booking and never charges twice. */
    public synchronized Booking confirm(String holdId, String idempotencyKey, long amountPaise) {
        Booking already = byIdempotencyKey.get(idempotencyKey);
        if (already != null) {
            return already; // a retry, not a second charge
        }
        Hold hold = holds.get(holdId);
        if (hold == null) {
            throw new ConfirmRejectedException("unknown hold");
        }
        Instant now = clock.get();
        if (!hold.expiresAt().isAfter(now)) {
            throw new ConfirmRejectedException("hold expired");
        }
        for (String seat : hold.seats()) {
            SeatRecord r = record(hold.showId(), seat);
            if (r.status != SeatStatus.HELD || !holdId.equals(r.holdId)) {
                throw new ConfirmRejectedException("hold no longer valid");
            }
        }
        // Seats are verified: charge, then turn the pins to ink (hold, charge, mark booked, as in Step 7).
        charges.incrementAndGet();
        for (String seat : hold.seats()) {
            SeatRecord r = record(hold.showId(), seat);
            r.status = SeatStatus.BOOKED;
            r.holdId = null;
            r.holdExpiresAt = null;
        }
        Booking booking =
                new Booking(UUID.randomUUID().toString(), hold.userId(), hold.showId(), hold.seats(), amountPaise);
        byIdempotencyKey.put(idempotencyKey, booking);
        holds.remove(holdId);
        return booking;
    }

    public int charges() {
        return charges.get();
    }
}

At the box office (the clock is injected, so eight minutes pass on command):
Instant t0 = Instant.parse("2026-06-13T18:00:00Z");
Instant[] clock = {t0};
SeatHoldService box = new SeatHoldService(Duration.ofMinutes(8), () -> clock[0]);

Hold h = box.hold("show-42", List.of("A12", "A13"), "asha");   // both HELD, expire 18:08
box.statusOf("show-42", "A12");                                // HELD
box.hold("show-42", List.of("A12"), "dev");                    // throws HoldRejectedException: A12 is pinned

Booking b = box.confirm(h.holdId(), "pay-asha-1", 50000);      // both seats BOOKED, ₹500 charged once
box.confirm(h.holdId(), "pay-asha-1", 50000);                  // replay of the SAME key → same booking, no 2nd charge
box.charges();                                                 // 1

Hold m = box.hold("show-42", List.of("B5"), "mira");           // B5 HELD, expires 18:08
clock[0] = t0.plusSeconds(9 * 60);                             // nine minutes drift by, unpaid
box.statusOf("show-42", "B5");                                 // FREE: the pin fell out on its own
box.hold("show-42", List.of("B5"), "ravi");                    // succeeds: B5 is free again

Step 9 — Only now, the boxes
The architecture draws itself once you ask each endpoint "read-heavy or write-heavy?" — the boxes-last rule.
- Browsing shows and seat maps is overwhelmingly read-heavy → a cache in front, with a short TTL on the seat map so it's never wildly stale (the truth is always the DB row).
- Holding and booking are writes that must be exactly-once → the SQL database, where the atomic UPDATE and the UNIQUE constraint live. This is not the place to be clever; it's the place to be correct.
- Holds can live in the same DB, or in a TTL store like Redis for speed. Either way the rule is the same: the hold owns an expiry.
- Payment is a slow external call, so the flow is hold → call gateway → confirm on success; a queue fans out the "your tickets are booked" SMS and email off the request path.
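The "queue fans out the SMS and email off the request path" idea can be sketched with an in-process queue. This is only a stand-in for a real message broker, and BookingConfirmed and NotificationQueue are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** The event the booking path emits once the seats are BOOKED. */
record BookingConfirmed(String bookingId, String userId) {}

/** In-process stand-in for a message queue between booking and notifications. */
final class NotificationQueue {
    private final BlockingQueue<BookingConfirmed> queue = new LinkedBlockingQueue<>();

    /** Called on the request path: enqueue and return, never wait for SMS or email. */
    void publish(BookingConfirmed event) {
        queue.add(event);
    }

    /** Called by a background worker: hands every pending event to each channel. */
    int drain(List<Consumer<BookingConfirmed>> channels) {
        List<BookingConfirmed> batch = new ArrayList<>();
        queue.drainTo(batch);
        for (BookingConfirmed event : batch) {
            for (Consumer<BookingConfirmed> channel : channels) {
                channel.accept(event);
            }
        }
        return batch.size();
    }
}
```

The booking request returns as soon as publish does, so a slow SMS provider delays the message, not the ticket.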
Step 10 — Scaling the design, one bottleneck at a time
A junior over-builds on day one — sharding and replicas for traffic that doesn't exist yet. The senior move is the opposite: start with the simplest thing that's correct, and add a piece only when a measured bottleneck forces you. Climb the ladder, driven by the read-heavy browse path your non-functional requirements flagged:
- One database. Correct, simple, and plenty for a single cinema. Reads start to hurt long before writes do — browsing dwarfs booking — so when the seat-map reads bite…
- Add a cache. Put the seat map and show listings behind a cache with a short TTL. Most reads never touch the DB; the database is left free for the writes that must be correct. When reads outgrow even the cache…
- Add read replicas. Fan browse traffic across replicas while the primary keeps the holds and bookings. When data and write volume finally outgrow one box…
- Shard by show_id. Seats and bookings partition cleanly by show, and most shows are quiet, so load spreads across shards. And only as a genuine last resort do you migrate to a different datastore; you'll be surprised how far the four rungs above carry you first.
Each rung is more to operate, so you earn it with a number, not a hunch — that restraint is itself the senior signal.
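The second rung, a cache with a short TTL in front of the seat-map read, can be sketched as cache-aside with an injected clock. TtlCache is a hypothetical name, the loader function stands in for the database query, and a real deployment would more likely use Redis or a caching library than a hand-rolled map.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

/** Cache-aside with a short TTL: serve a recent copy, reload from the source of truth when stale. */
final class TtlCache<K, V> {
    private record Entry<T>(T value, Instant loadedAt) {}

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final Duration ttl;
    private final Supplier<Instant> clock;
    private final Function<K, V> loader; // stands in for the database read

    TtlCache(Duration ttl, Supplier<Instant> clock, Function<K, V> loader) {
        this.ttl = ttl;
        this.clock = clock;
        this.loader = loader;
    }

    synchronized V get(K key) {
        Instant now = clock.get();
        Entry<V> cached = entries.get(key);
        if (cached != null && now.isBefore(cached.loadedAt().plus(ttl))) {
            return cached.value(); // fresh enough: the database is not touched
        }
        V loaded = loader.apply(key); // missing or stale: go back to the truth
        entries.put(key, new Entry<>(loaded, now));
        return loaded;
    }
}
```

With a TTL of a second or two, a thousand browsers refreshing the same seat map cost the database roughly one read per TTL window instead of a thousand. The staleness is safe only because the hold path re-checks the real row.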
The write hot-key is a different axis. Sharding spreads quiet shows beautifully, but one blockbuster is a single hot shard no partitioning can split — 200,000 people fighting over one show's seats.
That's where a virtual waiting room comes in: admit users in metered batches through a queue ("you're number 12,000 in line") so the stampede never hits the seat-map and hold path at full force — your rate limiter grown up into a fairness gate. And the truth that makes the hot show survivable at all: contention is self-limiting. A 300-seat hall can only produce 300 winning holds no matter how many push Buy — every other request gets a fast "0 rows, pick again." Make the losing path cheap and the seat map readable; you were never going to let everyone win.
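A waiting room is mostly a FIFO queue with a metered exit. The sketch below assumes a fixed batch size released on each timer tick. WaitingRoom is a hypothetical name, and a production gate would also need tokens, timeouts, and storage shared across servers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Virtual waiting room sketch: arrivals queue up, and each tick admits a fixed-size batch in order. */
final class WaitingRoom {
    private final Deque<String> line = new ArrayDeque<>();
    private final int batchSize;

    WaitingRoom(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /** Joins the line and returns the 1-based position shown to the user. */
    synchronized int join(String userId) {
        line.addLast(userId);
        return line.size();
    }

    /** Called on a timer: releases up to batchSize users to the seat map and hold path. */
    synchronized List<String> admitNextBatch() {
        List<String> admitted = new ArrayList<>();
        while (admitted.size() < batchSize && !line.isEmpty()) {
            admitted.add(line.pollFirst());
        }
        return admitted;
    }
}
```

The batch size and tick interval together cap the request rate the hold path ever sees, whatever the size of the crowd outside.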
Step 11 — When a piece fails: designing for failure
A design isn't finished when it works — it's finished when you can say what happens as each box dies. Go component by component, and notice how much of the answer the design already handed us for free:
- The cache dies. Reads fall back to the database — slower, but still correct, because the cache was only ever an optimization over the authoritative row. Browsing degrades; booking is untouched. (This is the payoff of "the truth is always the DB row.")
- The payment gateway dies. You can't confirm, so you don't — and the hold simply expires on its TTL, the seats free themselves, and nobody is charged. Two-phase hold-then-pay turns a payment outage into a non-event instead of a pile of orphaned charges.
- A booking-service instance dies mid-request. The idempotency key and the reconciliation sweep from Step 7 heal it: the client retries, the key returns the real outcome, and the sweep completes or refunds anything left dangling.
- The primary database dies. The one failure you can't degrade away, because holds and bookings need it. So you keep a replica and fail over to it; and until failover completes you fail fast — reject new bookings cleanly rather than take payments you can't record.
The pattern across all four is the lesson: a dependency that's only an optimization (the cache) degrades; a dependency that holds the truth (the DB) gets a replica and a fail-fast guard; anything slow and external (payments) is arranged so its outage simply expires. Designing for failure isn't preventing every outage — it's deciding, in advance, how the system bends instead of breaks.
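Two of those behaviours, "the cache degrades" and "the primary fails fast", are easy to show in miniature. The class names and the health flag below are assumptions made for illustration; a real system would get health from its connection pool or a failover controller.

```java
import java.util.function.Function;

/** Reads the seat map from the cache, falling back to the database if the cache is down or empty. */
final class DegradingSeatMapReader {
    private final Function<String, String> cache;    // may throw when the cache is unreachable
    private final Function<String, String> database; // authoritative, slower

    DegradingSeatMapReader(Function<String, String> cache, Function<String, String> database) {
        this.cache = cache;
        this.database = database;
    }

    String read(String showId) {
        try {
            String cached = cache.apply(showId);
            if (cached != null) {
                return cached;
            }
        } catch (RuntimeException cacheDown) {
            // The cache is only an optimization: fall through to the source of truth.
        }
        return database.apply(showId);
    }
}

/** Guards the write path: refuse new bookings while the primary database is unavailable. */
final class BookingGuard {
    private BookingGuard() {}

    static void requirePrimary(boolean primaryHealthy) {
        if (!primaryHealthy) {
            throw new IllegalStateException("booking unavailable: primary database is down");
        }
    }
}
```

The asymmetry is deliberate: a read can be served from a slower source, but a booking that cannot be recorded durably is better refused than half-done.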
The interview corner
Clarify before you draw: How long is a hold (5–10 min is typical)? Can a user pick exact seats, or just "3 seats together" (auto-assignment changes the hold logic)? Is payment in scope (it brings idempotency) or assumed? What are the consistency and availability targets — what may be stale and cached vs. what must hit the DB (this decides half the architecture)? One city or global (it decides whether you shard by show or also by region)?
The follow-up ladder:
- "Two users tap A12 in the same millisecond. Walk me through it." The atomic UPDATE … WHERE status='FREE': the DB serializes, one row changes, the other gets zero and is told to re-pick. This is the answer; lead with it.
- "A user holds seats and closes the tab." Lazy expiry: expires_at < now means the next person to want the seat treats it as free. No cron, no leak. Contrast with a sweeper job and say why lazy wins.
- "The payment webhook fires twice." Idempotency key recorded with the booking; the second call returns the first booking and never re-charges. Mention that the gateway itself should be called idempotently too.
- "Blockbuster, 200k concurrent." Shard by show, a virtual waiting room to meter the stampede, cache the seat map; note that winning holds are capped by seat count, so the system is self-limiting.
- "Hold the seat AND charge in one call — simpler, no?" No: you'd charge for seats you might lose to the race, or hold furniture hostage during a 30-second card timeout. The two-phase hold-then-pay is the whole reason the design is humane. Naming why the obvious shortcut is wrong is the senior signal.
- "Reads are spiking but writes are fine — what do you add first?" Cache the seat map, then read replicas, then shard — in that order, each step earned by a real bottleneck. The trap is reaching for sharding on day one; name the ladder and explain why you climb it slowly.
- "The payment gateway is down for ten minutes." Bookings pause, but nobody is harmed: in-flight holds expire on their TTL and release their seats, no card is charged, and browsing is untouched. When the gateway returns, business resumes — and the reconciliation sweep settles anything that was mid-charge. Degrade, don't collapse.
Mistakes that fail the round: read-then-write seat checks (the classic double-book); a boolean is_held with no expiry (seats locked forever by closed tabs); a cron sweeper instead of lazy expiry; charging before the seat is secured; storing money as a float; one giant table with no sharding plan when asked about the blockbuster.
Where to go from here
Pocket version: the seat is FREE/HELD/BOOKED; the hold is a pin with an expiry; an atomic conditional UPDATE makes the double-book impossible; lazy expiry frees abandoned holds; an idempotency key makes payment exactly-once; shard by show and meter the stampede with a waiting room.
- Build the auto-assignment variant — "give me 3 seats together" — and watch the hold become a small search over the seat map before the atomic claim.
- Add dynamic pricing and notice it touches only price_paise on show_seats, snapshotted at booking; the model already had the seam.
- Zoom back down: the class-level twin of this whole question is the inventory management LLD and the hotel reservation LLD. Same race, same cure, one altitude lower.
- Next in the queue: ride-sharing and Twitter — where the hot row becomes a hot region and a hot timeline, and the same "make the losing path cheap" instinct scales to the whole feed.
One seating chart, a box of pins that fall out on their own, and a database that refuses to sell the same seat twice — that's a ticket-booking system, and the millisecond you just learned to survive is the one the interviewer was always asking about.