Call Center LLD: Dispatch, Escalation, and the Freed Agent
A low-level design walkthrough of a call center: rank-based dispatch across respondents, managers and a director, escalation that re-dispatches, queues per rank, and the freed agent who drains them.
"Design a call center." It's the CTCI classic: respondents take calls, managers take what respondents can't, a lone director takes what managers can't. Sounds like an inheritance exercise — Manager extends Employee — and that's the first trap. The second is sneakier: everyone designs the assignment path and forgets the question that actually runs a call center: when an agent hangs up, who decides what they do next?
Dispatch, escalation, and the freed agent: three small mechanisms, one design. And after the elevator, you'll recognize the genre — this is scheduling again, except the resources have job titles.
Let's start nowhere near a computer
Watch an airport help desk on a stormy evening. Rebookings go to any free desk agent. A passenger demanding a refund waiver gets walked to the supervisor's counter. The supervisor's unsolvable case goes to the duty manager's office. Three tiers, smaller as you go up — and notice two things about how it actually flows.
First: when the supervisor is busy, the waiver passenger waits at the supervisor's queue — they don't go back to the desk agents, who can't help them. Second, the load-bearing one: the moment any agent finishes, they don't stand idle — they look at the waiting line and call "next!" The system's pulse isn't arrivals; it's completions.
Where this dispatch shape runs
- Support ticket systems — L1/L2/L3 engineering support is this design with email instead of phones.
- Hospital triage — nurse, resident, attending: capability tiers with escalation.
- The thread pool, next in the queue — identical skeleton, but every worker has the same rank; the call center is a thread pool where workers differ.
Step 1 — Functional requirements (sentences first)
What the center must do, as plain sentences — the functional requirements.
- An incoming call needs an agent of at least some rank (most need any respondent).
- A call is assigned to a free agent of the lowest sufficient rank — directors don't take password resets.
- If nobody sufficient is free, the call waits in a queue for its required rank — it is never dropped.
- An agent who can't resolve a call escalates it: the requirement rises, the call re-dispatches, the agent frees up.
- When an agent finishes, they immediately take the most demanding waiting call they're allowed to.
That third sentence is a deliberate stance, and it's the same one the thread pool takes about its rack: when every agent is busy, you queue, you don't drop. A dropped call is a lost customer; backpressure here means making the caller wait, not hanging up on them. The only honest failure is the one nobody can serve at any rank — and that's a named edge, not a silent loss.
Step 2 — Non-functional requirements
At class level the non-functional requirements are different words for the same idea — how well, not just what — and for a call center they are the design:
- Correctness. A call reaches a qualified, free agent; one agent serves one call at a time; no accepted call is ever dropped. This is the whole reason the thing is hard.
- Fairness & throughput. The longest-waiting call at a rank is served next (FIFO per rank), and every free agent is put to work — the producer-consumer shape with humans as the workers, so no agent sits idle while a call waits.
- Thread-safety. Concurrent incoming calls race for the same free agent; two dispatches must never claim one agent, and a completion must never lose a queued call.
- Extensibility. New escalation tiers and routing rules plug in without rewriting dispatch — the requirement-bump-and-redispatch seam, not a web of special handoff paths.
Listing them is the easy half; the design only earns them if it fulfills them. Here's the contract — each requirement and the mechanism that keeps it:
| Requirement | How this design fulfills it |
|---|---|
| Correctness | dispatch claims a free agent of the lowest sufficient rank under the lock; unservable calls queue — Steps 4, 6 |
| Fairness | per-rank FIFO queues; the freed agent pulls most-demanding-first so escalations aren't starved — Steps 3, 6 |
| Throughput | every completion immediately drains a queue — the heartbeat keeps all agents utilized — Step 6 |
| Thread-safety | one lock guards the claim, so concurrent calls never grab the same agent — Step 4 |
| Extensibility | escalation is a requirement bump that re-runs the same dispatch — new tiers are enum entries — Step 5 |
Every trade-off below is chosen to keep one of these.
Step 3 — Nouns: the inheritance trap, and the hidden noun
The textbook solution subclasses: Respondent, Manager, Director. Run the series' test: do they behave differently, or just rank differently? They answer phones identically — only their level differs. Data, not behavior:
public enum Rank { RESPONDENT, MANAGER, DIRECTOR } // ordinal = authority
public static final class Employee {
final String name;
final Rank rank;
Call current; // null = free
}And the hidden noun, found the usual way: what has the lifecycle here? Not the employee — the Call: it's created, waits, gets assigned, escalates, ends. Caller, required rank, assignee, status — a record with a life, like every ticket and PNR before it.
Which structure — and why. Agents are grouped by rank so dispatch can scan tiers in authority order — the enum's ordinal is the tier, which keeps the lowest-sufficient scan a one-liner and makes "add a tier" an extensibility win. The wait area is one bounded-in-spirit Deque per rank (an EnumMap<Rank, Deque<Call>>), not a single shared line: a Deque gives FIFO from the front for fairness while letting the freed agent pull from the rank it's allowed to — the thread pool's bounded queue again, except patience is measured in held-call minutes, not milliseconds, and there's a queue per tier instead of one. And the free-agent claim happens under a single lock so two concurrent calls can't both grab the same agent — the thread-safety NFR made physical. Each choice is tied to a promise above: the enum to extensibility, the per-rank deques to fairness, the lock to thread-safety.
Step 4 — Dispatch: lowest sufficient rank
public synchronized void dispatch(Call call) {
for (Rank rank : Rank.values()) { // RESPONDENT → MANAGER → DIRECTOR
if (rank.ordinal() < call.requiredRank().ordinal()) continue;
Optional<Employee> free = firstFree(rank); // lowest sufficient rank wins
if (free.isPresent()) {
assign(call, free.get());
return;
}
}
waiting.get(call.requiredRank()).addLast(call); // queue at the REQUIRED rank
call.markWaiting();
}Two graded decisions in eight lines. Lowest sufficient rank keeps the director free for director problems — assign upward only when the tier below is saturated. And a waiting call queues at its required rank, not wherever there's room: the waiver passenger stands at the supervisor's counter.
Step 5 — Escalation is a requirement bump, not a transfer
The rookie models escalation as "hand this call to that manager" — which silently fails when no manager is free. The clean model: escalation raises the call's requirement and re-runs dispatch.
public synchronized void escalate(Call call) {
call.raiseRequirement(); // RESPONDENT → MANAGER → DIRECTOR (caps at the top)
freeUp(call.assignee()); // the respondent is released — and drains a queue!
dispatch(call); // same fairness, same queues, no special path
}One mechanism, reused. If a manager is free, the call lands there; if not, it waits in the manager queue like anything else — no VIP back-channel to invent and no edge cases to forget.
Step 6 — The freed agent: the system's heartbeat
And the question from the airport: agent hangs up — now what? They pull from the queues, most demanding first among those they're allowed to serve, so escalated calls aren't starved by a stream of fresh easy ones:
public synchronized void endCall(Call call) {
call.markEnded();
Employee agent = call.assignee();
agent.current = null;
for (int r = agent.rank.ordinal(); r >= 0; r--) { // their level first, then below
Call next = waiting.get(Rank.values()[r]).pollFirst();
if (next != null) {
assign(next, agent);
return;
}
}
}This is the restaurant's table-flip again — the completion triggers the next assignment. Build dispatch without this and your call center fills up exactly once and dies quietly.
The downward scan order is a policy choice worth narrating: a freed manager serves the manager queue first (escalations are urgent), then helps with respondent overflow. The opposite order — help the masses first — is also defensible. Naming the knob beats defending a hardcoding.
Step 7 — Trade-offs (each one keeping an NFR)
The last column is the discipline: every choice keeps one of the promises from Step 2 — that's what designing to the non-functional requirements looks like.
| Decision | The tempting alternative | Why ours wins | Keeps |
|---|---|---|---|
enum rank + one Employee class | Manager extends Respondent | behavior doesn't vary, only level; new tiers are one enum line | extensibility |
| escalation = requirement bump, redispatch | hand the call to a named manager | no special path that silently fails when no manager is free | extensibility |
| queue at the required rank, never drop | drop the call when all are busy | an accepted call is never lost; the waiver passenger waits, not hangs up | correctness |
| one lock around the free-agent claim | lock-free dispatch | two racing calls can't both grab one agent — honest at this scale | thread-safety |
| completion pulls the queue, most-demanding-first | assign only on arrival | every freed agent is reused and escalations don't starve | fairness / throughput |
Growth — when one center isn't enough. The first knobs are the tier count and the pull order (manager-queue-first vs. overflow-first). When concurrency outgrows the single lock, you don't make dispatch lock-free — you shard by department, each with its own agents and queues, the same instinct as splitting a contended lock into smaller ones. The edges to name before any of that: all agents busy (the call queues, it does not drop); no agent at any tier for a requirement (overflow to voicemail or a callback task, an honest "we can't serve this now" rather than a silent loss); and a call abandoned while waiting (the caller hangs up — evict it from its rank queue so a freed agent never pulls a dead line).
The complete implementation
package dev.fiveyear.callcenter;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
public final class CallCenter {
public enum Rank { RESPONDENT, MANAGER, DIRECTOR }
public static final class Employee {
final String name;
final Rank rank;
Call current;
Employee(String name, Rank rank) {
this.name = name;
this.rank = rank;
}
public boolean isFree() {
return current == null;
}
public String name() {
return name;
}
}
public static final class Call {
public enum Status { NEW, WAITING, ACTIVE, ENDED }
private final String caller;
private Rank requiredRank = Rank.RESPONDENT;
private Employee assignee;
private Status status = Status.NEW;
public Call(String caller) {
this.caller = caller;
}
void raiseRequirement() {
int next = Math.min(requiredRank.ordinal() + 1, Rank.DIRECTOR.ordinal());
requiredRank = Rank.values()[next];
}
void markWaiting() { status = Status.WAITING; }
void markEnded() { status = Status.ENDED; }
public Rank requiredRank() { return requiredRank; }
public Employee assignee() { return assignee; }
public Status status() { return status; }
}
private final List<Employee> employees = new ArrayList<>();
private final Map<Rank, Deque<Call>> waiting = new EnumMap<>(Rank.class);
public CallCenter() {
for (Rank r : Rank.values()) {
waiting.put(r, new ArrayDeque<>());
}
}
public void hire(String name, Rank rank) {
employees.add(new Employee(name, rank));
}
public synchronized void dispatch(Call call) {
for (Rank rank : Rank.values()) {
if (rank.ordinal() < call.requiredRank().ordinal()) continue;
Optional<Employee> free = firstFree(rank);
if (free.isPresent()) {
assign(call, free.get());
return;
}
}
waiting.get(call.requiredRank()).addLast(call);
call.markWaiting();
}
public synchronized void escalate(Call call) {
call.raiseRequirement();
Employee agent = call.assignee;
call.assignee = null;
dispatch(call);
if (agent != null) {
agent.current = null;
pullFromQueues(agent);
}
}
public synchronized void endCall(Call call) {
call.markEnded();
Employee agent = call.assignee;
call.assignee = null;
if (agent != null) {
agent.current = null;
pullFromQueues(agent);
}
}
private void pullFromQueues(Employee agent) {
for (int r = agent.rank.ordinal(); r >= 0; r--) {
Call next = waiting.get(Rank.values()[r]).pollFirst();
if (next != null) {
assign(next, agent);
return;
}
}
}
private void assign(Call call, Employee agent) {
call.assignee = agent;
call.status = Call.Status.ACTIVE;
agent.current = call;
}
private Optional<Employee> firstFree(Rank rank) {
return employees.stream()
.filter(e -> e.rank == rank && e.isFree())
.findFirst();
}
}The stormy evening, replayed:
CallCenter center = new CallCenter();
center.hire("riya", Rank.RESPONDENT);
center.hire("arun", Rank.RESPONDENT);
center.hire("meera", Rank.MANAGER);
Call c1 = new Call("rebooking"); center.dispatch(c1); // → riya
Call c2 = new Call("baggage"); center.dispatch(c2); // → arun
Call c3 = new Call("seat change"); center.dispatch(c3); // respondents busy → meera (sufficient)
Call c4 = new Call("upgrade"); center.dispatch(c4); // everyone busy → WAITING
center.escalate(c1); // requirement → MANAGER; meera busy → manager queue.
// riya frees — and immediately picks up c4!
center.endCall(c3); // meera frees → pulls c1 from HER queue first
// c1.assignee() == meera, c4.assignee() == riya — completions drove everythingThe interview corner
Clarify before you code: Is escalation caller-driven or agent-driven? Can a director take a fresh easy call when everyone else is busy (we said yes — lowest sufficient)? Do queued callers time out?
The follow-up ladder:
- "Why not
Manager extends Respondent?" Behavior doesn't vary — rank does. An enum plus one Employee class survives "add a TEAM_LEAD tier" with one line; the hierarchy version triggers a refactor. - "Fairness for queued calls." FIFO within a rank (built), most-demanding-first across ranks on free-up — then name the missing axis: wait-time aging, so an old respondent call eventually beats a fresh manager call.
- "Skills, not just ranks." Language, product area: the rank ladder becomes a skill set match — dispatch turns into bipartite matching, greedy first. The freed-agent heartbeat is unchanged.
- "Thousands of concurrent calls." One lock per center is honest here (the vending machine's answer); shard by department before inventing lock-free dispatch.
- "Callbacks instead of holding." The queued call becomes a task with a phone number — and you've discovered that hold queues and thread-pool work queues are the same structure with different patience.
Mistakes that fail the round: a class hierarchy that exists only to express rank; assigning calls to agents with nothing pulling queues on completion (the call center that dies after lunch); escalation as a special handoff path instead of requirement-bump-plus-redispatch.
Where to go from here
Pocket version: rank is data, the call is the hidden noun, dispatch takes the lowest sufficient rank, escalation re-dispatches, and completions — not arrivals — are the heartbeat.
- Add wait-time aging to the queue pull and watch starvation vanish for the price of a timestamp.
- Add skills and feel dispatch turn into matching — the moment this stops being a toy.
- Next in the queue: the thread pool — this exact skeleton with identical workers, where the queue's patience is measured in milliseconds.
The desk agent who calls "next!" the instant a customer walks away — that reflex, encoded in pullFromQueues, is the entire difference between a call center and a phone graveyard.