Google Calendar DB Design: Store the Recurrence Rule, Not a Million Rows
A database deep dive on modelling Google Calendar: storing a recurrence rule instead of infinite instances, expanding on read, cancellations and overrides, timezones/DST, and time-range queries.
"Design the database for Google Calendar." A calendar is just events with a start and end — until someone makes one repeat. "Stand-up every weekday, 9am, forever" is a single thing a user created once, but it has infinite occurrences. Store them all and your table is unbounded before lunch. Store none and "what's on my calendar Tuesday?" has nothing to read. The entire design lives in that tension, plus its two cruel follow-ups: someone cancels one occurrence, someone moves another, and daylight saving time quietly shifts everything by an hour.
So the one decision that makes or breaks this schema: store the recurrence rule, not the instances, and compute the instances on read. Get that right and a calendar holding "forever, weekly" is a single row.
Let's start nowhere near a computer
A gym pins one sheet to the wall: "Spin class — every Monday & Wednesday, 7am." It does not keep a ledger with a line for every Monday from now until the heat death of the universe. The sheet is the rule; to answer "what's on this week?" you read the rule and work out this week's dates in your head.
When a class is cancelled for a public holiday, the gym doesn't rewrite the sheet — it sticks a small note: "no class Dec 25." When one week's class moves to Tuesday, another small note: "this week, Tue not Mon." The sheet stays one line; the exceptions are tiny addenda.
That's the whole model: the wall sheet is a recurrence rule stored in one row, "this week's dates" is expansion on read, and the sticky notes are exception rows — one to cancel an occurrence, one to move it. Everything below is making those three things precise.
Where this exact shape shows up
- Google / Outlook / Apple Calendar — recurring events are exactly this (the RFC 5545
RRULE,EXDATE, and override standard). - Cron jobs, billing schedules, shift rosters, class timetables — any "repeats on a rule, with the odd exception" data.
- The alarm/scheduler is the in-memory cousin: a rule and a next-fire time instead of a stored series.
Step 1 — The access patterns (functional requirements)
For a schema, "functional requirements" means the queries it must serve. Name them first; the model is whatever makes them cheap.
- Create an event — one-off or recurring (a rule).
- Show a time range — "my events from Monday to Sunday," recurring ones expanded into concrete instances.
- Cancel one occurrence of a series without touching the rest.
- Move/edit one occurrence (different time, title) without touching the rest.
- Edit the whole series going forward.
- Invite attendees and track RSVPs; set reminders.
The two that bite are "show a time range" (recurring events have no rows to scan) and "change one occurrence" (you must not fork the whole series). Everything below serves those.
Step 2 — Non-functional requirements
What separates a toy from Google Calendar is how well it serves those under real data.
- Bounded storage. A "forever" event is one row, not infinite rows. Non-negotiable.
- Fast time-range reads. "Show my week" is the hot query; it must be an indexed lookup plus a little expansion, not a scan.
- Correctness of recurrence and time. Instances land on the right dates, exceptions apply exactly once, and DST never shifts a 9am meeting to 8am.
- Eventual consistency is fine across devices. A reminder or a shared-calendar update lagging a few seconds is acceptable; the event you just created must be visible to you immediately.
Listing them is the easy half; the schema only earns them if it fulfills them:
| Requirement | How this design fulfills it |
|---|---|
| Bounded storage | one row per series — a rule, never materialized instances — Steps 3, 4 |
| Fast time-range reads | index on (calendar_id, start); fetch the few candidate series, then expand — Step 6 |
| Recurrence correctness | expansion walks the rule; EXDATE and override rows apply per-occurrence — Steps 4, 5 |
| Time / DST correctness | store local time + timezone; compute each instance's UTC at expansion — Step 4 |
| Read-your-writes | your own create/edit is in your view immediately; shares/reminders may lag — Step 9 |
Every trade-off below is chosen to keep one of these.
Step 3 — The modeling decision: store the rule
One row holds a whole series: its first occurrence, a duration, and a recurrence rule (in real systems, an RFC 5545 RRULE string like FREQ=WEEKLY;BYDAY=MO). A one-off event simply has a null rule.
package dev.fiveyear.calendar;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.Map;
import java.util.Set;
/** How often a series repeats. `until` is inclusive; null means "forever" (bounded
* on read by the query window, never materialized). */
enum Freq { DAILY, WEEKLY }
record Recurrence(Freq freq, int interval, LocalDate until) {}
/**
* One stored row for a whole series: the first occurrence, a duration, and a rule.
* `cancelled` are EXDATEs (a single deleted instance); `moved` overrides an
* instance's start (a single rescheduled instance). A one-off event has rule null.
*/
record Event(
String id,
LocalDateTime start,
int durationMin,
Recurrence rule,
Set<LocalDateTime> cancelled,
Map<LocalDateTime, LocalDateTime> moved) {}Which datastore — and why it isn't a default. This is a relational schema, and earns it: the hot query is a range over (calendar_id, start), events join to attendees and reminders, and editing one occurrence is a small transactional write to an exceptions table. All relational strengths. The rrule is just a text column; the event_exceptions table carries one row per cancelled or moved occurrence (event_id, original_start, status, new_start). The tempting alternative — materializing every occurrence as a row so reads are "simple" — is the trap the whole article exists to avoid: it makes "forever" unbounded and turns one edit-the-series into a million-row update. Store the rule; keep the table small.
Step 4 — Expansion on read
The series is computed into concrete instances only when a query asks for a window — walk the rule from the start, stop at the window's end (so even a "forever" rule terminates), skip cancellations, and apply moves.
package dev.fiveyear.calendar;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
/**
* Expands a recurring event into concrete instances within a query window. The
* series is stored as ONE row plus a rule; instances are computed on read, never
* materialized — that's what makes "every Monday, forever" storable, and what a
* "show me this week" query actually runs.
*/
public class RecurrenceExpander {
public record Instance(String eventId, LocalDateTime start, LocalDateTime end) {}
public List<Instance> expand(Event e, LocalDateTime from, LocalDateTime to) {
List<Instance> out = new ArrayList<>();
if (e.rule() == null) { // one-off event
addIfOverlaps(out, e, e.start(), from, to);
return out;
}
Recurrence r = e.rule();
LocalDateTime cursor = e.start();
while (!cursor.isAfter(to)) { // window end bounds even a "forever" rule
if (r.until() != null && cursor.toLocalDate().isAfter(r.until())) break;
if (!e.cancelled().contains(cursor)) { // EXDATE: a deleted instance is skipped
LocalDateTime actual = e.moved().getOrDefault(cursor, cursor); // override: a moved instance
addIfOverlaps(out, e, actual, from, to);
}
cursor = step(cursor, r);
}
return out;
}
private void addIfOverlaps(List<Instance> out, Event e, LocalDateTime start,
LocalDateTime from, LocalDateTime to) {
LocalDateTime end = start.plusMinutes(e.durationMin());
if (start.isBefore(to) && end.isAfter(from)) { // [start,end) overlaps [from,to)
out.add(new Instance(e.id(), start, end));
}
}
private LocalDateTime step(LocalDateTime t, Recurrence r) {
return switch (r.freq()) {
case DAILY -> t.plusDays(r.interval());
case WEEKLY -> t.plusWeeks(r.interval());
};
}
}Notice the type: LocalDateTime, not an absolute UTC instant. A weekly "9am"
meeting must stay 9am local across a daylight-saving change — so you store
the local time plus the event's timezone and resolve each instance's UTC
during expansion. Store a fixed UTC instant for the series and every
occurrence after the DST boundary silently slips by an hour. Time is the
second-hardest thing in this schema, right after recurrence itself.
Two correctness notes worth saying out loud: walking from start is fine for a teaching example, but production code jumps straight to the first occurrence on or after the window (arithmetic, not a loop); and the window's end is what makes a "forever" series safe to expand at all.
Step 5 — Exceptions: cancel one, move one
The payoff of storing a rule: changing a single occurrence is a tiny exception row, and the series stays one line.
- Cancel ("no stand-up Friday") → an
EXDATE: a row marking that occurrence deleted. Expansion skips it. - Move/edit ("this week, 2pm not 9am") → an override: a row mapping the original occurrence's start to a new start (and any changed fields). Expansion substitutes it.
- Edit the whole series going forward → split the series at the edit point: end the old rule with an
UNTIL, create a new series from there. (This is whyuntilexists.)
The rule is never rewritten for a one-off change. That's the difference between a calendar that scales and one that forks a million rows every time you drag a meeting.
Step 6 — The time-range read path
"Show my week" can't scan instances that don't exist, so it works in two moves: fetch the candidate series, then expand each in the window.
A series is a candidate for window [from, to) if it could possibly produce an instance there: a one-off whose start falls in range, or a recurring series that began before to and hasn't ended before from (its until is null or ≥ from). The index on (calendar_id, start) makes that a cheap range fetch — typically a handful of series — and only those get expanded. You never touch the whole table, and you never expand a series the window can't contain.
Step 7 — Trade-offs (each one keeping an NFR)
| Decision | The tempting alternative | Why ours wins | Keeps |
|---|---|---|---|
| store the rule, expand on read | materialize every occurrence | "forever" is one row; editing the series is one write | bounded storage |
EXDATE + override rows | fork the series on any change | a single change is a tiny row; the series stays intact | storage / correctness |
| local time + timezone | one fixed UTC instant per series | 9am stays 9am across DST instead of slipping an hour | time correctness |
index (calendar_id, start) | scan and filter by date | the hot "my week" query fetches only candidate series | read latency |
UNTIL-split for "this & future" | edit every future instance | one rule ends, one begins — bounded work | bounded storage |
The complete implementation
The model and expander are the engine. Here's the driver that proves it — a weekly series, a cancellation, a move, an UNTIL bound, and one-off events in and out of the window:
package dev.fiveyear.calendar;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.List;
import java.util.Map;
import java.util.Set;
public class Main {
static final LocalDateTime JUN1 = LocalDateTime.of(2026, 6, 1, 9, 0); // a Monday
static final LocalDateTime FROM = LocalDateTime.of(2026, 6, 1, 0, 0);
static final LocalDateTime TO = LocalDateTime.of(2026, 7, 1, 0, 0);
public static void main(String[] args) {
RecurrenceExpander ex = new RecurrenceExpander();
// weekly Mondays 09:00, 60 min, forever
Event weekly = new Event("w", JUN1, 60, new Recurrence(Freq.WEEKLY, 1, null), Set.of(), Map.of());
List<RecurrenceExpander.Instance> all = ex.expand(weekly, FROM, TO);
assertTrue(all.size() == 5, "5 Mondays in June 2026 (got " + all.size() + ")");
// cancel June 15 (EXDATE)
Event withExdate = new Event("w", JUN1, 60, new Recurrence(Freq.WEEKLY, 1, null),
Set.of(LocalDateTime.of(2026, 6, 15, 9, 0)), Map.of());
List<RecurrenceExpander.Instance> minus = ex.expand(withExdate, FROM, TO);
assertTrue(minus.size() == 4, "cancellation drops one instance");
assertTrue(minus.stream().noneMatch(i -> i.start().getDayOfMonth() == 15), "June 15 is gone");
// move June 8 09:00 -> June 9 14:00 (override)
Event withMove = new Event("w", JUN1, 60, new Recurrence(Freq.WEEKLY, 1, null),
Set.of(), Map.of(LocalDateTime.of(2026, 6, 8, 9, 0), LocalDateTime.of(2026, 6, 9, 14, 0)));
List<RecurrenceExpander.Instance> moved = ex.expand(withMove, FROM, TO);
assertTrue(moved.size() == 5, "a move keeps the count");
assertTrue(moved.stream().anyMatch(i -> i.start().equals(LocalDateTime.of(2026, 6, 9, 14, 0))), "moved instance present");
assertTrue(moved.stream().noneMatch(i -> i.start().equals(LocalDateTime.of(2026, 6, 8, 9, 0))), "original slot vacated");
// UNTIL June 16 -> only Jun 1, 8, 15
Event bounded = new Event("w", JUN1, 60,
new Recurrence(Freq.WEEKLY, 1, LocalDate.of(2026, 6, 16)), Set.of(), Map.of());
assertTrue(ex.expand(bounded, FROM, TO).size() == 3, "UNTIL caps the series at 3");
// one-off event inside and outside the window
Event once = new Event("o", LocalDateTime.of(2026, 6, 10, 12, 0), 30, null, Set.of(), Map.of());
assertTrue(ex.expand(once, FROM, TO).size() == 1, "one-off inside the window");
Event onceOut = new Event("o", LocalDateTime.of(2026, 8, 1, 12, 0), 30, null, Set.of(), Map.of());
assertTrue(ex.expand(onceOut, FROM, TO).isEmpty(), "one-off outside the window excluded");
System.out.println("ALL CALENDAR ASSERTIONS PASSED");
}
static void assertTrue(boolean c, String m) { if (!c) throw new AssertionError(m); }
}Step 8 — Scaling the design, one bottleneck at a time
- Reads dominate ("show my week", every device, often) → cache a user's expanded window and add read replicas; the rule rows are small and cache well.
- Per-user data grows → shard by
calendar_id/ user; every query is already scoped to a calendar, so a user's events live on one shard — no cross-shard expansion. - Expansion cost → it's CPU on read, so cache expanded windows and, for very long ranges, cap how far a "forever" rule expands per request (paginate by week/month).
- Reminders / notifications → a separate scheduled-job system reads the next occurrences; it doesn't belong on the read path.
The headline: this scales by sharding per calendar and caching expansions, never by materializing instances.
Step 9 — When data goes wrong
A schema's "failures" are the ways its data drifts, sorted like any dependency: an optimization that can go stale, a truth that must hold, and a race to close.
- A cached expansion goes stale (an event changed after you cached the week). It's an optimization — invalidate the user's window on any write to their calendar, and treat a brief staleness as acceptable. Never block a read on recomputation.
- An exception points at an occurrence the rule no longer produces (someone edited the series so the original slot is gone). The truth must stay consistent: an override or
EXDATEwhoseoriginal_startisn't a valid occurrence is simply ignored on expansion — it can't corrupt the series. - Two edits race on the same series (cancel an occurrence while moving it). Exceptions are keyed by
(event_id, original_start), so the writes target the same row and the database serializes them — last writer wins on that one occurrence, the rest of the series untouched. - A timezone database update shifts a historical instant → store wall-clock local time, not a baked offset, so the rule re-resolves correctly under the new rules instead of freezing a wrong UTC.
The interview corner
- "How do you store a recurring event?" One row with a recurrence rule (
RRULE), never materialized instances. Expand on read within the query window. - "How do you cancel or move one occurrence?" An exception row — an
EXDATEto delete, an override to move — keyed by the occurrence's original start. The series stays one row. - "How does 'show my week' work?" Index
(calendar_id, start), fetch candidate series (one-offs in range + recurring not yet ended), expand each in the window, merge. - "What about daylight saving?" Store local time + timezone and resolve UTC per instance at expansion. A fixed UTC instant for the series drifts an hour across DST.
- "Edit this and all future events?" Split the series:
UNTILthe old rule at the edit point, start a new rule from there. Bounded work, not a mass update.
Where to go from here
- New to data-modelling deep dives? See the Reddit comments schema for trees, and the rookie's guide to HLD for the method.
- The scheduling cousin in memory is the alarm / scheduler service — a rule and a next-fire time instead of a stored series.