Logging Library LLD: Build log4j's Skeleton in One Sitting

A low-level design walkthrough of a logging library: ordered levels, a cheap threshold gate, formatters, appenders as a Strategy, and the complete implementation of the log4j shape.

By fiveyearsdev · June 12, 2026 · 10 min read

"Design a logging library — something like log4j." This question is beloved because every engineer has used a logger thousands of times, and almost nobody has thought about why it's shaped the way it is. Why are levels ordered instead of being independent flags? Why is the formatter a separate thing from the appender? Why is a disabled log statement nearly free?

Phase 2 of the queue starts here — we're leaving the games shelf for the systems-and-libraries shelf, where the user of your design is another programmer. That changes the grading: the API itself is now the product.

Let's start nowhere near a computer

Think of a newsroom tip line. Calls come in all day — most are trivia, some are leads, a few are emergencies. The desk runs three separate decisions, in a fixed order: the screener decides if a call is even worth attention today ("we're only taking leads and up"); the writer turns an accepted call into a standard memo — time, caller, urgency, message; the dispatcher copies that memo to whoever subscribed — the editor's tray, the archive, the night pager.

Screen → write → dispatch. Three different jobs, three different reasons to change. The newsroom never lets the dispatcher decide importance, and never lets the screener write memos. Hold that separation — it's the whole library.

You use this design every day

Every framework you've touched — log4j, slf4j, JUL, Winston, Python's logging — is this exact pipeline with different names.
The level question is everywhere: alert severities, alarm states, HTTP error classes — ordered severities beat boolean flags every time.
The appender idea is the Strategy pattern's most-deployed instance on Earth: one interface, a dozen destinations.

Step 1 — Functional requirements (sentences first)

What the library must do, as plain sentences — the functional requirements.

A logger accepts messages at a level: debug, info, warn, or error.
Messages below the configured threshold are dropped — as cheaply as possible.
Accepted messages are formatted into a line: timestamp, level, logger name, message.
The line is delivered to every configured destination: console, file, wherever.
Destinations, levels, and formats can change without touching calling code.

That second sentence is a deliberate stance, not a footnote. The whole reason a logger is shaped this way is that the cheapest thing it does — deciding a line isn't worth keeping — is the thing it does most. Design the disabled path first; everything else is downstream of it.

Step 2 — Non-functional requirements

At class level, the non-functional requirements answer a different question from the functional ones: how well, not just what. For a logging library they are the design:

Low overhead on the caller. A suppressed-level log must be nearly free, and even an accepted one must not block the application thread — this is the point of the whole exercise. A logger that makes your business logic slow has failed at its one job.
Thread-safety. Many threads log concurrently into the same shared appenders; the design must never interleave half-lines or corrupt a destination.
Ordering & throughput. Lines should land in roughly the order they were emitted, and the expensive part — I/O — should batch off the hot path rather than stall each caller in turn.
Extensibility. New appenders, formatters, and even levels plug in through interfaces — the open/closed seam that lets a destination change without a recompile of calling code.

Listing them is the easy half; the design earns them only if it fulfills them. Here's the contract: each requirement and the mechanism that keeps it, plus one the list above leaves implicit, resilience when an appender throws:

Requirement	How this design fulfills it
Low overhead on caller	the ordered-level gate returns on one `int` compare before any formatting or I/O — Step 3
Thread-safety	appenders are the only shared state; each owns its own synchronization, the Logger adds no locks — Step 5
Ordering / throughput	the format-once pipeline, then an async appender that batches I/O off the caller's thread — Steps 4, 6
Resilience (appender throws)	the fan-out loop isolates each appender, so one bad destination can't kill logging — Step 5
Extensibility	`Formatter` and `Appender` are interfaces; new strategies drop in unseen by callers — Step 5

Every trade-off below is chosen to keep one of these.

Step 3 — Levels are an order, not a set

The rookie models levels as independent booleans (debugEnabled, warnEnabled…). But severity is inherently ordered — anyone who wants warnings wants errors too. An enum's declaration order does all the work:

Level.java

public enum Level { DEBUG, INFO, WARN, ERROR }   // declaration order = severity order

the gate

if (level.ordinal() < threshold.ordinal()) {
    return;        // dropped: one integer comparison, no formatting, no I/O
}

That early return is the most important line in the library. A log.debug(…) in a hot loop with threshold INFO costs one comparison — which is why teams can leave diagnostic logging in production code. Formatting before gating is the classic performance bug this question exists to catch.

Which structure, and why. Severity is an ordered enum, not a set of independent booleans, because a single compare on ordinal() is what delivers the low-overhead NFR: a message passes only when level.ordinal() >= threshold.ordinal(), so one integer comparison suppresses a line before any allocation, formatting, or I/O happens. Booleans would force the caller to know which flags to check and couldn't express "warn and everything worse" in a single test; the order encodes the severity relationship for free. The price is that declaration order becomes part of the contract: reordering the constants silently changes what the gate lets through. The choice isn't aesthetic. It's the mechanism that makes a disabled log nearly free.
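A minimal sketch of that "threshold and everything worse" test, assuming the same four levels. The names `OrderedGateSketch`, `isEnabled`, and `passing` are illustrative and not part of the article's implementation:

```java
import java.util.ArrayList;
import java.util.List;

final class OrderedGateSketch {

    enum Level { DEBUG, INFO, WARN, ERROR }   // declaration order = severity order

    /** True when a message at {@code level} passes a logger set to {@code threshold}. */
    static boolean isEnabled(Level level, Level threshold) {
        return level.ordinal() >= threshold.ordinal();   // one int compare, nothing allocated
    }

    /** Every level a given threshold lets through, in severity order. */
    static List<Level> passing(Level threshold) {
        List<Level> out = new ArrayList<>();
        for (Level level : Level.values()) {
            if (isEnabled(level, threshold)) {
                out.add(level);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(passing(Level.WARN));    // [WARN, ERROR]
        System.out.println(passing(Level.DEBUG));   // [DEBUG, INFO, WARN, ERROR]
    }
}
```

With boolean flags, the same question would need one check per flag and a convention for keeping them consistent; here the order answers it.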

The follow-up interviewers love: log.debug("state: " + bigObject) still pays for the string concatenation before the gate runs. That's why real libraries offer log.debug("state: {}", bigObject) — lazy formatting — and why slf4j's placeholder API exists at all. Name it.
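One way to show the difference is a supplier overload, sketched below under stated assumptions: `LazyLogger` is a hypothetical stand-in that only gates and prints, and the counter exists purely to make "the message was never built" observable.

```java
import java.util.function.Supplier;

final class LazyLoggingSketch {

    enum Level { DEBUG, INFO, WARN, ERROR }

    /** Hypothetical mini-logger: only the gate and a supplier overload, printing to stdout. */
    static final class LazyLogger {
        private final Level threshold;

        LazyLogger(Level threshold) {
            this.threshold = threshold;
        }

        /** Eager overload: the caller has already built the string before we get here. */
        void debug(String message) {
            if (Level.DEBUG.ordinal() < threshold.ordinal()) {
                return;
            }
            System.out.println("[DEBUG] " + message);
        }

        /** Lazy overload: the supplier is invoked only after the gate passes. */
        void debug(Supplier<String> message) {
            if (Level.DEBUG.ordinal() < threshold.ordinal()) {
                return;                      // supplier never runs, so nothing is concatenated
            }
            System.out.println("[DEBUG] " + message.get());
        }
    }

    public static void main(String[] args) {
        int[] builds = {0};                  // counts how often the message was actually built
        LazyLogger log = new LazyLogger(Level.INFO);
        log.debug(() -> { builds[0]++; return "state: " + System.nanoTime(); });
        System.out.println("message built " + builds[0] + " time(s)");   // message built 0 time(s)
    }
}
```

The placeholder style (`"state: {}"`) reaches the same goal by deferring the formatting; the supplier style defers the whole expression.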

Step 4 — The pipeline: gate, format, fan out

Three stages, mapped straight from the newsroom:

Logger.java (the pipeline)

public void log(Level level, String message) {
    if (level.ordinal() < threshold.ordinal()) {
        return;                                          // 1. screen (cheap!)
    }
    LogEvent event = new LogEvent(clock.instant(), level, name, message);
    String line = formatter.format(event);               // 2. write the memo — ONCE
    for (Appender appender : appenders) {
        appender.append(line);                           // 3. dispatch to every tray
    }
}

Two structural decisions worth narrating. The event is a record, not a string — formatting is somebody else's job, so the raw facts travel intact. And the line is formatted once, outside the appender loop — three destinations don't pay for three formats.
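The format-once claim is easy to make checkable. The sketch below assumes a simplified event (level as a plain string) and a hypothetical `CountingFormatter` that records how often it is called; three destinations should cost exactly one format.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class FormatOnceSketch {

    record LogEvent(Instant timestamp, String level, String logger, String message) {}

    interface Formatter { String format(LogEvent event); }
    interface Appender { void append(String line); }

    /** Formatter that counts how many times it is asked to do work. */
    static final class CountingFormatter implements Formatter {
        int calls = 0;

        @Override
        public String format(LogEvent e) {
            calls++;
            return "[" + e.level() + "] " + e.logger() + " " + e.message();
        }
    }

    /** Fan-out as in the pipeline: format outside the loop, then hand the same line to each appender. */
    static void dispatch(LogEvent event, Formatter formatter, List<Appender> appenders) {
        String line = formatter.format(event);
        for (Appender appender : appenders) {
            appender.append(line);
        }
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>();
        List<String> b = new ArrayList<>();
        List<String> c = new ArrayList<>();
        List<Appender> sinks = List.of(a::add, b::add, c::add);   // three in-memory destinations
        CountingFormatter formatter = new CountingFormatter();
        LogEvent event = new LogEvent(Instant.EPOCH, "INFO", "checkout", "order 4012 placed");

        dispatch(event, formatter, sinks);

        System.out.println(formatter.calls);   // 1
        System.out.println(c.get(0));          // [INFO] checkout order 4012 placed
    }
}
```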

And there's our old friend from the games series: the clock is injected. A library that calls Instant.now() internally can never have its output asserted in a test.
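To see what the injected clock buys, here is a small sketch using `Clock.fixed` from `java.time`. `TinyLogger` is a hypothetical cut-down logger, not the article's `Logger`; it keeps only the part that reads the time.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

final class FixedClockSketch {

    /** Stripped-down logger: just enough to show the timestamp coming from an injected Clock. */
    static final class TinyLogger {
        private final Clock clock;
        private final List<String> sink;

        TinyLogger(Clock clock, List<String> sink) {
            this.clock = clock;
            this.sink = sink;
        }

        void info(String message) {
            sink.add(clock.instant() + " [INFO] " + message);   // never Instant.now()
        }
    }

    public static void main(String[] args) {
        Clock frozen = Clock.fixed(Instant.parse("2026-06-12T10:00:00Z"), ZoneOffset.UTC);
        List<String> lines = new ArrayList<>();

        new TinyLogger(frozen, lines).info("order 4012 placed");

        System.out.println(lines.get(0));   // 2026-06-12T10:00:00Z [INFO] order 4012 placed
    }
}
```

With a frozen clock the whole line, timestamp included, can be compared with `equals` instead of a regex.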

Step 5 — Appenders: the Strategy that pays your salary

Where do log lines go? Console in development, files in production, a socket to a collector, memory in tests. The destination varies; the cheat sheet says Strategy:

Appender.java + two strategies

public interface Appender {
    void append(String line);
}
 
public final class ConsoleAppender implements Appender {
    @Override
    public void append(String line) {
        System.out.println(line);
    }
}
 
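// MemoryAppender.java
import java.util.ArrayList;
import java.util.List;
 
/** The unsung hero: how logging libraries test THEMSELVES. */
public final class MemoryAppender implements Appender {
    private final List<String> lines = new ArrayList<>();
 
    @Override
    public synchronized void append(String line) {   // guards its own list: many threads may log at once
        lines.add(line);
    }
 
    public synchronized List<String> lines() {
        return List.copyOf(lines);                    // snapshot, so tests can't mutate appender state
    }
}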

The MemoryAppender is the design's quiet proof: because destinations hide behind an interface, the test double is about a dozen lines, and suddenly every claim in this article is assertable.
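The same interface admits the production destination just as easily. The sketch below is one possible file strategy, assuming flush-per-line for simplicity; `FileAppender` here is a hypothetical class written for illustration, not log4j's.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FileAppenderSketch {

    interface Appender { void append(String line); }

    /** Hypothetical file destination: owns its writer and serializes writes itself. */
    static final class FileAppender implements Appender, AutoCloseable {
        private final BufferedWriter writer;

        FileAppender(Path path) throws IOException {
            this.writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }

        @Override
        public synchronized void append(String line) {
            try {
                writer.write(line);
                writer.newLine();
                writer.flush();                      // simple and safe; batching is a later concern
            } catch (IOException e) {
                throw new UncheckedIOException(e);   // the interface declares no checked exceptions
            }
        }

        @Override
        public synchronized void close() throws IOException {
            writer.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("app", ".log");
        try (FileAppender appender = new FileAppender(file)) {
            appender.append("order 4012 placed");
        }
        System.out.println(Files.readAllLines(file));   // [order 4012 placed]
    }
}
```

Nothing in the calling code changes when this replaces the console: the Logger still sees only `append(String)`.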

Two NFRs converge on this loop. The appenders are the only shared mutable surface in the library — the Logger itself holds no other lock-worthy state — so thread-safety reduces to "each appender guards its own destination" (a ConsoleAppender synchronizing its writes, a file appender owning its stream); many threads can fan a line out concurrently without the Logger inventing a lock. And resilience: an appender that throws — a full disk, a closed socket — must not take down logging for every other destination, nor crash the calling business thread. The fan-out loop isolates each append call (catch, count, carry on), exactly the way the worker loop in the thread pool wraps each task so one failure can't shrink the crew.
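Both promises fit in a few lines. This is a sketch under assumptions: `FanOut` stands in for the loop inside the Logger, and the failure counter is a plain `AtomicLong` rather than any particular metrics API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

final class IsolatedFanOutSketch {

    interface Appender { void append(String line); }

    /** Fan-out that survives a failing destination: catch, count, carry on. */
    static final class FanOut {
        private final List<Appender> appenders;
        private final AtomicLong failures = new AtomicLong();

        FanOut(List<Appender> appenders) {
            this.appenders = List.copyOf(appenders);   // immutable after construction: no lock needed
        }

        void dispatch(String line) {
            for (Appender appender : appenders) {
                try {
                    appender.append(line);
                } catch (RuntimeException e) {
                    failures.incrementAndGet();        // visible, but never rethrown to the caller
                }
            }
        }

        long failures() {
            return failures.get();
        }
    }

    /** An appender that guards its own destination, so the fan-out adds no lock. */
    static final class SynchronizedMemoryAppender implements Appender {
        private final List<String> lines = new ArrayList<>();

        @Override
        public synchronized void append(String line) {
            lines.add(line);
        }

        synchronized List<String> lines() {
            return List.copyOf(lines);
        }
    }

    public static void main(String[] args) {
        Appender broken = line -> { throw new IllegalStateException("disk full"); };
        SynchronizedMemoryAppender healthy = new SynchronizedMemoryAppender();
        FanOut fanOut = new FanOut(List.of(broken, healthy));

        fanOut.dispatch("payment gateway timed out");

        System.out.println(healthy.lines());     // [payment gateway timed out]
        System.out.println(fanOut.failures());   // 1
    }
}
```

The broken appender sits first in the list on purpose: the healthy one still receives the line.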

The senior follow-up: appenders do I/O, and I/O on the caller's thread means a slow disk makes your business logic slow. Real libraries wrap appenders in a queue + background writer (an AsyncAppender — the producer-consumer pattern, verbatim). Offer it as the v2; don't build it unasked.

Step 6 — The async appender: a bounded queue that drops

Here's where the low-overhead NFR stops being a one-line gate and becomes a structural decision. A ConsoleAppender or file appender does I/O on the calling thread — and a slow disk or a blocked socket then makes your business logic slow. The fix is the producer-consumer shape: the AsyncAppender is itself an Appender (it plugs into the same chain — that's the extensibility seam paying off) that does nothing but hand the line to a bounded queue and return immediately. One background thread drains the queue and does the real I/O, batching it off the hot path. The caller pays an enqueue and walks away; the ordering & throughput NFR is kept because a single drainer writes in FIFO order while the application threads never wait on the disk.

The load-bearing decision is what happens when the queue is full. Two policies, and they are opposites:

Block the caller until a slot frees — which reintroduces the exact stall we built the async appender to avoid. Rejected: it sacrifices the one NFR that justifies the whole component.
Drop on overflow — the line is discarded and a droppedCount counter ticks up, so the loss is visible (emit it as its own log line when the queue drains). The application thread never blocks.

We drop. This is the deliberate contrast with the thread pool, which faces the same bounded queue and chooses the opposite policy — it rejects loudly, because a lost task is a lost unit of work the caller is responsible for. A lost log line is survivable; a logger that freezes your request thread to guarantee delivery has its priorities backwards. Same bounded queue, opposite overload policy — and "what do we sacrifice under load, fidelity or honesty?" is answered per use case. We choose to sacrifice fidelity; the thread pool refuses to.
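A minimal sketch of the drop policy, assuming an `ArrayBlockingQueue` and a single daemon drainer thread. The class and method names (`AsyncAppender`, `droppedCount`, `shutdown`) are illustrative; the "dropped N lines" summary line is left out to keep the example short.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class AsyncAppenderSketch {

    interface Appender { void append(String line); }

    static final class AsyncAppender implements Appender {
        private final BlockingQueue<String> queue;
        private final Appender delegate;
        private final AtomicLong dropped = new AtomicLong();
        private final Thread drainer;
        private volatile boolean running = true;

        AsyncAppender(Appender delegate, int capacity) {
            this.delegate = delegate;
            this.queue = new ArrayBlockingQueue<>(capacity);
            this.drainer = new Thread(this::drain, "log-drainer");
            this.drainer.setDaemon(true);   // never keeps the JVM alive on its own
            this.drainer.start();           // last statement: every field is assigned by now
        }

        @Override
        public void append(String line) {
            if (!queue.offer(line)) {       // offer() never blocks; false means the queue is full
                dropped.incrementAndGet();  // the loss is counted, not silent
            }
        }

        private void drain() {
            try {
                while (running || !queue.isEmpty()) {
                    String line = queue.poll(50, TimeUnit.MILLISECONDS);
                    if (line == null) {
                        continue;           // timed out: loop back and re-check the running flag
                    }
                    try {
                        delegate.append(line);
                    } catch (RuntimeException e) {
                        // isolate the failure: one bad write must not end the drainer
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        long droppedCount() {
            return dropped.get();
        }

        /** Stops accepting work as "running", lets the drainer empty the queue, then waits for it. */
        void shutdown() throws InterruptedException {
            running = false;
            drainer.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        List<String> written = new CopyOnWriteArrayList<>();
        Appender slowDisk = line -> {       // simulates a destination that stalls on its first write
            entered.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            written.add(line);
        };

        AsyncAppender async = new AsyncAppender(slowDisk, 2);
        async.append("a");
        entered.await();                    // drainer now holds "a" and is stuck on the slow disk
        async.append("b");
        async.append("c");                  // queue (capacity 2) is now full
        async.append("d");                  // dropped
        async.append("e");                  // dropped
        release.countDown();
        async.shutdown();

        System.out.println(written);                // [a, b, c]
        System.out.println(async.droppedCount());   // 2
    }
}
```

The caller's five `append` calls all return immediately even though the destination is stalled, which is the property the policy exists to protect.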

Three failures to name before the interviewer does:

Queue full → drop + count. Don't drop silently. The dropped-count counter (and a single "dropped N lines" line when pressure eases) turns invisible loss into an observable signal — the difference between a library you can operate and one you can't.
An appender that throws must not kill logging. The drainer wraps each append in a try/catch (Step 5's isolation, now on the background thread); a closed file handle costs one swallowed exception, not a dead logging subsystem.
Formatting a null or huge message. A null message formats to a literal "null" rather than throwing inside the gate; a multi-megabyte message is the caller's mistake, but the library shouldn't compound it — truncation at a sane cap is the defensible default, named even if not built.
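The third failure can be handled by a small guard applied after the gate and before the event is built. This is a sketch only: `sanitize` and the 8,192-character cap are assumptions chosen for illustration, not values from any real library.

```java
final class MessageGuardSketch {

    static final int MAX_CHARS = 8_192;   // hypothetical cap; pick one that fits your log pipeline

    /** Makes a message safe to format: null becomes "null", oversize text is cut and marked. */
    static String sanitize(String message, int maxChars) {
        if (message == null) {
            return "null";
        }
        if (message.length() <= maxChars) {
            return message;
        }
        int omitted = message.length() - maxChars;
        // Note: a cut at an arbitrary index can split a surrogate pair; acceptable for a sketch.
        return message.substring(0, maxChars) + "... [truncated " + omitted + " chars]";
    }

    public static void main(String[] args) {
        System.out.println(sanitize(null, MAX_CHARS));    // null
        System.out.println(sanitize("abcdefghij", 4));    // abcd... [truncated 6 chars]
    }
}
```

Marking the cut with the omitted length keeps the truncation itself observable, in the same spirit as the dropped-line counter.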

The complete implementation

Level.java + LogEvent.java + interfaces

package dev.fiveyear.logging;
 
import java.time.Instant;
 
public enum Level { DEBUG, INFO, WARN, ERROR }
 
public record LogEvent(Instant timestamp, Level level, String logger, String message) {}
 
public interface Formatter {
    String format(LogEvent event);
}
 
public interface Appender {
    void append(String line);
}

LineFormatter.java

package dev.fiveyear.logging;
 
public final class LineFormatter implements Formatter {
 
    @Override
    public String format(LogEvent e) {
        return e.timestamp() + " [" + e.level() + "] " + e.logger() + " — " + e.message();
    }
}

Logger.java

package dev.fiveyear.logging;
 
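import java.time.Clock;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
 
public final class Logger {
 
    private final String name;
    private final Level threshold;
    private final Formatter formatter;
    private final List<Appender> appenders;
    private final Clock clock;
    private final AtomicLong appendFailures = new AtomicLong();   // Step 5: catch, count, carry on
 
    public Logger(String name, Level threshold, Formatter formatter,
                  List<Appender> appenders, Clock clock) {
        this.name = name;
        this.threshold = threshold;
        this.formatter = formatter;
        this.appenders = List.copyOf(appenders);
        this.clock = clock;
    }
 
    public void log(Level level, String message) {
        if (level.ordinal() < threshold.ordinal()) {
            return;                                   // gate first: no allocation, no format, no I/O
        }
        LogEvent event = new LogEvent(clock.instant(), level, name, message);
        String line = formatter.format(event);        // formatted once for every destination
        for (Appender appender : appenders) {
            try {
                appender.append(line);
            } catch (RuntimeException e) {
                appendFailures.incrementAndGet();     // one bad destination must not stop the rest
            }
        }
    }
 
    public long appendFailures() { return appendFailures.get(); }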
 
    public void debug(String message) { log(Level.DEBUG, message); }
    public void info(String message)  { log(Level.INFO, message); }
    public void warn(String message)  { log(Level.WARN, message); }
    public void error(String message) { log(Level.ERROR, message); }
}

The newsroom in action:

Demo.java

MemoryAppender memory = new MemoryAppender();
Logger log = new Logger("checkout", Level.INFO, new LineFormatter(),
        List.of(new ConsoleAppender(), memory), Clock.systemUTC());
 
log.debug("cart recalculated");   // below threshold: one comparison, then gone
log.info("order 4012 placed");    // formatted once, delivered twice
log.error("payment gateway timed out");
 
// memory.lines() now holds two lines; swap Clock.systemUTC() for Clock.fixed(...)
// and they are exactly assertable, because the clock and the destination are both injected

Step 7 — Trade-offs (each one keeping an NFR)

The last column is the discipline: every choice keeps one of the promises from Step 2 — that's what designing to the non-functional requirements looks like.

Decision	The tempting alternative	Why ours wins	Keeps
ordered enum + `>=` gate	independent boolean flags	a suppressed line costs one `int` compare — no allocation, no format	low overhead on caller
format once, outside the loop	format per appender	three destinations don't pay for three formats; the line is built once	ordering / throughput
async appender, drop on overflow	block the caller until a slot frees	the application thread never stalls on a slow disk — drops are counted, not silent	low overhead on caller
isolate each `append` in try/catch	let appender exceptions propagate	one broken destination can't kill logging or crash the business thread	resilience
`Appender` / `Formatter` interfaces	a switch over destination types	new sinks and formats drop in without touching the Logger or its callers	extensibility

The interview corner

Clarify before you code: Is asynchronous delivery acceptable (almost always yes)? Text lines or structured JSON? Do different packages need different thresholds?

The follow-up ladder:

"Configure per package." A logger hierarchy: app.checkout walks up the dotted name to inherit app's threshold — the heart of real log4j configuration, one prefix walk.
"The message is expensive to build." Lazy logging: an overload taking a supplier of the message — the lambda runs only past the gate. This is the slf4j question.
"Don't block my requests on disk." The AsyncAppender: a bounded queue plus one writer thread — then answer the question that actually matters: queue full → drop DEBUG, or block callers? Losing logs vs slowing the app is a business choice; name both sides.
"Files grow forever." Rolling policies (by size, by date) are a Strategy wrapped around the FileAppender — rotation never touches the Logger.
"Trace one request across a thousand lines." MDC: a per-thread context map (request id, user id) merged into every event; the formatter prints it. With it, grep becomes a debugger.

Mistakes that fail the round: formatting before the gate; treating appender I/O as free; one global mutable configuration that tests fight over.
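The first rung of the ladder, per-package configuration, could be sketched as a prefix walk over dotted names. `ThresholdConfig` and `resolve` are hypothetical names chosen for this example, not log4j's API, and the config is an ordinary injectable object rather than global state, which also sidesteps the last mistake above.

```java
import java.util.HashMap;
import java.util.Map;

final class LoggerHierarchySketch {

    enum Level { DEBUG, INFO, WARN, ERROR }

    /** Hypothetical config: thresholds keyed by dotted logger name, with a root default. */
    static final class ThresholdConfig {
        private final Map<String, Level> thresholds = new HashMap<>();
        private final Level rootLevel;

        ThresholdConfig(Level rootLevel) {
            this.rootLevel = rootLevel;
        }

        void set(String loggerName, Level level) {
            thresholds.put(loggerName, level);
        }

        /** Walks "app.checkout.cart" -> "app.checkout" -> "app" -> root until a threshold is found. */
        Level resolve(String loggerName) {
            String name = loggerName;
            while (true) {
                Level level = thresholds.get(name);
                if (level != null) {
                    return level;
                }
                int dot = name.lastIndexOf('.');
                if (dot < 0) {
                    return rootLevel;            // no ancestor configured: fall back to the root
                }
                name = name.substring(0, dot);   // drop the last segment and try the parent
            }
        }
    }

    public static void main(String[] args) {
        ThresholdConfig config = new ThresholdConfig(Level.WARN);
        config.set("app", Level.INFO);
        config.set("app.checkout", Level.DEBUG);

        System.out.println(config.resolve("app.checkout.cart"));   // DEBUG
        System.out.println(config.resolve("app.billing"));         // INFO
        System.out.println(config.resolve("lib.http"));            // WARN
    }
}
```

A reasonable design choice is to run this walk when a logger is created or reconfigured and store the result, so the per-call gate stays a single integer compare.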

Where to go from here

Pocket version: levels are an order, the gate runs before any work, the event is data, the format happens once, and destinations are strategies — with the test double free.

Add a logger hierarchy — app.checkout inheriting app's threshold; it's a name-prefix walk and the heart of real log4j configuration.
Build the AsyncAppender — a BlockingQueue and one consumer thread, straight from the multithreading article's producer-consumer.
Next in the queue: the JSON parser, where the library being designed has to understand text rather than emit it.

Next time you write log.info(…) without a thought, you'll see the newsroom behind it: a screener, a writer, a row of trays — and one integer comparison standing guard over your latency.
