DSAbeginner

The Trie: How Autocomplete Finishes Your Sentences

A friendly, diagram-first tour of the trie (prefix tree), the data structure behind autocomplete: insert, search, and prefix lookups step by step, plus a complete implementation.

By fiveyearsdevJune 11, 202610 min read

You open Google, type how to ma, and before your finger reaches the next key it's already offering how to make paneer. Your phone keyboard does the same trick. So does your contacts app when you type two letters and your friend's name jumps to the top.

That trick has a name. It's a data structure called a trie (also called a prefix tree), and it's one of the friendliest structures you'll ever build. By the end of this article, you and I will have built one together, step by step — and you'll know exactly why autocomplete feels instant.

The name comes from retrieval, so technically it's pronounced "tree" — but almost everyone says "try" so it doesn't get confused with a regular tree. Say "try" in your interview and nobody will blink.

Let's start nowhere near a computer

Imagine I hand you a printed dictionary — about 170,000 words — and ask you to check whether "cattle" is a real word.

You don't start at page 1 and read every word. You'd be there all week. Instead, you flip to the C section, then narrow to CA, then CAT, and suddenly you're choosing between a handful of words instead of 170,000.

All ~170,000 English words
└── words starting with 'c'      → ~10,000 left
    └── words starting with 'ca'  → ~1,000 left
        └── words starting with 'cat' → a handful left

Notice what you did: every letter you read threw away almost everything that couldn't match. You never paid attention to the words starting with 'b' or 'd' — they simply stopped existing for you.

Here's another place you already use this idea. Think about how a parcel reaches your home:

India ── Karnataka ── Bengaluru ── MG Road ──┬── House 12
                                             └── House 14

The courier doesn't plan a separate route from scratch for house 12 and house 14. Both parcels share the same road for as long as their addresses agree, and split only at the very last step.

That's the entire trie in one sentence: words that share a beginning share a path. You can see it below: store cat, car, can, and dog, and the three c-words all ride the same ca spine before they split — just like the parcels sharing a road until the last turn.

Everything else is just code.

The same trick, all over your day

Once you know the shape, you'll spot it everywhere — the search bar walking the letters you've typed and showing what hangs below; your phone keyboard offering tomorrow after tom; your IDE turning str into String/StringBuilder/strip(); the spell-checker's red squiggle, which appears the instant your letters walk off the edge of the trie (no path for them = not a word).

But the most illuminating cousin uses no letters at all. Internet routers forward every packet by longest-prefix match on the destination IP — a trie over the address's bits. A routing table says, in effect, "anything under 192.168.* exits left, but 192.168.5.* specifically exits right." The router walks the address bit by bit, and the deepest matching node wins, because a longer prefix is a more specific route.

It's the same descend-by-pieces idea with two twists. The alphabet shrinks to {0, 1}, and the question changes from "is this word here?" to "what's the longest stored prefix of this?" Those twists are why routers use a compressed (radix/Patricia) trie that collapses single-child chains, so a 32-bit address doesn't cost 32 hops. Same skeleton, different alphabet, a richer query — that one generalization, trie-over-bits with longest-match-wins, teaches you more than another five "you've seen this" examples ever could.

What a trie actually looks like

Let's store five words: cat, car, card, care, dog. Drawn as a trie, they look like this (a * means "a word ends here"):

        root
       /    \
      c      d
      │      │
      a      o
     / \     │
    t*  r*   g*
       / \
      d*  e*

Read any path from the root downward and you spell a word: root → c → a → t spells cat. Root → c → a → r → e spells care.

Two things to notice, because they're the heart of the whole structure:

Shared beginnings are stored once. Cat, car, card, care all share c → a, so those letters exist exactly once — one road, four destinations.
The stars matter. The path c → a → r has a star, so car is a word. The path c → a has no star — ca is a prefix you pass through, not a word anyone inserted.

Here's the mental shift that makes tries click: a node never stores its own letter. The letters live on the edges — the choices you make on the way down. A node only knows two things: where you can go next, and whether a word ends at this exact spot.

Let's build one, step by step

Time to turn the picture into code. We'll build it in small pieces, and at the end I'll hand you the complete, assembled class.

Step 1: the node

Every box in our diagram needs exactly two things: its outgoing edges, and a flag for the star.

┌──────────────────────────────────┐
│             Node                 │
├──────────────────────────────────┤
│ children: [a][b][c] ... [z]      │ ← 26 slots, one per letter
│ isEndOfWord: true / false        │ ← is there a star here?
└──────────────────────────────────┘

We'll keep things simple and assume lowercase English letters, so an array of 26 child slots does the job. Slot 0 is 'a', slot 1 is 'b', and so on — ch - 'a' converts a letter to its slot.

Trie.java (the node)

private static final class Node {
    private final Node[] children = new Node[26];
    private boolean isEndOfWord;
}

That's genuinely the whole node. A trie is thousands of these little boxes holding hands.

Step 2: insert — lay the road as you walk

To insert a word, start at the root and walk one letter at a time. If the road you need already exists, follow it. If it doesn't, build it and then follow it. When you place the last letter, plant a star.

Watch insert("cat") happen on an empty trie:

start        after 'c'    after 'a'    after 't'
─────        ─────────    ─────────    ─────────
root         root         root         root
              │            │            │
              c            c            c
                           │            │
                           a            a
                                        │
                                        t*  ← star the last node

Now insert "car" into the same trie — and watch how little work it takes:

walking 'c'... already exists, just follow it
walking 'a'... already exists, just follow it
walking 'r'... missing! create it, then star it
 
        root
         │
         c
         │
         a          ← "car" reused this entire path for free
        / \
       t*  r*

The first two letters cost us nothing — the road was already built when cat moved in. That's the courier's shared road, in code:

Trie.java (insert)

public void insert(String word) {
    Node current = root;
    for (char ch : word.toCharArray()) {
        int index = ch - 'a';
        if (current.children[index] == null) {
            current.children[index] = new Node();  // build the missing road
        }
        current = current.children[index];          // ...and walk down it
    }
    current.isEndOfWord = true;                     // plant the star
}

Step 3: search — walk and check the star

Searching is the same walk, but stricter. There are exactly two ways to fail, and telling them apart is what interviewers love to probe:

search("card")   root → c → a → r → d     'd' has a star   → FOUND ✓
search("ca")     root → c → a             no star at 'a'   → not a word ✗
search("cow")    root → c → o?            'c' has no 'o'   → path breaks ✗

If the path breaks mid-walk, the word was never inserted.
If the path survives but there's no star at the end, you've found a prefix, not a word. Ca is on the way to cat, but nobody ever inserted ca.

Trie.java (search)

public boolean search(String word) {
    Node node = walk(word);
    return node != null && node.isEndOfWord;  // survived AND starred
}
 
// Follows the path letter by letter; returns null if it breaks.
private Node walk(String chars) {
    Node current = root;
    for (char ch : chars.toCharArray()) {
        current = current.children[ch - 'a'];
        if (current == null) {
            return null;
        }
    }
    return current;
}

Step 4: startsWith — the question tries were born for

"Does anything I've stored start with ca?" For a trie, this is the easiest question in the world: walk the prefix, and if you're still standing on a node, the answer is yes. You don't care about stars — surviving the walk is enough, because a node only exists if some word built it.

Trie.java (startsWith)

public boolean startsWith(String prefix) {
    return walk(prefix) != null;   // same walk, no star check
}

Pause on this for a second, because it's the trie's superpower. A HashSet can answer search("car") just as fast as we can. But ask a HashSet "does anything start with ca?" and it has no choice but to check every single word it holds. The trie just takes two steps and looks around.

Step 5: autocomplete — the party trick

Here's where it all pays off. When you type ca into a search box, what does the engine actually do?

It walks down c → a... and then simply collects everything hanging below.

you typed "ca" ──► walk root → c → a, then gather the subtree:
 
         (a)  ← you are standing here
        /   \
       t*    r*          t* → "cat"
            /  \         r* → "car"
           d*   e*       d* → "card"
                         e* → "care"
 
suggestions: cat, car, card, care

Every word that starts with ca lives below that node — that's not luck, it's the definition of the structure. The collection itself is a depth-first walk that keeps a running StringBuilder of the path so far:

Trie.java (autocomplete)

public List<String> autocomplete(String prefix) {
    List<String> results = new ArrayList<>();
    Node start = walk(prefix);
    if (start != null) {
        collect(start, new StringBuilder(prefix), results);
    }
    return results;
}
 
private void collect(Node node, StringBuilder prefix, List<String> results) {
    if (node.isEndOfWord) {
        results.add(prefix.toString());      // a star → a finished word
    }
    for (int i = 0; i < 26; i++) {
        if (node.children[i] != null) {
            prefix.append((char) ('a' + i)); // step down...
            collect(node.children[i], prefix, results);
            prefix.deleteCharAt(prefix.length() - 1); // ...and step back up
        }
    }
}

That append / deleteCharAt pair is classic backtracking: extend the path going down, undo it coming back, so the builder always spells exactly the path from the root to wherever you're standing.

Step 6: delete — the interview favorite

Deleting is where most candidates wobble, so let's get the rules straight. Deleting car must not damage card or care — they're still using that road.

The safe order of operations:

Walk to the end of the word and remove the star. For many cases, you're done — the word is gone, the road stays for others.
On the way back up, unhook a node only if it has no children and no star of its own. That means no other word needs it.

delete("card")                        then delete("car")
 
      a                                     a
     / \                                   / \
    t*  r*  ← still a word, keep it       t*  r   ← just remove the star;
        │                                     │     'r' still leads to "care",
        d*  ← no children, no other           e*    so the node must stay
            word needs it → unhook

The "may I unhook myself?" question travels back up the recursion as a boolean:

Trie.java (delete)

public void delete(String word) {
    delete(root, word, 0);
}
 
// Returns true if the parent may unhook its reference to this node.
private boolean delete(Node node, String word, int depth) {
    if (node == null) {
        return false;                        // word was never here
    }
    if (depth == word.length()) {
        if (!node.isEndOfWord) {
            return false;                    // prefix exists, word doesn't
        }
        node.isEndOfWord = false;            // remove the star
        return hasNoChildren(node);          // unhook only if nobody's below
    }
    int index = word.charAt(depth) - 'a';
    if (delete(node.children[index], word, depth + 1)) {
        node.children[index] = null;         // child said it's safe — unhook
        return !node.isEndOfWord && hasNoChildren(node);
    }
    return false;
}

How fast is this, really?

Here's the part that should feel almost unfair. Call the length of the word L and the number of stored words N:

Operation	Trie	HashSet
`insert(word)`	`O(L)`	`O(L)`
`search(word)`	`O(L)`	`O(L)`
`startsWith(prefix)`	`O(L)`	`O(N · L)` — scan everything
`autocomplete(prefix)`	`O(L + results)`	`O(N · L)` — scan everything

Read that startsWith row again. The trie's cost depends on the length of what you typed, not on how many words it stores — five words or five million, finding the ca node takes the same two steps. That independence from N is why autocomplete feels instant.

But unpack autocomplete's O(L + results), because the + results is subtler than it looks. The L is the walk down to the prefix node; the + results is enumerating everything hanging below it — and "results" here means the total characters across all completions, not the count of words. So autocomplete("a") on a big dictionary is expensive no matter how clever the trie is: a one-letter prefix hangs a near-whole-dictionary subtree, and the cost lives in the answer, not the lookup. That's why real autocompletes cap it — return the top-k, usually pre-ranked by a frequency stored on each node — instead of walking the entire subtree. The lookup was always cheap; bounding the output is the optimization that matters.

The price is memory. Every node carries 26 child slots, mostly null — and real alphabets are bigger than a–z. If your keys are sparse, Unicode, or just plain long, swap the array for a HashMap<Character, Node> (smaller, slightly slower) and look up radix trees, which compress lonely single-child chains like d → o → g into one edge.

The complete implementation

Everything we built, assembled into one class you can drop into a project or whiteboard from memory:

Trie.java

package dev.fiveyear.trie;
 
import java.util.ArrayList;
import java.util.List;
 
public final class Trie {
 
    private static final class Node {
        private final Node[] children = new Node[26];
        private boolean isEndOfWord;
    }
 
    private final Node root = new Node();
 
    /** Adds a word to the trie. O(L). */
    public void insert(String word) {
        Node current = root;
        for (char ch : word.toCharArray()) {
            int index = ch - 'a';
            if (current.children[index] == null) {
                current.children[index] = new Node();
            }
            current = current.children[index];
        }
        current.isEndOfWord = true;
    }
 
    /** True only if this exact word was inserted. O(L). */
    public boolean search(String word) {
        Node node = walk(word);
        return node != null && node.isEndOfWord;
    }
 
    /** True if any inserted word starts with this prefix. O(L). */
    public boolean startsWith(String prefix) {
        return walk(prefix) != null;
    }
 
    /** Every inserted word that starts with the prefix. O(L + results). */
    public List<String> autocomplete(String prefix) {
        List<String> results = new ArrayList<>();
        Node start = walk(prefix);
        if (start != null) {
            collect(start, new StringBuilder(prefix), results);
        }
        return results;
    }
 
    /** Removes a word, pruning nodes no other word needs. O(L). */
    public void delete(String word) {
        delete(root, word, 0);
    }
 
    // Follows the path for the given characters; null if the path breaks.
    private Node walk(String chars) {
        Node current = root;
        for (char ch : chars.toCharArray()) {
            current = current.children[ch - 'a'];
            if (current == null) {
                return null;
            }
        }
        return current;
    }
 
    private void collect(Node node, StringBuilder prefix, List<String> results) {
        if (node.isEndOfWord) {
            results.add(prefix.toString());
        }
        for (int i = 0; i < 26; i++) {
            if (node.children[i] != null) {
                prefix.append((char) ('a' + i));
                collect(node.children[i], prefix, results);
                prefix.deleteCharAt(prefix.length() - 1);
            }
        }
    }
 
    // Returns true if the parent may unhook its reference to this node.
    private boolean delete(Node node, String word, int depth) {
        if (node == null) {
            return false;
        }
        if (depth == word.length()) {
            if (!node.isEndOfWord) {
                return false;
            }
            node.isEndOfWord = false;
            return hasNoChildren(node);
        }
        int index = word.charAt(depth) - 'a';
        if (delete(node.children[index], word, depth + 1)) {
            node.children[index] = null;
            return !node.isEndOfWord && hasNoChildren(node);
        }
        return false;
    }
 
    private boolean hasNoChildren(Node node) {
        for (Node child : node.children) {
            if (child != null) {
                return false;
            }
        }
        return true;
    }
}

And here it is doing the things we drew:

Demo.java

Trie trie = new Trie();
trie.insert("cat");
trie.insert("car");
trie.insert("card");
trie.insert("care");
trie.insert("dog");
 
trie.search("car");         // true  — we inserted it
trie.search("ca");          // false — a prefix, not a word
trie.startsWith("ca");      // true  — cat, car, card, care live below
trie.autocomplete("car");   // [car, card, care]
 
trie.delete("card");
trie.autocomplete("car");   // [car, care]

Where to go from here

You now own the core idea: shared beginnings, shared paths, stars where words end. When you're ready to go deeper, three natural next stops:

Radix trees (compressed tries) — merge single-child chains so d → o → g becomes one edge labeled dog. Same idea, far less memory; this is what routers and many databases actually run.
Ternary search trees — a trie/BST hybrid that trades a little speed for a lot of space.
Aho–Corasick — a trie with failure links that can search a document for thousands of keywords in a single pass. Spam filters love it.

Next time your phone finishes your sentence, you'll know it's not magic — it's a little tree of letters, walking down a path you've been laying since the first key you pressed.

DSA