Skip to content
C Codeloom
DSA

LeetCode Word Break: DP with a Word Dictionary

Solve LeetCode 139 Word Break with bottom-up dynamic programming. Includes brute force, edge cases, complexity analysis, and an interview script.

·6 min read · By Yash Kesharwani
Intermediate 11 min read

What you'll learn

  • Frame Word Break as a 1D boolean DP problem
  • Use a set for O(1) dictionary lookups
  • Avoid exponential blowup with memoization
  • Reason about substring complexity carefully
  • Explain the recurrence in interview-ready language

Prerequisites

  • DP basics: see /blog/dynamic-programming-introduction
  • Recursion mechanics: see /blog/recursion-fundamentals

LeetCode 139, Word Break, is a deceptively important problem. Its solution shape, “can the prefix of length i be built?”, reappears in palindrome partitioning, sentence segmentation, and any segmentation-by-dictionary task. It also tests whether you remember to convert a list into a set for fast membership checks.

The Problem

Given a string s and a dictionary of strings word_dict, return True if s can be segmented into a space-separated sequence of one or more dictionary words. You may reuse the same word multiple times.

Example inputs and outputs:

Input:  s = "leetcode", word_dict = ["leet", "code"]
Output: True
Explanation: "leetcode" = "leet" + "code".

Input:  s = "applepenapple", word_dict = ["apple", "pen"]
Output: True
Explanation: "apple" + "pen" + "apple".

Input:  s = "catsandog", word_dict = ["cats", "dog", "sand", "and", "cat"]
Output: False

The natural DP state is: dp[i] is True if the first i characters of s can be segmented. Then dp[i] is True whenever there exists some j < i such that dp[j] is True and s[j:i] is in the dictionary.

Brute Force

The direct recursive attempt tries every split point:

def word_break(s: str, word_dict: list[str]) -> bool:
    words = set(word_dict)
    def helper(start: int) -> bool:
        if start == len(s):
            return True
        for end in range(start + 1, len(s) + 1):
            if s[start:end] in words and helper(end):
                return True
        return False
    return helper(0)

Without memoization the recursion can branch at every index, producing roughly O(2^n) calls in the worst case. A pathological input like s = "aaaaaaaab" with word_dict = ["a", "aa", "aaa", "aaaa"] will time out.

Optimal Solution

Bottom-up DP with a boolean array. Convert the dictionary to a set first so substring lookups are O(length-of-substring) for hashing rather than O(dictionary-size) for scanning.

def word_break(s: str, word_dict: list[str]) -> bool:
    words = set(word_dict)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True
    for i in range(1, n + 1):
        for j in range(i):
            if dp[j] and s[j:i] in words:
                dp[i] = True
                break
    return dp[n]

dp[0] is True because the empty prefix is trivially segmentable. We then ask, for each ending index i, whether some split point j produces a valid prefix dp[j] and a dictionary word s[j:i].

A small optimization: track the maximum word length in the dictionary and only look back that far.

def word_break(s: str, word_dict: list[str]) -> bool:
    words = set(word_dict)
    max_len = max((len(w) for w in words), default=0)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            if dp[j] and s[j:i] in words:
                dp[i] = True
                break
    return dp[n]

This prunes the inner loop when dictionary words are much shorter than s.

Walk Through an Example

Take s = "leetcode", word_dict = ["leet", "code"]. Length 8.

dp[0] = True
i=1 "l"        no split works           dp[1] = False
i=2 "le"       no split works           dp[2] = False
i=3 "lee"      no split works           dp[3] = False
i=4 "leet"     dp[0]=True and "leet" in words  dp[4] = True
i=5 "leetc"    dp[4]=True but "c" not in words dp[5] = False
i=6 "leetco"   no valid split           dp[6] = False
i=7 "leetcod"  no valid split           dp[7] = False
i=8 "leetcode" dp[4]=True and "code" in words  dp[8] = True

Final answer: True. The reconstructed segmentation is "leet" + "code".

Edge Cases

  • Empty s: return True. dp[0] handles this.
  • Empty dictionary: return True only if s is empty.
  • s shorter than the shortest dictionary word: return False unless s itself is a word.
  • Duplicate words in word_dict: the set conversion deduplicates safely.
  • Repeated characters such as "aaaa...a": the max-length pruning prevents quadratic blowup from very long substrings.
  • Unicode strings: Python handles slicing and hashing correctly without special care.

Complexity Analysis

  • Time: O(n^2 * L) in the worst case, where n = len(s) and L is the average substring length being hashed. Strictly speaking each s[j:i] substring creation is O(i - j), and the set lookup is O(i - j) for hashing.
  • Space: O(n) for the DP array, plus O(total characters in dictionary) for the set.

The brute force without memoization is exponential. With memoization, the recursive version matches the bottom-up complexity. If you want to revisit growth-rate intuition, see /blog/big-o-notation-explained.

How to Explain It in an Interview

Use this script:

  1. State the DP definition: dp[i] means “the first i characters are segmentable.”
  2. Derive the transition: dp[i] = True if some j has dp[j] = True and s[j:i] is in the dictionary.
  3. Mention converting the list to a set up front; interviewers like the small efficiency call-out.
  4. Code the bottom-up version, then mention the max-length pruning as an optimization.
  5. Trace a small example to demonstrate correctness.
  6. State complexity honestly, including substring creation cost.

Bonus signal: contrast this with Word Break II (LeetCode 140), which asks for all segmentations and therefore requires backtracking rather than a boolean DP.

  • Word Break II (LeetCode 140): enumerate all valid segmentations.
  • Palindrome Partitioning (LeetCode 131): segmentation where every part must be a palindrome.
  • Concatenated Words (LeetCode 472): apply Word Break to every word in a list.
  • Decode Ways (LeetCode 91): a numeric segmentation analogue.
  • More segmentation DP variants live in /blog/dynamic-programming-classic-problems.

Wrap up

Word Break is the cleanest example of “prefix segmentation” DP. The state, the transition, and the dictionary-set optimization are short enough to memorize, yet they cover a wide family of harder problems. Practice this until the boolean DP feels mechanical, then move on to Word Break II and Palindrome Partitioning where the segmentation produces structured output rather than a yes-or-no answer. Pair this article with /blog/dynamic-programming-introduction for the theoretical grounding.