LeetCode #30 — Hard

Substring with Concatenation
of All Words

A complete visual walkthrough of the sliding window approach with word-sized steps.

Patterns Used

Roadmap

  1. Brute Force — Check Every Position
  2. The Core Insight — Sliding Window with Word Steps
  3. Walkthrough: "barfoothefoobarman"
  4. Edge Cases
  5. Full Annotated Code
  6. Interactive Demo
  7. Complexity Analysis
Step 01

Brute Force — Check Every Position

The naive approach: for each starting index in s, extract a substring of length wordCount * wordLen, split it into word-sized chunks, and check if those chunks form a permutation of the target words.

Example Setup

s = "barfoothefoobarman"
words = "foo" "bar"
wordLen = 3, wordCount = 2, totalLen = 6

We need to find starting indices where a 6-character substring is a concatenation of "foo" and "bar" (in any order).

b
a
r
f
o
o
t
h
e
f
o
o
b
a
r
m
a
n
Index: 0   1   2   3   4   5   6   7   8   9  10 11 12 13 14 15 16 17
Matches at indices [0, 9]

The brute force checks each of the ~n positions, splits into ~wordCount chunks, and compares each chunk. This gives O(n * wordCount * wordLen) — very slow for large inputs.

💡

The key observation: All words have equal length. This means we can slide a window that moves in word-sized steps, reusing work from previous positions instead of rechecking everything.

Step 02

The Core Insight — Sliding Window with Word Steps

Since all words have the same length w, we can treat the string as a sequence of word-sized slots. We run w different passes (starting at offsets 0, 1, ..., w-1) and in each pass, slide a window by one word at a time.

The Algorithm

1. Build a target frequency map
Count how many times each word appears in the words array.
2. For each offset i = 0 to wordLen-1
Start a sliding window at position i. Extract words at positions i, i+w, i+2w, ...
3. Expand right, shrink left
Add each new word to a window map. If it exceeds the target count, shrink from the left. If an unknown word appears, reset the window entirely. When count == wordCount, record the left index.

Why multiple offsets? Words could start at any character position mod wordLen. For wordLen=3, the word boundaries starting at index 0 are different from those starting at index 1 or 2. We run a pass for each possible alignment to cover all cases.

Step 03

Walkthrough: "barfoothefoobarman"

Let's trace the algorithm with s = "barfoothefoobarman" and words = ["foo", "bar"]. With wordLen=3, we run 3 passes (offsets 0, 1, 2). The interesting one is offset 0.

Offset 0: positions 0, 3, 6, 9, 12, 15

1
j=0: word = "bar"
Target word! window = {bar:1}, count=1. Need 2 words total.
2
j=3: word = "foo"
Target word! window = {bar:1, foo:1}, count=2. count == wordCount! Record left=0
3
j=6: word = "the"
Not a target word! Reset window. window = {}, count=0, left=9.
4
j=9: word = "foo"
Target word! window = {foo:1}, count=1.
5
j=12: word = "bar"
Target word! window = {foo:1, bar:1}, count=2. count == wordCount! Record left=9
6
j=15: word = "man"
Not a target word! Reset. Done with this offset.
Result: [0, 9]
Step 04

Edge Cases

Tricky Scenarios

Duplicate words
words = ["foo","foo"]. The frequency map has {foo:2}. The window must accumulate foo twice before it counts as a match. If foo appears a third time, shrink from the left.
Excess word count
When a valid word exceeds its target count, we shrink the window from the left until that word's count drops back to the target. This is the classic "shrink" step of sliding window.
Invalid word breaks the chain
When a non-target word appears, we can't include it in any valid concatenation. Clear the entire window and jump past it.
String shorter than total word length
If s.length() < wordCount * wordLen, no valid substring can exist. Return an empty list.
Step 05

Full Annotated Code

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        int wordLen = words[0].length();
        int wordCount = words.length;
        int totalLen = wordLen * wordCount;

        // Build target frequency map
        Map<String, Integer> target = new HashMap<>();
        for (String w : words)
            target.merge(w, 1, Integer::sum);

        // Try each starting offset (0 to wordLen-1)
        for (int i = 0; i < wordLen; i++) {
            Map<String, Integer> window = new HashMap<>();
            int left = i, count = 0;

            // Slide through word-sized chunks
            for (int j = i; j + wordLen <= s.length(); j += wordLen) {
                String word = s.substring(j, j + wordLen);

                if (target.containsKey(word)) {
                    window.merge(word, 1, Integer::sum);
                    count++;

                    // Shrink if this word exceeds target count
                    while (window.get(word) > target.get(word)) {
                        String leftWord = s.substring(left, left + wordLen);
                        window.merge(leftWord, -1, Integer::sum);
                        count--;
                        left += wordLen;
                    }

                    // All words matched!
                    if (count == wordCount) result.add(left);
                } else {
                    // Unknown word — reset everything
                    window.clear();
                    count = 0;
                    left = j + wordLen;
                }
            }
        }
        return result;
    }
}
Step 06

Interactive Demo

Enter a string and words to find all concatenation substring indices.

Sliding Window Visualizer



Press "Step →" or "Run All" to begin.
Step 07

Complexity Analysis

Time
O(n * w)
Space
O(m * w)

We run w passes (where w = word length). Each pass scans the entire string once, extracting substrings of length w at each step. Total: O(n * w) where n is the string length. The space is dominated by the two HashMaps holding at most m entries (m = number of words), each with keys of length w.

📝

Compared to brute force: The brute force is O(n * m * w) because it re-checks all words at every position. The sliding window eliminates the factor of m by reusing previous computations — when we slide right by one word, we only add one word and potentially remove one from the left.