LeetCode #187 — MEDIUM

Repeated DNA Sequences

Move from brute-force thinking to an efficient approach using hash map strategy.

Solve on LeetCode
The Problem

Problem Statement

The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'.

  • For example, "ACGAATTCCG" is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]

Example 2:

Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]

Constraints:

  • 1 <= s.length <= 105
  • s[i] is either 'A', 'C', 'G', or 'T'.
Patterns Used

Roadmap

  1. Brute Force Baseline
  2. Core Insight
  3. Algorithm Walkthrough
  4. Edge Cases
  5. Full Annotated Code
  6. Interactive Study Demo
  7. Complexity Analysis
Step 01

Brute Force Baseline

Problem summary: The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'. For example, "ACGAATTCCG" is a DNA sequence. When studying DNA, it is useful to identify repeated sequences within the DNA. Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

Baseline thinking

Start with the most direct exhaustive search. That gives a correctness anchor before optimizing.

Pattern signal: Hash Map · Bit Manipulation · Sliding Window

Example 1

"AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Example 2

"AAAAAAAAAAAAA"
Step 02

Core Insight

What unlocks the optimal approach

  • No official hints in dataset. Start from constraints and look for a monotonic or reusable state.
Interview move: turn each hint into an invariant you can check after every iteration/recursion step.
Step 03

Algorithm Walkthrough

Iteration Checklist

  1. Define state (indices, window, stack, map, DP cell, or recursion frame).
  2. Apply one transition step and update the invariant.
  3. Record answer candidate when condition is met.
  4. Continue until all input is consumed.
Use the first example testcase as your mental trace to verify each transition.
Step 04

Edge Cases

Minimum Input
Single element / shortest valid input
Validate boundary behavior before entering the main loop or recursion.
Duplicates & Repeats
Repeated values / repeated states
Decide whether duplicates should be merged, skipped, or counted explicitly.
Extreme Constraints
Upper-end input sizes
Re-check complexity target against constraints to avoid time-limit issues.
Invalid / Corner Shape
Empty collections, zeros, or disconnected structures
Handle special-case structure before the core algorithm path.
Step 05

Full Annotated Code

Source-backed implementations are provided below for direct study and interview prep.

// Accepted solution for LeetCode #187: Repeated DNA Sequences
class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        Map<String, Integer> cnt = new HashMap<>();
        List<String> ans = new ArrayList<>();
        for (int i = 0; i < s.length() - 10 + 1; ++i) {
            String t = s.substring(i, i + 10);
            if (cnt.merge(t, 1, Integer::sum) == 2) {
                ans.add(t);
            }
        }
        return ans;
    }
}
Step 06

Interactive Study Demo

Use this to step through a reusable interview workflow for this problem.

Press Step or Run All to begin.
Step 07

Complexity Analysis

Time
O(n × 10)
Space
O(n × 10)

Approach Breakdown

SORT + SCAN
O(n log n) time
O(n) space

Sort the array in O(n log n), then scan for the missing or unique element by comparing adjacent pairs. Sorting requires O(n) auxiliary space (or O(1) with in-place sort but O(n log n) time remains). The sort step dominates.

BIT MANIPULATION
O(n) time
O(1) space

Bitwise operations (AND, OR, XOR, shifts) are O(1) per operation on fixed-width integers. A single pass through the input with bit operations gives O(n) time. The key insight: XOR of a number with itself is 0, which eliminates duplicates without extra space.

Shortcut: Bit operations are O(1). XOR cancels duplicates. Single pass → O(n) time, O(1) space.
Coach Notes

Common Mistakes

Review these before coding to avoid predictable interview regressions.

Mutating counts without cleanup

Wrong move: Zero-count keys stay in map and break distinct/count constraints.

Usually fails on: Window/map size checks are consistently off by one.

Fix: Delete keys when count reaches zero.

Shrinking the window only once

Wrong move: Using `if` instead of `while` leaves the window invalid for multiple iterations.

Usually fails on: Over-limit windows stay invalid and produce wrong lengths/counts.

Fix: Shrink in a `while` loop until the invariant is valid again.