Pattern

Top K Elements

Efficiently find the k largest, smallest, or most frequent elements using heaps.

Roadmap

  1. What Is This Pattern?
  2. When to Use It
  3. Variant 1 — Kth Largest Element
  4. Variant 2 — Top K Frequent Elements
  5. Template Code
  6. Common Problems
  7. Interactive Demo
  8. Complexity Analysis
Step 01

What Is This Pattern?

The Top K Elements pattern uses a heap (priority queue) to maintain the top k elements as you scan through data. Instead of sorting the entire array, you keep a small heap of size k, which gives you dramatic performance improvements when k is much smaller than n.

The Core Idea

For k largest: use a min-heap of size k. The smallest element in the heap acts as the gatekeeper. For k smallest: use a max-heap of size k. The largest element in the heap acts as the gatekeeper.

Input
[3,1,5,12,2,11]
Min-Heap (k=3)
size ≤ 3
Result
[5, 11, 12]

As each element arrives, add it to the heap. When the heap exceeds size k, remove the minimum. After processing all elements, the heap holds exactly the k largest values.

Visual Walkthrough

Finding the 3 largest elements from [3, 1, 5, 12, 2, 11] using a min-heap of size 3:

Input
3
1
5
12
2
11
After processing 3 → heap: [3]
After processing 1 → heap: [1, 3]
After processing 5 → heap: [1, 3, 5] (size = k, full!)
After processing 12 → heap: [3, 5, 12] (evicted 1)
After processing 2 → heap: [3, 5, 12] (2 < root 3, skip)
After processing 11 → heap: [5, 11, 12] (evicted 3)
Result
5
11
12
💡

Why a min-heap for k largest? The min-heap's root is the smallest element among the current top k. Any new element larger than the root belongs in the top k, so we swap it in. Any element smaller cannot be in the top k, so we skip it. The root acts as a gatekeeper.

Step 02

When to Use It

Look for these signals in the problem statement. Any of these phrases strongly suggest the Top K Elements pattern:

"find k largest" "find k smallest" "k most frequent" "kth largest element" "sort by frequency" "top k" "k closest points"

Choosing the Right Heap

K Largest → Min-Heap
Root = smallest of top k. Evict root when a larger element arrives.
K Smallest → Max-Heap
Root = largest of bottom k. Evict root when a smaller element arrives.

Why not just sort? Sorting gives O(n log n). With a heap of size k, you get O(n log k). When k is much smaller than n (e.g., k=10 and n=1,000,000), the difference is enormous. You also do not need the full sorted order — just the top k.

Step 03

Variant 1 — Kth Largest Element

The simplest application. Maintain a min-heap of size k. After processing all elements, the root of the heap is the kth largest element.

Algorithm

  1. Create an empty min-heap.
  2. For each element in the array, add it to the heap.
  3. If the heap size exceeds k, remove the minimum (root).
  4. After all elements are processed, the root is the kth largest.

Example: k=3 from [3, 2, 1, 5, 6, 4]

We want the 3rd largest element. Sorted: [1, 2, 3, 4, 5, 6]. Answer = 4.

Process 3 → heap: [3]
Process 2 → heap: [2, 3]
Process 1 → heap: [1, 2, 3] — size = k
Process 5 → heap: [2, 3, 5] — evicted 1
Process 6 → heap: [3, 5, 6] — evicted 2
Process 4 → heap: [4, 5, 6] — evicted 3
heap.peek() = 4 ← the 3rd largest
// LeetCode #215 — Kth Largest Element in an Array
public int findKthLargest(int[] nums, int k) {
    PriorityQueue<Integer> minHeap = new PriorityQueue<>();

    for (int num : nums) {
        minHeap.offer(num);
        if (minHeap.size() > k) {
            minHeap.poll();   // evict the smallest
        }
    }

    return minHeap.peek();  // root = kth largest
}
🎯

Invariant: After processing each element, the heap contains the k largest elements seen so far. The root is always the kth largest. This invariant is what makes the pattern so elegant.

Step 04

Variant 2 — Top K Frequent Elements

First count frequencies with a HashMap. Then find the k most frequent elements. Two approaches: heap-based (general) and bucket sort (O(n) optimal).

Approach A: Heap-Based

Build a frequency map, then use a min-heap of size k on (element, frequency) pairs. The heap orders by frequency, keeping the k most frequent.

// LeetCode #347 — Top K Frequent Elements (Heap)
public int[] topKFrequent(int[] nums, int k) {
    // Step 1: Count frequencies
    Map<Integer, Integer> freq = new HashMap<>();
    for (int n : nums)
        freq.merge(n, 1, Integer::sum);

    // Step 2: Min-heap of size k, ordered by frequency
    PriorityQueue<int[]> heap = new PriorityQueue<>(
        (a, b) -> a[1] - b[1]
    );

    for (var entry : freq.entrySet()) {
        heap.offer(new int[]{entry.getKey(), entry.getValue()});
        if (heap.size() > k) heap.poll();
    }

    // Step 3: Extract results
    int[] result = new int[k];
    for (int i = 0; i < k; i++)
        result[i] = heap.poll()[0];
    return result;
}

Approach B: Bucket Sort (O(n))

Use the frequency as the bucket index. Since the maximum frequency is n, create an array of n+1 buckets. Elements with the same frequency go into the same bucket. Iterate buckets from highest to lowest.

// LeetCode #347 — Top K Frequent Elements (Bucket Sort)
public int[] topKFrequent(int[] nums, int k) {
    // Step 1: Count frequencies
    Map<Integer, Integer> freq = new HashMap<>();
    for (int n : nums)
        freq.merge(n, 1, Integer::sum);

    // Step 2: Create buckets indexed by frequency
    List<Integer>[] buckets = new List[nums.length + 1];
    for (var entry : freq.entrySet()) {
        int f = entry.getValue();
        if (buckets[f] == null)
            buckets[f] = new ArrayList<>();
        buckets[f].add(entry.getKey());
    }

    // Step 3: Collect from highest frequency down
    int[] result = new int[k];
    int idx = 0;
    for (int i = buckets.length - 1; i >= 0 && idx < k; i--) {
        if (buckets[i] != null)
            for (int val : buckets[i])
                if (idx < k) result[idx++] = val;
    }
    return result;
}
💡

Heap vs Bucket Sort: The heap approach works for any comparable metric and gives O(n log k). Bucket sort gives O(n) but only works when the "ranking value" (frequency) has a bounded range. In interviews, knowing both approaches demonstrates depth.

Step 05

Template Code

The general template for Top K Elements problems. Memorize this pattern and adapt it to specific problems.

General Template — Kth Largest / Top K Largest

// Template: Top K Elements using Min-Heap
// Finds the k largest elements (or kth largest)

PriorityQueue<Integer> minHeap = new PriorityQueue<>();

for (int num : nums) {
    minHeap.offer(num);
    if (minHeap.size() > k) {
        minHeap.poll();   // evict smallest
    }
}

// minHeap.peek() = kth largest element
// Heap contents = top k largest elements
return minHeap.peek();

Template — Top K with Custom Comparator

// For custom objects: use comparator on the ranking metric
// Example: K closest points to origin

PriorityQueue<int[]> maxHeap = new PriorityQueue<>(
    (a, b) -> dist(b) - dist(a)  // max-heap by distance
);

for (int[] point : points) {
    maxHeap.offer(point);
    if (maxHeap.size() > k) {
        maxHeap.poll();  // evict farthest
    }
}

// Heap contains k closest points
Step 06

Common Problems

These are the classic Top K Elements problems you should practice:

📝

Study order: Start with #215 (pure heap template), then #347 (frequency + heap), then #692 (adds tie-breaking logic), and finally #373 (multi-source heap expansion).

Step 07

Interactive Demo

Enter an array and a value for k. Watch the min-heap maintain size k as elements are processed one by one.

⚙ Top K Heap Visualizer



Step 08

Complexity Analysis

Time
O(n log k)
Space
O(k)

We process each of the n elements once. Each heap insertion/removal costs O(log k) since the heap never exceeds size k. Total: O(n log k). The heap stores at most k elements, so space is O(k).

Comparison with Alternatives

SORT
O(n log n)
HEAP (k)
O(n log k)
QUICKSELECT
O(n) avg

When k is small: log k is nearly constant. For k=10 and n=1,000,000, the heap approach does ~3.3 million comparisons versus ~20 million for full sort. The heap approach also works on streams of data where you cannot access the full dataset at once.