Top K Frequent Elements

Find the k most frequent elements in an array

Language Selection

Choose your preferred programming language

Showing: Python

Top K Frequent Elements

Problem Statement

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Input/Output Specifications

  • Input:
    • nums: Array of integers
    • k: Integer (1 <= k <= number of unique elements)
  • Output: Array of k most frequent elements

Constraints

  • 1 <= nums.length <= 10^5
  • k is in the range [1, the number of unique elements in the array]
  • -10^4 <= nums[i] <= 10^4
  • It’s guaranteed that the answer is unique

Examples

Example 1:

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2] or [2,1]
Explanation: Elements 1 and 2 appear 3 and 2 times respectively.

Example 2:

Input: nums = [1], k = 1
Output: [1]
Explanation: Only one element, so it's the most frequent.

Solution Approaches

Approach 1: Min-Heap (Priority Queue) - Optimal for Small K

Algorithm Explanation:

  1. Count frequency of each element using hash map
  2. Use min-heap of size k to maintain k most frequent elements
  3. For each unique element:
    • If heap size < k, add (frequency, element)
    • If frequency > heap top frequency, replace heap top
  4. Extract all elements from heap

Implementation:

Python:

import heapq
from collections import Counter

def top_k_frequent_heap(nums, k):
    """
    Find k most frequent elements using min-heap
    Time: O(n log k)
    Space: O(n) for frequency map + O(k) for heap
    """
    if k == len(set(nums)):
        return list(set(nums))
    
    # Count frequencies
    count = Counter(nums)
    
    # Min-heap to maintain k most frequent elements
    heap = []
    
    for num, freq in count.items():
        if len(heap) < k:
            heapq.heappush(heap, (freq, num))
        elif freq > heap[0][0]:
            heapq.heapreplace(heap, (freq, num))
    
    # Extract elements from heap
    return [num for freq, num in heap]

# Alternative using heapq.nlargest
def top_k_frequent_builtin(nums, k):
    """
    Find k most frequent elements using built-in heap function
    Time: O(n log k)
    Space: O(n)
    """
    count = Counter(nums)
    return heapq.nlargest(k, count.keys(), key=count.get)

Java:

import java.util.*;

class Solution {
    /**
     * Find k most frequent elements using min-heap
     * Time: O(n log k)
     * Space: O(n)
     */
    public int[] topKFrequent(int[] nums, int k) {
        // Count frequencies
        Map<Integer, Integer> count = new HashMap<>();
        for (int num : nums) {
            count.put(num, count.getOrDefault(num, 0) + 1);
        }
        
        // Min-heap based on frequency
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (a, b) -> Integer.compare(a[1], b[1]) // Compare frequencies
        );
        
        for (Map.Entry<Integer, Integer> entry : count.entrySet()) {
            int num = entry.getKey();
            int freq = entry.getValue();
            
            if (heap.size() < k) {
                heap.offer(new int[]{num, freq});
            } else if (freq > heap.peek()[1]) {
                heap.poll();
                heap.offer(new int[]{num, freq});
            }
        }
        
        // Extract results
        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = heap.poll()[0];
        }
        
        return result;
    }
}

Go:

import (
    "container/heap"
)

type FreqElement struct {
    num  int
    freq int
}

type MinHeap []FreqElement

func (h MinHeap) Len() int           { return len(h) }
func (h MinHeap) Less(i, j int) bool { return h[i].freq < h[j].freq }
func (h MinHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

func (h *MinHeap) Push(x interface{}) {
    *h = append(*h, x.(FreqElement))
}

func (h *MinHeap) Pop() interface{} {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[0 : n-1]
    return x
}

// topKFrequent - Find k most frequent elements using min-heap
// Time: O(n log k), Space: O(n)
func topKFrequent(nums []int, k int) []int {
    // Count frequencies
    count := make(map[int]int)
    for _, num := range nums {
        count[num]++
    }
    
    // Min-heap for k most frequent elements
    minHeap := &MinHeap{}
    heap.Init(minHeap)
    
    for num, freq := range count {
        if minHeap.Len() < k {
            heap.Push(minHeap, FreqElement{num, freq})
        } else if freq > (*minHeap)[0].freq {
            heap.Pop(minHeap)
            heap.Push(minHeap, FreqElement{num, freq})
        }
    }
    
    // Extract results
    result := make([]int, k)
    for i := 0; i < k; i++ {
        element := heap.Pop(minHeap).(FreqElement)
        result[i] = element.num
    }
    
    return result
}

JavaScript:

class MinHeap {
    constructor() {
        this.heap = [];
    }
    
    push(element) {
        this.heap.push(element);
        this.bubbleUp();
    }
    
    pop() {
        if (this.heap.length === 0) return null;
        if (this.heap.length === 1) return this.heap.pop();
        
        const min = this.heap[0];
        this.heap[0] = this.heap.pop();
        this.bubbleDown();
        return min;
    }
    
    peek() {
        return this.heap[0];
    }
    
    size() {
        return this.heap.length;
    }
    
    bubbleUp() {
        let idx = this.heap.length - 1;
        while (idx > 0) {
            const parentIdx = Math.floor((idx - 1) / 2);
            if (this.heap[parentIdx].freq <= this.heap[idx].freq) break;
            [this.heap[parentIdx], this.heap[idx]] = [this.heap[idx], this.heap[parentIdx]];
            idx = parentIdx;
        }
    }
    
    bubbleDown() {
        let idx = 0;
        while (2 * idx + 1 < this.heap.length) {
            let minChildIdx = 2 * idx + 1;
            if (2 * idx + 2 < this.heap.length && 
                this.heap[2 * idx + 2].freq < this.heap[minChildIdx].freq) {
                minChildIdx = 2 * idx + 2;
            }
            
            if (this.heap[idx].freq <= this.heap[minChildIdx].freq) break;
            [this.heap[idx], this.heap[minChildIdx]] = [this.heap[minChildIdx], this.heap[idx]];
            idx = minChildIdx;
        }
    }
}

/**
 * Find k most frequent elements using min-heap
 * Time: O(n log k)
 * Space: O(n)
 */
function topKFrequent(nums, k) {
    // Count frequencies
    const count = new Map();
    for (const num of nums) {
        count.set(num, (count.get(num) || 0) + 1);
    }
    
    // Min-heap for k most frequent elements
    const minHeap = new MinHeap();
    
    for (const [num, freq] of count) {
        if (minHeap.size() < k) {
            minHeap.push({num, freq});
        } else if (freq > minHeap.peek().freq) {
            minHeap.pop();
            minHeap.push({num, freq});
        }
    }
    
    // Extract results
    const result = [];
    while (minHeap.size() > 0) {
        result.push(minHeap.pop().num);
    }
    
    return result;
}

C#:

using System;
using System.Collections.Generic;

public class Solution {
    /// <summary>
    /// Find k most frequent elements using min-heap
    /// Time: O(n log k)
    /// Space: O(n)
    /// </summary>
    public int[] TopKFrequent(int[] nums, int k) {
        // Count frequencies
        var count = new Dictionary<int, int>();
        foreach (int num in nums) {
            count[num] = count.GetValueOrDefault(num, 0) + 1;
        }
        
        // Min-heap for k most frequent elements
        var minHeap = new PriorityQueue<(int num, int freq), int>();
        
        foreach (var kvp in count) {
            int num = kvp.Key;
            int freq = kvp.Value;
            
            if (minHeap.Count < k) {
                minHeap.Enqueue((num, freq), freq);
            } else if (freq > minHeap.Peek().freq) {
                minHeap.Dequeue();
                minHeap.Enqueue((num, freq), freq);
            }
        }
        
        // Extract results
        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = minHeap.Dequeue().num;
        }
        
        return result;
    }
}

Approach 2: Bucket Sort (Optimal - O(n))

Algorithm Explanation:

  1. Count frequency of each element
  2. Create buckets where bucket[i] contains elements with frequency i
  3. Iterate from highest frequency to lowest, collecting k elements

Implementation:

Python:

def top_k_frequent_bucket(nums, k):
    """
    Find k most frequent elements using bucket sort
    Time: O(n)
    Space: O(n)
    """
    if k == len(set(nums)):
        return list(set(nums))
    
    # Count frequencies
    count = Counter(nums)
    
    # Create buckets: bucket[i] = list of elements with frequency i
    n = len(nums)
    bucket = [[] for _ in range(n + 1)]
    
    for num, freq in count.items():
        bucket[freq].append(num)
    
    # Collect k most frequent elements
    result = []
    for freq in range(n, 0, -1):  # Start from highest frequency
        for num in bucket[freq]:
            result.append(num)
            if len(result) == k:
                return result
    
    return result

Approach 3: Max-Heap (Alternative)

Algorithm Explanation:

  1. Count frequencies of all elements
  2. Use max-heap to store all (frequency, element) pairs
  3. Extract top k elements from heap

Implementation:

Python:

def top_k_frequent_maxheap(nums, k):
    """
    Find k most frequent elements using max-heap
    Time: O(n log n)
    Space: O(n)
    """
    count = Counter(nums)
    
    # Max-heap (use negative frequencies for min-heap)
    heap = [(-freq, num) for num, freq in count.items()]
    heapq.heapify(heap)
    
    # Extract k most frequent
    result = []
    for _ in range(k):
        freq, num = heapq.heappop(heap)
        result.append(num)
    
    return result

Key Insights

  • Frequency Mapping: Always start by counting element frequencies
  • Heap Size: Min-heap approach keeps heap size at most k
  • Bucket Sort: Optimal O(n) solution using frequency as bucket index
  • Trade-offs: Heap is better for small k, bucket sort for large k

Edge Cases

  1. k = 1: Return single most frequent element
  2. All Unique: Each element appears once
  3. All Same: All elements have same frequency
  4. Single Element: Array with one element
  5. k = unique count: Return all unique elements

Test Cases

def test_top_k_frequent():
    # Basic case
    result1 = top_k_frequent_heap([1,1,1,2,2,3], 2)
    assert set(result1) == {1, 2}
    
    # Single element
    assert top_k_frequent_heap([1], 1) == [1]
    
    # All same frequency
    result2 = top_k_frequent_heap([1,2,3], 2)
    assert len(result2) == 2 and len(set(result2)) == 2
    
    # All same element
    assert top_k_frequent_heap([1,1,1,1], 1) == [1]
    
    # Large k
    result3 = top_k_frequent_heap([1,2,3,4,5], 5)
    assert set(result3) == {1,2,3,4,5}
    
    # Test bucket sort
    result4 = top_k_frequent_bucket([1,1,1,2,2,3], 2)
    assert set(result4) == {1, 2}
    
    print("All tests passed!")

test_top_k_frequent()

Interview Tips

  1. Ask About Ties: How to handle elements with same frequency?
  2. Discuss Approaches: Start with heap, mention bucket sort optimization
  3. Complexity Analysis: Compare O(n log k) vs O(n) trade-offs
  4. Memory Considerations: Hash map space is always O(n)
  5. Follow-up Ready: Be prepared for related problems

Follow-up Questions

  1. Top K Frequent Words: Handle strings instead of numbers
  2. Streaming Data: Handle continuous stream of elements
  3. Memory Constraint: What if frequency map doesn’t fit in memory?
  4. Kth Most Frequent: Return only the kth most frequent element
  5. Least Frequent: Find k least frequent elements

Common Mistakes

  1. Wrong Heap Type: Using max-heap when min-heap is more efficient
  2. Frequency Extraction: Not properly extracting elements from heap
  3. Edge Case: Not handling k = number of unique elements
  4. Bucket Index: Off-by-one errors in bucket sort implementation
  5. Result Order: Not considering that order doesn’t matter

Concept Explanations

Min-Heap for Top K: We use a min-heap of size k because we want to maintain the k largest frequencies. The smallest frequency in this set (at heap root) serves as the threshold for new elements.

Bucket Sort Optimization: Since frequencies are bounded by array length (0 to n), we can use frequency as an index into a bucket array. This gives us O(n) time complexity.

When to Apply: Use heap approach when k is much smaller than the number of unique elements. Use bucket sort when k is large relative to unique elements or when you need guaranteed O(n) time complexity.

Frequency Analysis Pattern: This problem demonstrates the common pattern of: count frequencies → process by frequency → return top/bottom k elements.