Sliding Window Median

Find median in each sliding window using two heaps

Language Selection

Choose your preferred programming language

Showing: Python

Sliding Window Median

Problem Statement

The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle values.

You are given an integer array nums and an integer k. There is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position, return the median of the current window.

Input/Output Specifications

  • Input:
    • nums: Array of integers
    • k: Size of sliding window (1 ≤ k ≤ nums.length)
  • Output: Array of medians for each window position

Constraints

  • 1 ≤ nums.length ≤ 10^5
  • 1 ≤ k ≤ nums.length
  • -2^31 ≤ nums[i] ≤ 2^31 - 1

Examples

Example 1:

Input: nums = [1,3,-1,-3,5,3,6,7], k = 3
Output: [1.0,-1.0,-1.0,3.0,5.0,6.0]
Explanation:
Window [1,3,-1] → median = 1
Window [3,-1,-3] → median = -1
Window [-1,-3,5] → median = -1
Window [-3,5,3] → median = 3
Window [5,3,6] → median = 5
Window [3,6,7] → median = 6

Example 2:

Input: nums = [1,2,3,4,2,3,1,4,2], k = 3
Output: [2.0,3.0,3.0,3.0,2.0,3.0,2.0]

Solution Approaches

Approach 1: Two Heaps (Optimal)

Algorithm Explanation:

  1. Use two heaps: max-heap for smaller half, min-heap for larger half
  2. Maintain balance: max-heap size ≥ min-heap size, difference ≤ 1
  3. For each window:
    • Remove element going out of window
    • Add new element to appropriate heap
    • Rebalance heaps if necessary
    • Calculate median from heap tops

Implementation:

Python:

import heapq
from collections import defaultdict

def median_sliding_window(nums, k):
    """
    Find median in sliding windows using two heaps
    Time: O(n log k)
    Space: O(k)
    """
    def get_median():
        if k % 2 == 1:
            return float(max_heap[0])
        else:
            return (max_heap[0] + (-min_heap[0])) / 2.0
    
    def add_number(num):
        if not max_heap or num <= max_heap[0]:
            heapq.heappush(max_heap, num)
        else:
            heapq.heappush(min_heap, -num)
        rebalance()
    
    def remove_number(num):
        if num <= max_heap[0]:
            max_heap_count[num] += 1
        else:
            min_heap_count[num] += 1
        rebalance()
    
    def rebalance():
        # Clean up heaps by removing invalid elements
        while max_heap and max_heap_count[max_heap[0]] > 0:
            max_heap_count[heapq.heappop(max_heap)] -= 1
        while min_heap and min_heap_count[-min_heap[0]] > 0:
            min_heap_count[-heapq.heappop(min_heap)] -= 1
        
        # Balance heap sizes
        valid_max_size = len(max_heap)
        valid_min_size = len(min_heap)
        
        if valid_max_size > valid_min_size + 1:
            heapq.heappush(min_heap, -heapq.heappop(max_heap))
        elif valid_min_size > valid_max_size:
            heapq.heappush(max_heap, -heapq.heappop(min_heap))
    
    max_heap = []  # For smaller half (max-heap using negative values)
    min_heap = []  # For larger half (min-heap)
    max_heap_count = defaultdict(int)  # Count of numbers to be removed from max_heap
    min_heap_count = defaultdict(int)  # Count of numbers to be removed from min_heap
    
    result = []
    
    # Initialize first window
    for i in range(k):
        add_number(-nums[i])  # Use negative for max-heap simulation
    max_heap = [-x for x in max_heap]  # Convert back to positive
    
    result.append(get_median())
    
    # Process remaining elements
    for i in range(k, len(nums)):
        # Remove element going out of window
        remove_number(nums[i - k])
        # Add new element
        add_number(nums[i])
        # Get median
        result.append(get_median())
    
    return result

# Cleaner implementation with proper two-heap structure
def median_sliding_window_v2(nums, k):
    """
    Cleaner implementation using two heaps
    Time: O(n log k)
    Space: O(k)
    """
    import heapq
    from collections import Counter
    
    def add_num(num, max_heap, min_heap):
        if not max_heap or num <= -max_heap[0]:
            heapq.heappush(max_heap, -num)
        else:
            heapq.heappush(min_heap, num)
    
    def balance_heaps(max_heap, min_heap):
        if len(max_heap) > len(min_heap) + 1:
            heapq.heappush(min_heap, -heapq.heappop(max_heap))
        elif len(min_heap) > len(max_heap):
            heapq.heappush(max_heap, -heapq.heappop(min_heap))
    
    def get_median(max_heap, min_heap, k):
        if k % 2 == 1:
            return float(-max_heap[0])
        else:
            return (-max_heap[0] + min_heap[0]) / 2.0
    
    max_heap = []  # Max heap for smaller half
    min_heap = []  # Min heap for larger half
    result = []
    
    for i in range(len(nums)):
        # Add current number
        add_num(nums[i], max_heap, min_heap)
        balance_heaps(max_heap, min_heap)
        
        # Remove element if window size exceeds k
        if i >= k:
            outgoing = nums[i - k]
            if outgoing <= -max_heap[0]:
                max_heap.remove(-outgoing)
                heapq.heapify(max_heap)
            else:
                min_heap.remove(outgoing)
                heapq.heapify(min_heap)
            
            balance_heaps(max_heap, min_heap)
        
        # Add median if we have a full window
        if i >= k - 1:
            result.append(get_median(max_heap, min_heap, k))
    
    return result

Java:

import java.util.*;

class Solution {
    /**
     * Find median in sliding windows using two heaps
     * Time: O(n log k)
     * Space: O(k)
     */
    public double[] medianSlidingWindow(int[] nums, int k) {
        double[] result = new double[nums.length - k + 1];
        
        // Max heap for smaller half
        PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Collections.reverseOrder());
        // Min heap for larger half  
        PriorityQueue<Integer> minHeap = new PriorityQueue<>();
        
        for (int i = 0; i < nums.length; i++) {
            // Add current number
            addNumber(nums[i], maxHeap, minHeap);
            
            // Remove element if window size exceeds k
            if (i >= k) {
                removeNumber(nums[i - k], maxHeap, minHeap);
            }
            
            // Add median if we have a full window
            if (i >= k - 1) {
                result[i - k + 1] = getMedian(maxHeap, minHeap, k);
            }
        }
        
        return result;
    }
    
    private void addNumber(int num, PriorityQueue<Integer> maxHeap, PriorityQueue<Integer> minHeap) {
        if (maxHeap.isEmpty() || num <= maxHeap.peek()) {
            maxHeap.offer(num);
        } else {
            minHeap.offer(num);
        }
        balanceHeaps(maxHeap, minHeap);
    }
    
    private void removeNumber(int num, PriorityQueue<Integer> maxHeap, PriorityQueue<Integer> minHeap) {
        if (num <= maxHeap.peek()) {
            maxHeap.remove(num);
        } else {
            minHeap.remove(num);
        }
        balanceHeaps(maxHeap, minHeap);
    }
    
    private void balanceHeaps(PriorityQueue<Integer> maxHeap, PriorityQueue<Integer> minHeap) {
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.offer(maxHeap.poll());
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.offer(minHeap.poll());
        }
    }
    
    private double getMedian(PriorityQueue<Integer> maxHeap, PriorityQueue<Integer> minHeap, int k) {
        if (k % 2 == 1) {
            return maxHeap.peek();
        } else {
            return ((long) maxHeap.peek() + minHeap.peek()) / 2.0;
        }
    }
}

Go:

import (
    "container/heap"
)

type MaxHeap []int
func (h MaxHeap) Len() int           { return len(h) }
func (h MaxHeap) Less(i, j int) bool { return h[i] > h[j] }
func (h MaxHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *MaxHeap) Push(x interface{}) { *h = append(*h, x.(int)) }
func (h *MaxHeap) Pop() interface{} {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[0 : n-1]
    return x
}

type MinHeap []int
func (h MinHeap) Len() int           { return len(h) }
func (h MinHeap) Less(i, j int) bool { return h[i] < h[j] }
func (h MinHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *MinHeap) Push(x interface{}) { *h = append(*h, x.(int)) }
func (h *MinHeap) Pop() interface{} {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[0 : n-1]
    return x
}

// medianSlidingWindow - Find median in sliding windows using two heaps
// Time: O(n log k), Space: O(k)
func medianSlidingWindow(nums []int, k int) []float64 {
    maxHeap := &MaxHeap{}
    minHeap := &MinHeap{}
    heap.Init(maxHeap)
    heap.Init(minHeap)
    
    var result []float64
    
    for i := 0; i < len(nums); i++ {
        // Add current number
        addNumber(nums[i], maxHeap, minHeap)
        
        // Remove element if window size exceeds k
        if i >= k {
            removeNumber(nums[i-k], maxHeap, minHeap)
        }
        
        // Add median if we have a full window
        if i >= k-1 {
            result = append(result, getMedian(maxHeap, minHeap, k))
        }
    }
    
    return result
}

func addNumber(num int, maxHeap *MaxHeap, minHeap *MinHeap) {
    if maxHeap.Len() == 0 || num <= (*maxHeap)[0] {
        heap.Push(maxHeap, num)
    } else {
        heap.Push(minHeap, num)
    }
    balanceHeaps(maxHeap, minHeap)
}

func removeNumber(num int, maxHeap *MaxHeap, minHeap *MinHeap) {
    if maxHeap.Len() > 0 && num <= (*maxHeap)[0] {
        removeFromHeap(maxHeap, num)
    } else {
        removeFromHeap(minHeap, num)
    }
    balanceHeaps(maxHeap, minHeap)
}

func removeFromHeap(h heap.Interface, num int) {
    // Linear search and remove (O(n) operation)
    // In practice, might use more efficient data structures
    switch h := h.(type) {
    case *MaxHeap:
        for i, val := range *h {
            if val == num {
                heap.Remove(h, i)
                break
            }
        }
    case *MinHeap:
        for i, val := range *h {
            if val == num {
                heap.Remove(h, i)
                break
            }
        }
    }
}

func balanceHeaps(maxHeap *MaxHeap, minHeap *MinHeap) {
    if maxHeap.Len() > minHeap.Len()+1 {
        heap.Push(minHeap, heap.Pop(maxHeap))
    } else if minHeap.Len() > maxHeap.Len() {
        heap.Push(maxHeap, heap.Pop(minHeap))
    }
}

func getMedian(maxHeap *MaxHeap, minHeap *MinHeap, k int) float64 {
    if k%2 == 1 {
        return float64((*maxHeap)[0])
    }
    return float64((*maxHeap)[0]+(*minHeap)[0]) / 2.0
}

JavaScript:

/**
 * Find median in sliding windows using two heaps
 * Time: O(n log k)
 * Space: O(k)
 */
function medianSlidingWindow(nums, k) {
    const maxHeap = new MaxHeap(); // Smaller half
    const minHeap = new MinHeap(); // Larger half
    const result = [];
    
    for (let i = 0; i < nums.length; i++) {
        // Add current number
        addNumber(nums[i], maxHeap, minHeap);
        
        // Remove element if window size exceeds k
        if (i >= k) {
            removeNumber(nums[i - k], maxHeap, minHeap);
        }
        
        // Add median if we have a full window
        if (i >= k - 1) {
            result.push(getMedian(maxHeap, minHeap, k));
        }
    }
    
    return result;
}

function addNumber(num, maxHeap, minHeap) {
    if (maxHeap.size() === 0 || num <= maxHeap.peek()) {
        maxHeap.push(num);
    } else {
        minHeap.push(num);
    }
    balanceHeaps(maxHeap, minHeap);
}

function removeNumber(num, maxHeap, minHeap) {
    if (maxHeap.size() > 0 && num <= maxHeap.peek()) {
        maxHeap.remove(num);
    } else {
        minHeap.remove(num);
    }
    balanceHeaps(maxHeap, minHeap);
}

function balanceHeaps(maxHeap, minHeap) {
    if (maxHeap.size() > minHeap.size() + 1) {
        minHeap.push(maxHeap.pop());
    } else if (minHeap.size() > maxHeap.size()) {
        maxHeap.push(minHeap.pop());
    }
}

function getMedian(maxHeap, minHeap, k) {
    if (k % 2 === 1) {
        return maxHeap.peek();
    }
    return (maxHeap.peek() + minHeap.peek()) / 2.0;
}

Key Insights

  • Two-Heap Strategy: Maintain balance between smaller and larger halves
  • Lazy Deletion: Mark elements for removal instead of immediate deletion
  • Window Management: Add new element and remove old element for each slide
  • Balance Invariant: Max-heap size ≥ min-heap size, difference ≤ 1

Edge Cases

  1. k = 1: Each element is its own median
  2. k = n: Single window covering entire array
  3. Even k: Median is average of two middle elements
  4. Duplicate Elements: Handle multiple occurrences correctly
  5. Integer Overflow: Be careful with sum operations for median

Test Cases

def test_median_sliding_window():
    # Example 1
    result1 = median_sliding_window([1,3,-1,-3,5,3,6,7], 3)
    expected1 = [1.0,-1.0,-1.0,3.0,5.0,6.0]
    assert len(result1) == len(expected1)
    
    # k = 1 (each element is median)
    result2 = median_sliding_window([1,2,3,4], 1)
    assert result2 == [1.0,2.0,3.0,4.0]
    
    # Even window size
    result3 = median_sliding_window([1,2,3,4], 2)
    assert result3 == [1.5,2.5,3.5]
    
    # Single window
    result4 = median_sliding_window([1,2,3], 3)
    assert result4 == [2.0]
    
    # Duplicate elements
    result5 = median_sliding_window([1,1,1,1], 2)
    assert result5 == [1.0,1.0,1.0]
    
    print("All tests passed!")

test_median_sliding_window()

Interview Tips

  1. Two-Heap Pattern: Recognize this as classic two-heap median problem
  2. Balance Strategy: Explain heap size maintenance clearly
  3. Removal Challenge: Discuss lazy deletion vs immediate removal
  4. Integer Overflow: Mention potential overflow in median calculation
  5. Alternative Approaches: Could mention multiset/TreeMap alternatives

Follow-up Questions

  1. Kth Smallest in Window: Find kth smallest instead of median
  2. Running Median: Handle continuous stream without fixed window
  3. Memory Optimization: What if k is very large?
  4. Multiple Queries: Handle multiple different window sizes
  5. Approximate Median: Allow some error tolerance for efficiency

Common Mistakes

  1. Wrong Balance: Not maintaining proper heap size balance
  2. Removal Issues: Inefficient or incorrect element removal
  3. Index Errors: Off-by-one in window boundary calculations
  4. Integer Overflow: Not handling large number sums properly
  5. Edge Cases: Not handling k=1 or k=n correctly

Concept Explanations

Two-Heap Median Strategy: We maintain two heaps where the max-heap contains the smaller half of elements and min-heap contains the larger half. The median is either the top of max-heap (odd k) or average of both tops (even k).

Lazy Deletion: Instead of immediately removing elements from heaps (expensive operation), we can mark them for removal and clean them up lazily when they reach the top.

When to Apply: This pattern is crucial for any sliding window problem involving order statistics (median, kth smallest/largest). It’s also the foundation for data stream median problems.