Merge K Sorted Arrays

Merge k sorted arrays into one sorted array using heap

Language Selection

Choose your preferred programming language

Showing: Python

Merge K Sorted Arrays

Problem Statement

Given k sorted arrays, merge them into a single sorted array. Each array can have different lengths.

Input/Output Specifications

  • Input:
    • arrays: List of k sorted arrays (each array sorted in ascending order)
  • Output: Single sorted array containing all elements from input arrays

Constraints

  • 1 <= k <= 10^4
  • 0 <= arrays[i].length <= 500
  • -10^9 <= arrays[i][j] <= 10^9
  • Each arrays[i] is sorted in ascending order

Examples

Example 1:

Input: arrays = [[1,4,5],[1,3,4],[2,6]]
Output: [1,1,2,3,4,4,5,6]
Explanation: Merging [1,4,5], [1,3,4], and [2,6] results in [1,1,2,3,4,4,5,6].

Example 2:

Input: arrays = [[1,2,3],[4,5,6],[7,8,9]]
Output: [1,2,3,4,5,6,7,8,9]
Explanation: All arrays merged in order.

Example 3:

Input: arrays = [[], [1], [1,2]]
Output: [1,1,2]
Explanation: Empty arrays are handled gracefully.

Solution Approaches

Approach 1: Min-Heap (Priority Queue) - Optimal

Algorithm Explanation:

  1. Create a min-heap to track the smallest element from each array
  2. Initialize heap with first element from each non-empty array
  3. While heap is not empty:
    • Extract minimum element and add to result
    • If the extracted element’s array has more elements, add next element to heap
  4. Return merged result

Implementation:

Python:

import heapq

def merge_k_sorted_arrays(arrays):
    """
    Merge k sorted arrays using min-heap
    Time: O(n log k) where n is total elements
    Space: O(k) for heap + O(n) for result
    """
    if not arrays:
        return []
    
    result = []
    min_heap = []
    
    # Initialize heap with first element from each non-empty array
    for i, array in enumerate(arrays):
        if array:  # Skip empty arrays
            heapq.heappush(min_heap, (array[0], i, 0))  # (value, array_index, element_index)
    
    while min_heap:
        value, array_idx, element_idx = heapq.heappop(min_heap)
        result.append(value)
        
        # If there are more elements in this array, add next element
        if element_idx + 1 < len(arrays[array_idx]):
            next_value = arrays[array_idx][element_idx + 1]
            heapq.heappush(min_heap, (next_value, array_idx, element_idx + 1))
    
    return result

# Alternative implementation with custom class for clarity
class ArrayElement:
    def __init__(self, value, array_idx, element_idx):
        self.value = value
        self.array_idx = array_idx
        self.element_idx = element_idx
    
    def __lt__(self, other):
        return self.value < other.value

def merge_k_sorted_arrays_v2(arrays):
    """
    Merge k sorted arrays using min-heap with custom class
    Time: O(n log k)
    Space: O(k) for heap + O(n) for result
    """
    if not arrays:
        return []
    
    result = []
    min_heap = []
    
    # Initialize heap
    for i, array in enumerate(arrays):
        if array:
            heapq.heappush(min_heap, ArrayElement(array[0], i, 0))
    
    while min_heap:
        element = heapq.heappop(min_heap)
        result.append(element.value)
        
        # Add next element from same array if exists
        if element.element_idx + 1 < len(arrays[element.array_idx]):
            next_val = arrays[element.array_idx][element.element_idx + 1]
            heapq.heappush(min_heap, ArrayElement(next_val, element.array_idx, element.element_idx + 1))
    
    return result

Java:

import java.util.*;

class Solution {
    /**
     * Merge k sorted arrays using min-heap
     * Time: O(n log k) where n is total elements
     * Space: O(k) for heap + O(n) for result
     */
    public int[] mergeKSortedArrays(int[][] arrays) {
        if (arrays == null || arrays.length == 0) {
            return new int[0];
        }
        
        List<Integer> result = new ArrayList<>();
        PriorityQueue<ArrayElement> minHeap = new PriorityQueue<>();
        
        // Initialize heap with first element from each non-empty array
        for (int i = 0; i < arrays.length; i++) {
            if (arrays[i].length > 0) {
                minHeap.offer(new ArrayElement(arrays[i][0], i, 0));
            }
        }
        
        while (!minHeap.isEmpty()) {
            ArrayElement element = minHeap.poll();
            result.add(element.value);
            
            // Add next element from same array if exists
            if (element.elementIndex + 1 < arrays[element.arrayIndex].length) {
                int nextValue = arrays[element.arrayIndex][element.elementIndex + 1];
                minHeap.offer(new ArrayElement(nextValue, element.arrayIndex, element.elementIndex + 1));
            }
        }
        
        return result.stream().mapToInt(i -> i).toArray();
    }
    
    class ArrayElement implements Comparable<ArrayElement> {
        int value;
        int arrayIndex;
        int elementIndex;
        
        ArrayElement(int value, int arrayIndex, int elementIndex) {
            this.value = value;
            this.arrayIndex = arrayIndex;
            this.elementIndex = elementIndex;
        }
        
        @Override
        public int compareTo(ArrayElement other) {
            return Integer.compare(this.value, other.value);
        }
    }
}

Go:

import (
    "container/heap"
)

type ArrayElement struct {
    value        int
    arrayIndex   int
    elementIndex int
}

type MinHeap []ArrayElement

func (h MinHeap) Len() int           { return len(h) }
func (h MinHeap) Less(i, j int) bool { return h[i].value < h[j].value }
func (h MinHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

func (h *MinHeap) Push(x interface{}) {
    *h = append(*h, x.(ArrayElement))
}

func (h *MinHeap) Pop() interface{} {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[0 : n-1]
    return x
}

// mergeKSortedArrays - Merge k sorted arrays using min-heap
// Time: O(n log k), Space: O(k) + O(n)
func mergeKSortedArrays(arrays [][]int) []int {
    if len(arrays) == 0 {
        return []int{}
    }
    
    var result []int
    minHeap := &MinHeap{}
    heap.Init(minHeap)
    
    // Initialize heap with first element from each non-empty array
    for i, arr := range arrays {
        if len(arr) > 0 {
            heap.Push(minHeap, ArrayElement{
                value:        arr[0],
                arrayIndex:   i,
                elementIndex: 0,
            })
        }
    }
    
    for minHeap.Len() > 0 {
        element := heap.Pop(minHeap).(ArrayElement)
        result = append(result, element.value)
        
        // Add next element from same array if exists
        if element.elementIndex+1 < len(arrays[element.arrayIndex]) {
            nextValue := arrays[element.arrayIndex][element.elementIndex+1]
            heap.Push(minHeap, ArrayElement{
                value:        nextValue,
                arrayIndex:   element.arrayIndex,
                elementIndex: element.elementIndex + 1,
            })
        }
    }
    
    return result
}

JavaScript:

class MinHeap {
    constructor() {
        this.heap = [];
    }
    
    push(element) {
        this.heap.push(element);
        this.bubbleUp();
    }
    
    pop() {
        if (this.heap.length === 0) return null;
        if (this.heap.length === 1) return this.heap.pop();
        
        const min = this.heap[0];
        this.heap[0] = this.heap.pop();
        this.bubbleDown();
        return min;
    }
    
    size() {
        return this.heap.length;
    }
    
    bubbleUp() {
        let idx = this.heap.length - 1;
        while (idx > 0) {
            const parentIdx = Math.floor((idx - 1) / 2);
            if (this.heap[parentIdx].value <= this.heap[idx].value) break;
            [this.heap[parentIdx], this.heap[idx]] = [this.heap[idx], this.heap[parentIdx]];
            idx = parentIdx;
        }
    }
    
    bubbleDown() {
        let idx = 0;
        while (2 * idx + 1 < this.heap.length) {
            let minChildIdx = 2 * idx + 1;
            if (2 * idx + 2 < this.heap.length && 
                this.heap[2 * idx + 2].value < this.heap[minChildIdx].value) {
                minChildIdx = 2 * idx + 2;
            }
            
            if (this.heap[idx].value <= this.heap[minChildIdx].value) break;
            [this.heap[idx], this.heap[minChildIdx]] = [this.heap[minChildIdx], this.heap[idx]];
            idx = minChildIdx;
        }
    }
}

/**
 * Merge k sorted arrays using min-heap
 * Time: O(n log k) where n is total elements
 * Space: O(k) for heap + O(n) for result
 */
function mergeKSortedArrays(arrays) {
    if (!arrays || arrays.length === 0) {
        return [];
    }
    
    const result = [];
    const minHeap = new MinHeap();
    
    // Initialize heap with first element from each non-empty array
    for (let i = 0; i < arrays.length; i++) {
        if (arrays[i].length > 0) {
            minHeap.push({
                value: arrays[i][0],
                arrayIndex: i,
                elementIndex: 0
            });
        }
    }
    
    while (minHeap.size() > 0) {
        const element = minHeap.pop();
        result.push(element.value);
        
        // Add next element from same array if exists
        if (element.elementIndex + 1 < arrays[element.arrayIndex].length) {
            const nextValue = arrays[element.arrayIndex][element.elementIndex + 1];
            minHeap.push({
                value: nextValue,
                arrayIndex: element.arrayIndex,
                elementIndex: element.elementIndex + 1
            });
        }
    }
    
    return result;
}

C#:

using System;
using System.Collections.Generic;

public class Solution {
    /// <summary>
    /// Merge k sorted arrays using min-heap
    /// Time: O(n log k) where n is total elements
    /// Space: O(k) for heap + O(n) for result
    /// </summary>
    public int[] MergeKSortedArrays(int[][] arrays) {
        if (arrays == null || arrays.Length == 0) {
            return new int[0];
        }
        
        var result = new List<int>();
        var minHeap = new PriorityQueue<ArrayElement, int>();
        
        // Initialize heap with first element from each non-empty array
        for (int i = 0; i < arrays.Length; i++) {
            if (arrays[i].Length > 0) {
                var element = new ArrayElement(arrays[i][0], i, 0);
                minHeap.Enqueue(element, element.Value);
            }
        }
        
        while (minHeap.Count > 0) {
            var element = minHeap.Dequeue();
            result.Add(element.Value);
            
            // Add next element from same array if exists
            if (element.ElementIndex + 1 < arrays[element.ArrayIndex].Length) {
                int nextValue = arrays[element.ArrayIndex][element.ElementIndex + 1];
                var nextElement = new ArrayElement(nextValue, element.ArrayIndex, element.ElementIndex + 1);
                minHeap.Enqueue(nextElement, nextElement.Value);
            }
        }
        
        return result.ToArray();
    }
    
    public class ArrayElement {
        public int Value { get; }
        public int ArrayIndex { get; }
        public int ElementIndex { get; }
        
        public ArrayElement(int value, int arrayIndex, int elementIndex) {
            Value = value;
            ArrayIndex = arrayIndex;
            ElementIndex = elementIndex;
        }
    }
}

Approach 2: Divide and Conquer (Alternative)

Algorithm Explanation:

  1. Recursively merge pairs of arrays
  2. Continue until only one merged array remains
  3. Use simple two-array merge for base case

Implementation:

Python:

def merge_k_sorted_arrays_dc(arrays):
    """
    Merge k sorted arrays using divide and conquer
    Time: O(n log k)
    Space: O(n log k) for recursion
    """
    if not arrays:
        return []
    if len(arrays) == 1:
        return arrays[0]
    
    def merge_two_arrays(arr1, arr2):
        result = []
        i = j = 0
        
        while i < len(arr1) and j < len(arr2):
            if arr1[i] <= arr2[j]:
                result.append(arr1[i])
                i += 1
            else:
                result.append(arr2[j])
                j += 1
        
        result.extend(arr1[i:])
        result.extend(arr2[j:])
        return result
    
    def merge_helper(arrays, start, end):
        if start == end:
            return arrays[start]
        
        mid = (start + end) // 2
        left = merge_helper(arrays, start, mid)
        right = merge_helper(arrays, mid + 1, end)
        
        return merge_two_arrays(left, right)
    
    return merge_helper(arrays, 0, len(arrays) - 1)

Key Insights

  • Min-Heap Efficiency: Only tracks one element per array at a time
  • Space Optimization: Heap size is O(k), not O(n)
  • Element Tracking: Need to track array index and element index
  • Empty Array Handling: Skip empty arrays during initialization

Edge Cases

  1. Empty Input: arrays = [] → Return []
  2. Single Array: arrays = [[1,2,3]] → Return [1,2,3]
  3. Empty Arrays: Arrays containing empty subarrays
  4. All Empty: All arrays are empty → Return []
  5. Different Lengths: Arrays of varying sizes
  6. Duplicate Values: Same values across different arrays

Test Cases

def test_merge_k_sorted_arrays():
    # Basic case
    arrays1 = [[1,4,5],[1,3,4],[2,6]]
    result1 = merge_k_sorted_arrays(arrays1)
    assert result1 == [1,1,2,3,4,4,5,6]
    
    # Sequential arrays
    arrays2 = [[1,2,3],[4,5,6],[7,8,9]]
    result2 = merge_k_sorted_arrays(arrays2)
    assert result2 == [1,2,3,4,5,6,7,8,9]
    
    # With empty array
    arrays3 = [[], [1], [1,2]]
    result3 = merge_k_sorted_arrays(arrays3)
    assert result3 == [1,1,2]
    
    # Single array
    arrays4 = [[1,2,3,4,5]]
    result4 = merge_k_sorted_arrays(arrays4)
    assert result4 == [1,2,3,4,5]
    
    # All empty arrays
    arrays5 = [[], [], []]
    result5 = merge_k_sorted_arrays(arrays5)
    assert result5 == []
    
    # Negative numbers
    arrays6 = [[-2,1,3], [-1,0,2]]
    result6 = merge_k_sorted_arrays(arrays6)
    assert result6 == [-2,-1,0,1,2,3]
    
    print("All tests passed!")

test_merge_k_sorted_arrays()

Interview Tips

  1. Start with Heap Approach: Most intuitive and optimal solution
  2. Explain Element Tracking: Show how to track array and element indices
  3. Handle Edge Cases: Empty arrays, single array, all empty
  4. Discuss Alternatives: Mention divide-and-conquer approach
  5. Time Complexity Analysis: Explain why it’s O(n log k)

Follow-up Questions

  1. Merge K Sorted Lists: Adapt for linked lists instead of arrays
  2. Find Kth Smallest: Find kth smallest element without full merge
  3. Streaming Input: Handle arrays that arrive one by one
  4. Memory Constraint: What if result array can’t fit in memory?
  5. Duplicate Handling: How to handle or count duplicates?

Common Mistakes

  1. Wrong Heap Size: Not understanding that heap size is O(k), not O(n)
  2. Index Management: Incorrectly tracking array and element indices
  3. Empty Array Handling: Not skipping empty arrays during initialization
  4. Heap Element Structure: Not properly structuring heap elements
  5. Result Building: Inefficient result array construction

Concept Explanations

Min-Heap for Merging: The key insight is that we only need to track the smallest unprocessed element from each array. A min-heap naturally maintains this property, allowing us to always extract the globally smallest remaining element.

Why O(n log k): Each of the n total elements is inserted and extracted from the heap exactly once. Each heap operation takes O(log k) time since the heap size is at most k.

When to Apply: Use this pattern for merging multiple sorted sequences, finding k smallest elements across multiple sorted arrays, or any scenario where you need to maintain a sorted order while processing multiple sorted inputs simultaneously.