Taro Logo

Sliding Window Median

Hard
Google logo
Google
6 views
Topics:
ArraysSliding WindowsBinary Search

Problem: Sliding Window Median

Given an integer array nums and an integer k, you are tasked with finding the median of each sliding window of size k as it moves from the beginning to the end of the array. The median is defined as the middle value in an ordered list of integers. If the list has an even number of elements, the median is the average of the two middle values.

Example:

Consider the array nums = [1, 3, -1, -3, 5, 3, 6, 7] and a window size of k = 3. The sliding window moves as follows:

  1. [1, 3, -1], -3, 5, 3, 6, 7 (Median: 1)
  2. 1, [3, -1, -3], 5, 3, 6, 7 (Median: -1)
  3. 1, 3, [-1, -3, 5], 3, 6, 7 (Median: -1)
  4. 1, 3, -1, [-3, 5, 3], 6, 7 (Median: 3)
  5. 1, 3, -1, -3, [5, 3, 6], 7 (Median: 5)
  6. 1, 3, -1, -3, 5, [3, 6, 7] (Median: 6)

Therefore, the output should be [1, -1, -1, 3, 5, 6]. Note that in a real coding problem floating-point values may be expected.

Task:

Write a function that takes an integer array nums and an integer k as input, and returns an array containing the medians of each sliding window.

Constraints:

  • 1 <= k <= nums.length <= 10^5
  • -2^31 <= nums[i] <= 2^31 - 1

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the expected range of values in the `nums` array, and what is the maximum size of the array and the window size `k`?
  2. What should be returned if the input array `nums` is empty or if the window size `k` is greater than the size of the array?
  3. How should the median be calculated when the window size `k` is even?
  4. Are the input values integers only, or could they be floating-point numbers?
  5. Can I assume that `k` will always be a positive integer greater than zero?

Brute Force Solution

Approach

We're trying to find the middle number in a sliding group of numbers. The most basic way to do this is to look at every possible group of numbers in the larger list. For each group, we find the middle number by sorting it and picking the element in the middle.

Here's how the algorithm would work step-by-step:

  1. First, consider the first set of numbers that form a group (the size of the group is predetermined).
  2. To find the middle number of this group, arrange the numbers from smallest to largest.
  3. Then, pick the number that's in the middle position. This is the median for the first group.
  4. Now, slide the group over by one position to consider the next set of numbers.
  5. Repeat the process of arranging the new group's numbers from smallest to largest.
  6. Again, find the number in the middle position of this newly sorted group. This is the median for this new group.
  7. Continue sliding the group, sorting the numbers in each group, and finding the middle number, until you've gone through all possible groups of numbers.
  8. The list of all these middle numbers (medians) is your final answer.

Code Implementation

def sliding_window_median_brute_force(numbers, window_size):
    median_list = []
    # Iterate through the array, creating each window
    for i in range(len(numbers) - window_size + 1):
        window = numbers[i:i + window_size]

        # Sort the window to easily find the median.
        sorted_window = sorted(window)

        # Determine if window size is even or odd
        if window_size % 2 == 0:
            median = (sorted_window[window_size // 2 - 1] + sorted_window[window_size // 2]) / 2
        else:
            # Index for middle of list is different for odd and even
            median = sorted_window[window_size // 2]

        median_list.append(median)

    return median_list

Big(O) Analysis

Time Complexity
O(n*k log k)The algorithm iterates through a sliding window of size k across the input array of size n. For each window, the algorithm sorts the k elements to find the median. Sorting k elements takes O(k log k) time. Since the sliding window moves n-k+1 times, the overall time complexity is (n-k+1) * O(k log k). In the worst-case scenario, when k is proportional to n, the complexity becomes O(n * n log n). However, if we treat k as a constant relative to n, it simplifies to O(n*k log k).
Space Complexity
O(k)The algorithm creates a sorted copy of each sliding window to find the median. Since the window size is 'k', a temporary array of size 'k' is required for each window to store the sorted numbers. While the window slides, the space is reused, but we still need a space of size k to perform the sort to find the median. Therefore, the auxiliary space used by the algorithm is proportional to the window size 'k'.

Optimal Solution

Approach

To efficiently find the median within a moving window, we use two balanced data structures to keep track of the smaller and larger halves of the numbers within the window. As the window slides, we maintain the balance and quickly identify the median without re-sorting every time.

Here's how the algorithm would work step-by-step:

  1. Maintain two sorted sets, one to store the smaller half of the window's numbers and the other to store the larger half.
  2. When the window slides, remove the element that's no longer in the window.
  3. Add the new element that enters the window.
  4. After each slide, rebalance the two sets. The smaller set should have the same number of elements as the larger set, or one more element if the window size is odd.
  5. The median is then either the largest element in the smaller set (if the window size is odd) or the average of the largest element in the smaller set and the smallest element in the larger set (if the window size is even).

Code Implementation

from sortedcontainers import SortedList

def sliding_window_median(numbers, window_size):
    result = []
    smaller_half = SortedList()
    larger_half = SortedList()

    for i in range(len(numbers)):
        # Remove the leftmost element if the window is full.
        if i >= window_size:
            element_to_remove = numbers[i - window_size]
            if element_to_remove in smaller_half:
                smaller_half.remove(element_to_remove)
            else:
                larger_half.remove(element_to_remove)

        # Add the current element to the appropriate half.
        if not smaller_half or numbers[i] <= smaller_half[-1]:
            smaller_half.add(numbers[i])
        else:
            larger_half.add(numbers[i])

        # Rebalance the two halves.
        while len(smaller_half) > len(larger_half) + (window_size % 2):
            larger_half.add(smaller_half.pop())

        while len(larger_half) > len(smaller_half):
            # Maintain balance between the two sets.
            smaller_half.add(larger_half.pop(0))

        if i >= window_size - 1:
            # Calculate the median for the current window.
            if window_size % 2 == 0:
                median = (smaller_half[-1] + larger_half[0]) / 2.0
            else:
                median = smaller_half[-1]
            result.append(median)

    return result

Big(O) Analysis

Time Complexity
O(n log k)We iterate through the input array of size n, effectively sliding the window n times. Within each iteration (each window), we perform operations on the sorted sets. Specifically, adding and removing elements from the sorted sets takes O(log k) time, where k is the window size, due to the logarithmic nature of balanced tree operations. Rebalancing also involves a constant number of these logarithmic operations. Therefore, each window operation costs O(log k), and since we have n windows, the overall time complexity is O(n log k).
Space Complexity
O(k)The solution maintains two sorted sets to store the smaller and larger halves of the numbers within the window. These sorted sets hold elements from the current window, which contains at most k elements, where k is the window size. Therefore, the space required for these sets is proportional to k. No other significant auxiliary space is used, so the total space complexity is O(k).

Edge Cases

CaseHow to Handle
Empty input array (nums is null or has length 0)Return an empty array immediately since there are no windows to process.
k is zero or negativeThrow an IllegalArgumentException because a window size must be positive.
k is greater than the length of numsReturn an empty array or an array with a single median equal to the median of the entire nums array if k > nums.length.
nums contains only one element and k is 1Return an array containing the single element as the median.
Large input array size with a large k, potentially leading to performance issuesUse efficient data structures like ordered sets (e.g., TreeSet or Priority Queues) to maintain the sliding window, ensuring logarithmic time complexity for insertion, deletion, and median retrieval.
nums contains duplicate values within a windowThe ordered set/priority queue based solution correctly handles duplicates as each instance is treated individually.
nums contains negative numbers or extreme boundary values (Integer.MIN_VALUE, Integer.MAX_VALUE)The solution using comparators in ordered sets or priority queues correctly handles negative numbers and extreme values without integer overflow issues during comparisons.
Input array with integer values very close to each other (e.g., with floating-point precision issues if dividing for the median)Avoid explicit division when k is even, instead take the average of the two middle elements to prevent potential floating point errors; use double to represent medians.