Taro Logo

Median of Two Sorted Arrays

Hard
Apple logo
Apple
1 view
Topics:
ArraysBinary Search

Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays.

The overall run time complexity should be O(log (m+n)).

For example:

  1. nums1 = [1, 3], nums2 = [2] The merged array is [1, 2, 3] and the median is 2.0.

  2. nums1 = [1, 2], nums2 = [3, 4] The merged array is [1, 2, 3, 4] and the median is (2 + 3) / 2 = 2.5.

Could you please implement a function that takes two sorted arrays as input and returns the median of the combined sorted array? Ensure your solution has a time complexity of O(log(m+n)). What are the edge cases to consider, and how would you handle them?

Solution


Naive Solution

A straightforward approach is to merge the two sorted arrays into a single sorted array and then find the median. This involves creating a new array, copying elements from both input arrays while maintaining the sorted order, and finally, calculating the median based on the length of the merged array.

Algorithm:

  1. Create a new array mergedArray with a size equal to the sum of the lengths of nums1 and nums2.
  2. Iterate through nums1 and nums2, merging the elements into mergedArray in sorted order.
  3. If the length of mergedArray is odd, the median is the middle element.
  4. If the length of mergedArray is even, the median is the average of the two middle elements.

Code (Python):

def findMedianSortedArrays_naive(nums1, nums2):
    merged_array = sorted(nums1 + nums2)
    n = len(merged_array)
    if n % 2 == 0:
        median = (merged_array[n // 2 - 1] + merged_array[n // 2]) / 2
    else:
        median = merged_array[n // 2]
    return median

Time Complexity: O(m + n) due to the merging and sorting operation.

Space Complexity: O(m + n) because a new array of size m + n is created.

Optimal Solution

To achieve O(log (m+n)) runtime, we need to use a binary search approach. The key idea is to find the correct partition in both arrays such that all elements to the left of the partition are smaller than or equal to all elements to the right of the partition. This ensures that we've effectively found the median.

Algorithm:

  1. Ensure nums1 is the shorter array to optimize binary search (swap if necessary).
  2. Perform binary search on the shorter array.
  3. In each iteration, calculate the partition points in both arrays.
  4. Check if the partition is correct by comparing the elements around the partition.
  5. If the partition is correct, calculate the median based on whether the total length is even or odd.
  6. If the partition is not correct, adjust the binary search range accordingly.

Edge Cases:

  • Empty arrays: Handle cases where one or both arrays are empty.
  • Arrays of different lengths: The binary search should be performed on the shorter array.
  • Partition reaching the boundary of the array: Handle cases where the partition is at the beginning or end of an array.

Code (Python):

def findMedianSortedArrays(nums1, nums2):
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1

    m, n = len(nums1), len(nums2)
    low, high = 0, m

    while low <= high:
        partitionX = (low + high) // 2
        partitionY = (m + n + 1) // 2 - partitionX

        maxLeftX = nums1[partitionX - 1] if partitionX > 0 else float('-inf')
        minRightX = nums1[partitionX] if partitionX < m else float('inf')

        maxLeftY = nums2[partitionY - 1] if partitionY > 0 else float('-inf')
        minRightY = nums2[partitionY] if partitionY < n else float('inf')

        if maxLeftX <= minRightY and maxLeftY <= minRightX:
            if (m + n) % 2 == 0:
                return (max(maxLeftX, maxLeftY) + min(minRightX, minRightY)) / 2
            else:
                return max(maxLeftX, maxLeftY)
        elif maxLeftX > minRightY:
            high = partitionX - 1
        else:
            low = partitionX + 1

    return None  # Should not reach here

Time Complexity: O(log(min(m, n))) due to the binary search on the shorter array.

Space Complexity: O(1) as we use a constant amount of extra space.