Median of Two Sorted Arrays Interview Question for Apple

# Finding the Median of Two Sorted Arrays

## Problem Description

Given two sorted arrays `nums1` and `nums2` of size `m` and `n` respectively, return the median of the two sorted arrays. The overall run time complexity should be `O(log (m+n))`. 

Example 1:
Input: `nums1 = [1,3], nums2 = [2]`
Output: `2.00000`
Explanation: merged array = `[1,2,3]` and median is 2.

Example 2:
Input: `nums1 = [1,2], nums2 = [3,4]`
Output: `2.50000`
Explanation: merged array = `[1,2,3,4]` and median is `(2 + 3) / 2 = 2.5`.

## Brute Force Solution

One straightforward approach is to merge the two sorted arrays into a single sorted array and then find the median. This approach has a time complexity of O(m+n) because it requires merging the two arrays, which involves iterating through both arrays once.  The space complexity is also O(m+n) because a new array to hold the merged result is created.

```python
def findMedianSortedArrays_bruteforce(nums1, nums2):
    merged = sorted(nums1 + nums2)
    n = len(merged)
    if n % 2 == 0:
        # Even length: average of the middle two elements
        return (merged[n // 2 - 1] + merged[n // 2]) / 2.0
    else:
        # Odd length: middle element
        return float(merged[n // 2])

Optimal Solution: Binary Search

To achieve the required O(log (m+n)) time complexity, a binary search approach is necessary. The core idea is to find the correct partition in both arrays such that all elements to the left of the partition are smaller than all elements to the right of the partition. This guarantees that the median can be easily computed from the partition points.

Here's the algorithm:

Ensure nums1 is the smaller array: If nums1 is longer than nums2, swap them to ensure nums1 is the shorter array. This simplifies the binary search logic.
Binary Search: Perform binary search on the smaller array (nums1). Let low = 0 and high = len(nums1). In each iteration, calculate partitionX and partitionY. partitionX is the point to partition nums1 and partitionY is computed such that the total number of elements on the left side of both partitions equals (m + n + 1) // 2.
Check Partition Condition: Compare the maximum element on the left side of nums1's partition (maxLeftX) with the minimum element on the right side of nums2's partition (minRightY) and vice versa.
- If maxLeftX <= minRightY and maxLeftY <= minRightX, the correct partition is found.
- If maxLeftX > minRightY, move the high pointer to the left to search for a smaller partitionX.
- If maxLeftY > minRightX, move the low pointer to the right to search for a larger partitionX.
Compute Median: Once the correct partition is found, compute the median based on whether the total number of elements (m + n) is even or odd.
- If even, the median is (max(maxLeftX, maxLeftY) + min(minRightX, minRightY)) / 2.0.
- If odd, the median is max(maxLeftX, maxLeftY).

def findMedianSortedArrays(nums1, nums2):
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1
        
m = len(nums1)
n = len(nums2)
low = 0
high = m

while low <= high:
    partitionX = (low + high) // 2
    partitionY = (m + n + 1) // 2 - partitionX

    maxLeftX = nums1[partitionX - 1] if partitionX > 0 else float('-inf')
    minRightX = nums1[partitionX] if partitionX < m else float('inf')

    maxLeftY = nums2[partitionY - 1] if partitionY > 0 else float('-inf')
    minRightY = nums2[partitionY] if partitionY < n else float('inf')

    if maxLeftX <= minRightY and maxLeftY <= minRightX:
        if (m + n) % 2 == 0:
            return (max(maxLeftX, maxLeftY) + min(minRightX, minRightY)) / 2.0
        else:
            return float(max(maxLeftX, maxLeftY))
    elif maxLeftX > minRightY:
        high = partitionX - 1
    else:
        low = partitionX + 1

Big O Run-time Analysis

The dominant operation in the optimal solution is the binary search performed on the smaller array. In the worst-case scenario, binary search has a time complexity of O(log m), where m is the length of the smaller array. Since m <= n and m + n is the total number of elements, the overall time complexity is O(log (min(m, n))), which is O(log(m+n)) because the smaller of m and n can grow at most linearly with respect to the sum.

Therefore, the time complexity of the findMedianSortedArrays function is O(log (m+n)).

Big O Space Usage Analysis

The optimal solution uses a constant amount of extra space, regardless of the input array sizes. Only a few variables are used to store partition indices and maximum/minimum values. Therefore, the space complexity is O(1). The brute force approach requires O(m+n) space for the merged array, but the binary search approach does not.

Edge Cases and Handling

Empty Arrays: The code handles cases where one or both arrays are empty. If one array is empty, the median can be directly computed from the other array.
Unequal Array Lengths: The algorithm works correctly when the arrays have different lengths. The binary search is always performed on the smaller array.
All Elements in One Array Smaller than the Other: The algorithm correctly handles cases where all elements in one array are smaller than all elements in the other array.
Duplicate Elements: The presence of duplicate elements does not affect the correctness of the algorithm.
Large and Small Numbers: Using float('-inf') and float('inf') ensures the algorithm behaves correctly when handling extremely small or large numbers.