Given two sorted arrays nums1
and nums2
of size m
and n
respectively, return the median of the two sorted arrays.
The overall run time complexity should be O(log (m+n))
.
Example 1:
Input: nums1 = [1,3], nums2 = [2]
Output: 2.00000
Explanation: merged array = [1,2,3] and median is 2.
Example 2:
Input: nums1 = [1,2], nums2 = [3,4]
Output: 2.50000
Explanation: merged array = [1,2,3,4] and median is (2 + 3) / 2 = 2.5.
Constraints:
nums1.length == m
nums2.length == n
0 <= m <= 1000
0 <= n <= 1000
1 <= m + n <= 2000
-10^6 <= nums1[i], nums2[i] <= 10^6
Can you implement an algorithm that satisfies the specified time complexity?
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force approach for finding the median combines two sorted lists and then finds the middle element. It's like merging two decks of cards and then picking the middle card. This approach doesn't try to be efficient; it just does the most obvious thing.
Here's how the algorithm would work step-by-step:
def find_median_sorted_arrays_brute_force(first_array, second_array):
# Combine the two input arrays into a single array
combined_array = first_array + second_array
# Sort the combined array
combined_array.sort()
combined_array_length = len(combined_array)
# Determine if the array has an even or odd number of elements
if combined_array_length % 2 == 0:
# Calculate indices of the two middle elements for even length array
middle_index_one = combined_array_length // 2 - 1
middle_index_two = combined_array_length // 2
# Calculate the median as average of the two middle elements
median = (combined_array[middle_index_one] + combined_array[middle_index_two]) / 2
else:
# Calculate the index of the middle element for odd length array
middle_index = combined_array_length // 2
# Median is simply the middle element
median = combined_array[middle_index]
return median
The fastest way to find the middle value of two sorted lists combined is to efficiently narrow down the search area. We strategically eliminate halves of the lists that cannot contain the median, similar to a guessing game where you adjust your range with each guess.
Here's how the algorithm would work step-by-step:
def find_median_sorted_arrays(first_array, second_array):
if len(first_array) > len(second_array):
first_array, second_array = second_array, first_array
array_x_length = len(first_array)
array_y_length = len(second_array)
low_index = 0
high_index = array_x_length
while low_index <= high_index:
partition_x = (low_index + high_index) // 2
# Calculate partition of the second array
partition_y = (array_x_length + array_y_length + 1) // 2 - partition_x
max_left_x = first_array[partition_x - 1] if partition_x > 0 else float('-inf')
min_right_x = first_array[partition_x] if partition_x < array_x_length else float('inf')
max_left_y = second_array[partition_y - 1] if partition_y > 0 else float('-inf')
min_right_y = second_array[partition_y] if partition_y < array_y_length else float('inf')
# Check if partitions are correct
if max_left_x <= min_right_y and max_left_y <= min_right_x:
# We found the correct partition
if (array_x_length + array_y_length) % 2 == 0:
return (max(max_left_x, max_left_y) + min(min_right_x, min_right_y)) / 2
else:
return max(max_left_x, max_left_y)
# Adjust partitions based on comparisons
elif max_left_x > min_right_y:
# Move towards left in first_array
high_index = partition_x - 1
else:
# Move towards right in first_array
low_index = partition_x + 1
return None # Should never reach here, input arrays are sorted
Case | How to Handle |
---|---|
Both arrays are empty | Return an appropriate error value or throw an exception, as the median is undefined for empty arrays. |
One array is empty, the other is not | The median is simply the median of the non-empty array, which can be calculated directly. |
Arrays of size 1 and size > 1 | The algorithm should correctly merge a very small array with a significantly larger one and identify the median element(s). |
Arrays contain duplicate numbers | The binary search partitioning logic should correctly handle duplicates without entering infinite loops or miscalculating the median. |
Arrays contain negative numbers, zero, and positive numbers | The comparison logic should work correctly for all number types including negative numbers and zero. |
Arrays with large numbers that may cause integer overflow during calculations (e.g., sums) | Ensure that the calculations used for the median (especially when averaging two values) are performed using a data type that can accommodate large numbers, such as long or double, to avoid overflow. |
Arrays with all identical values | Binary search must handle identical values within the array without getting stuck in an infinite loop. |
Arrays are highly skewed (e.g., one array has very small values and the other has very large values) | Binary search should still converge to the correct partition even with disparate value ranges. |