
Median of a Row Wise Sorted Matrix

Medium
Topics: Arrays, Binary Search

Given a row x col matrix named matrix, where each row is sorted in non-decreasing order, return the median of all elements in the matrix.

Example 1:

Input: matrix = [[1,1,2],[1,2,3],[1,3,3]]
Output: 2
Explanation: There are 3 rows with 3 elements each, so there are 9 elements and the median is the 5th smallest. In sorted order the elements are 1, 1, 1, 1, 2, 2, 3, 3, 3, so the median is 2.

Example 2:

Input: matrix = [[1,1,2],[1,2,2],[1,3,3]]
Output: 2

Constraints:

  • row == matrix.length
  • col == matrix[i].length
  • 1 <= row, col <= 500
  • 1 <= matrix[i][j] <= 10^6
  • Each row in matrix is sorted in non-decreasing order.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the dimensions (m and n) of the matrix, and what is the range of integer values within the matrix?
  2. Can the matrix be empty (m=0 or n=0)? If so, what value should I return?
  3. If the total number of elements (m * n) is even, should I return the smaller of the two middle elements, the larger, or their average?
  4. Are all rows guaranteed to be non-empty, or could some rows have zero columns?
  5. Is there a guarantee that the matrix will always be rectangular (i.e., all rows have the same number of columns)?
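
Questions 2, 4 and 5 can be turned into an explicit input check. The sketch below is one possible guard, not part of the original solution: the helper name validate_matrix and the choice to raise ValueError are assumptions, and the right behavior for bad input should be agreed with the interviewer.

```python
def validate_matrix(matrix):
    """Hypothetical helper: reject inputs the problem statement rules out."""
    # Covers None, [] and [[]]: no rows at all, or a first row with no columns.
    if not matrix or not matrix[0]:
        raise ValueError("matrix must have at least one row and one column")

    width = len(matrix[0])
    for row in matrix:
        # Every row must have the same number of columns as the first one.
        if len(row) != width:
            raise ValueError("matrix must be rectangular")
        # Each row must be sorted in non-decreasing order.
        if any(a > b for a, b in zip(row, row[1:])):
            raise ValueError("each row must be sorted in non-decreasing order")

    return len(matrix), width


print(validate_matrix([[1, 1, 2], [1, 2, 3], [1, 3, 3]]))  # (3, 3)
```

The sortedness check costs O(m*n), so it is worth running only when the input is not trusted.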

Brute Force Solution

Approach

To find the middle value in a set of sorted rows, the brute force approach treats the whole matrix like one big unsorted list. It simply gathers all the numbers together and then finds the middle one after sorting them.

Here's how the algorithm would work step-by-step:

  1. First, imagine taking all the numbers from the matrix and putting them into one single group.
  2. Next, we need to organize this group of numbers from smallest to largest.
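  3. Once we have them in order, we can find the middle number. If there's an odd number of numbers, it's simply the one in the exact middle. If there's an even number of numbers, this version takes the average of the two numbers closest to the middle. Confirm that convention with the interviewer, because the optimal solution below returns the lower of the two middle numbers instead.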

Code Implementation

def find_median_brute_force(matrix): 
    all_numbers = []

    # Gather all numbers from the matrix into a single list
    for row in matrix:
        for number in row:
            all_numbers.append(number)

    all_numbers.sort()

    matrix_size = len(all_numbers)

    # Determine if the matrix has an odd or even number of elements
    if matrix_size % 2 != 0:
        # Odd number of elements, return the middle element
        median_index = matrix_size // 2
        return all_numbers[median_index]
    else:
        # Even number of elements, calculate the average of the two middle elements

        middle_index_one = matrix_size // 2 - 1
        middle_index_two = matrix_size // 2

        # Need to calculate average of two middle numbers
        median = (all_numbers[middle_index_one] + all_numbers[middle_index_two]) / 2
        return median
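
The even-count branch is where the conventions from clarifying question 3 diverge. As a small illustration, Python's standard statistics module exposes all three choices; the 2 x 2 matrix here is a made-up input, not one of the problem's examples.

```python
from itertools import chain
from statistics import median, median_high, median_low

matrix = [[1, 2], [3, 4]]  # made-up input with an even number of elements
values = list(chain.from_iterable(matrix))  # flatten the rows into one list

print(median_low(values))   # 2, the lower of the two middle elements
print(median_high(values))  # 3, the upper of the two middle elements
print(median(values))       # 2.5, the mean of the two middle elements
```

The brute force function above matches median, while a count-based search for the ((m*n + 1) // 2)-th smallest element matches median_low.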

Big(O) Analysis

Time Complexity
O(m*n log(m*n))
The algorithm first copies all m*n elements of the matrix into a single array. This takes O(m*n) time. Then, the algorithm sorts this array using a comparison-based sorting algorithm, which typically takes O(N log N) time, where N is the number of elements to be sorted. In this case, N = m*n. Therefore, the sorting step takes O(m*n log(m*n)) time. The final step of finding the median takes constant time, O(1). Therefore, the overall time complexity is dominated by the sorting step, resulting in O(m*n log(m*n)).
Space Complexity
O(m*n)
The brute force approach first gathers all elements from the matrix into a single list. If the matrix has m rows and n columns, that list holds m*n elements, so it needs space proportional to m*n. The sort itself may run in place (e.g., heap sort) or may need up to O(m*n) auxiliary space (e.g., merge sort, or Python's built-in sort in the worst case). Combining the initial storage and the possible sorting overhead, the space complexity is O(m*n).
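
The copy and sort ignore the fact that every row is already sorted. As a hedged aside, and a different technique from the brute force above, the rows can be merged lazily with heapq.merge so that only about half the elements are visited and no full copy is built. The function name find_median_by_merging is made up for this sketch, and it returns the lower middle element when the count is even.

```python
import heapq
from itertools import islice


def find_median_by_merging(matrix):
    """Lower median via a lazy k-way merge of the already-sorted rows."""
    total = len(matrix) * len(matrix[0])
    target_index = (total - 1) // 2  # 0-based index of the lower middle element

    # heapq.merge yields the values of all rows in ascending order, lazily,
    # keeping only one pending value per row in its internal heap.
    merged = heapq.merge(*matrix)

    # Skip target_index values and take the next one.
    return next(islice(merged, target_index, None))


print(find_median_by_merging([[1, 1, 2], [1, 2, 3], [1, 3, 3]]))  # 2
```

This brings the extra space down to O(m) for the merge heap, at a time cost of O(m*n log m).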

Optimal Solution

Approach

The best way to find the middle value in a matrix where each row is already sorted is to use a process of elimination. We repeatedly narrow down the possible range of values until we pinpoint the median. This is much faster than looking at every single number in the matrix.

Here's how the algorithm would work step-by-step:

  1. Find the smallest and largest numbers in the entire matrix. These will be your starting boundaries for the possible range of the median.
  2. Calculate the middle value within this range (the average of the smallest and largest numbers).
  3. For each row in the matrix, count how many numbers are smaller than or equal to this middle value.
  4. Add up these counts from all rows. This tells you how many numbers in the entire matrix are less than or equal to our middle value guess.
  5. Let the target be (total number of elements + 1) / 2, rounded down. This is the position of the median in sorted order. If the count is less than the target, our middle value is too small. Adjust the smallest boundary to be just above the middle value.
  6. If the count is greater than or equal to the target, our middle value is either the median or too big. Adjust the largest boundary to be the middle value itself, not below it, so the median stays inside the range.
  7. Repeat steps 2-6 until the smallest and largest boundaries are equal. That shared value is the median.
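
Steps 3 and 4 are the heart of the method. A minimal sketch of one counting pass on Example 1, using the standard bisect module rather than a hand-written binary search; the guess of 2 is simply the midpoint of the initial range 1 to 3.

```python
from bisect import bisect_right

matrix = [[1, 1, 2], [1, 2, 3], [1, 3, 3]]  # Example 1
guess = 2

# bisect_right returns the insertion point after any entries equal to guess,
# which is exactly the number of entries <= guess in a sorted row.
per_row = [bisect_right(row, guess) for row in matrix]
total = sum(per_row)

print(per_row)  # [3, 2, 1]
print(total)    # 6
```

Six elements are at or below 2 and the median is the 5th smallest, so 2 is large enough and the upper boundary moves down to 2.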

Code Implementation

def find_median_row_wise_sorted_matrix(matrix):
    number_of_rows = len(matrix)
    number_of_columns = len(matrix[0])
    minimum_value = matrix[0][0]
    maximum_value = matrix[0][number_of_columns - 1]
    for row in range(number_of_rows):
        minimum_value = min(minimum_value, matrix[row][0])
        maximum_value = max(maximum_value, matrix[row][number_of_columns - 1])

    desired_count = (number_of_rows * number_of_columns + 1) // 2

    while minimum_value < maximum_value:
        middle_value = minimum_value + (maximum_value - minimum_value) // 2
        count = 0

        # Counting elements <= middle_value for each row
        for row in range(number_of_rows):
            left_pointer = 0
            right_pointer = number_of_columns

            while left_pointer < right_pointer:
                mid = left_pointer + (right_pointer - left_pointer) // 2
                if matrix[row][mid] <= middle_value:
                    left_pointer = mid + 1
                else:
                    right_pointer = mid
            count += left_pointer

        # Adjust search space based on the count
        if count < desired_count:
            # Middle value is too small
            minimum_value = middle_value + 1
        else:
            # Middle value is too large or the median
            maximum_value = middle_value

    return minimum_value
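
To see how the boundaries move, the sketch below records each iteration of the same search on Example 1. It is an illustrative variant, not the implementation above: the name trace_median_search is made up, and the per-row count uses bisect_right from the standard library.

```python
from bisect import bisect_right


def trace_median_search(matrix):
    """Return the median plus one (low, high, guess, count) tuple per iteration."""
    low = min(row[0] for row in matrix)    # smallest value: some row's first entry
    high = max(row[-1] for row in matrix)  # largest value: some row's last entry
    needed = (len(matrix) * len(matrix[0]) + 1) // 2
    steps = []

    while low < high:
        guess = low + (high - low) // 2
        count = sum(bisect_right(row, guess) for row in matrix)
        steps.append((low, high, guess, count))
        if count < needed:
            low = guess + 1  # too few values <= guess, so the median is larger
        else:
            high = guess     # guess may be the median, so keep it in range

    return low, steps


median, steps = trace_median_search([[1, 1, 2], [1, 2, 3], [1, 3, 3]])
print(median)  # 2
print(steps)   # [(1, 3, 2, 6), (1, 2, 1, 4)]
```

The first guess of 2 covers 6 elements, which is enough, so the upper boundary drops to 2. The second guess of 1 covers only 4, so the lower boundary rises to 2 and the loop ends.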

Big(O) Analysis

Time Complexity
O(N * log(M) * log(MAX-MIN))
Here N is the number of rows and M is the number of columns. The outer loop is a binary search between the minimum and maximum values in the matrix, which takes about log(MAX-MIN) iterations, where MAX and MIN are the largest and smallest values in the matrix. Inside each iteration we visit each of the N rows, and for each row we run a binary search over its M sorted elements, O(log M), to count the elements less than or equal to the current middle value. Finding the initial minimum and maximum adds only O(N). Therefore, the overall time complexity is O(N * log(M) * log(MAX-MIN)). If the per-row count were done with a linear scan instead of a binary search, the bound would rise to O(N * M * log(MAX-MIN)).
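
For a sense of scale, the following rough estimate plugs the largest inputs allowed by the constraints into that bound and into the sort-based bound. These are back-of-the-envelope step counts under the stated assumptions, not measurements.

```python
import math

rows, cols = 500, 500    # largest dimensions allowed by the constraints
value_range = 10**6 - 1  # MAX - MIN when values span 1 to 10^6

# Each outer iteration roughly halves the range of candidate values.
outer_steps = math.ceil(math.log2(value_range + 1))
# Each per-row binary search narrows down cols + 1 possible counts.
row_steps = math.ceil(math.log2(cols + 1))
binary_search_probes = outer_steps * rows * row_steps

# Order-of-magnitude estimate of comparisons for sorting every element.
total = rows * cols
sort_comparisons = round(total * math.log2(total))

print(outer_steps, row_steps, binary_search_probes)  # 20 9 90000
print(sort_comparisons)  # roughly 4.5 million
```

At the largest input size, the binary search does on the order of 50 times fewer steps than sorting all 250,000 elements.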
Space Complexity
O(1)
The algorithm primarily uses variables to store the minimum and maximum values of the matrix elements, as well as a counter within each row. It does not create auxiliary data structures like arrays or hash maps that scale with the input size. The space occupied by these variables remains constant irrespective of the matrix dimensions. Therefore, the auxiliary space complexity is O(1).

Edge Cases

| Case | How to Handle |
| --- | --- |
| Null or empty matrix | Return appropriate value based on problem definition (e.g., null, -1, throw exception), after checking if the matrix is null or any of its dimensions are zero. |
| Matrix with one element | Return that single element as the median. |
| Matrix with a large number of rows and columns causing integer overflow | Use long or appropriate data type to prevent integer overflow when calculating the total number of elements (m * n). |
| Matrix with all elements being the same value | The binary search will still converge to the correct (and only) median value. |
| Matrix with negative numbers | The algorithm works correctly with negative numbers as it relies on the sorted nature of the rows and binary search. |
| Matrix with very large positive or negative numbers approaching integer limits | The algorithm works correctly as long as the comparison operators are well-defined for these large numbers within the language. |
| Matrix where the median falls between two numbers (even total elements); only applicable if the prompt asks for ceiling or floor if non-integer | The problem asks to return the middle element when all elements are sorted, thus we search directly for this middle element by count so this is not a concern. |
| Rows with significantly varying lengths (though problem states rows are sorted and asks for median of all, assume complete matrix) | Since the problem definition assumes a complete matrix, this edge case is not applicable but otherwise requires appropriate padding or special handling during count calculation if the assumption were not met. |
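
Several of these cases are easy to check directly. The sketch below restates the value-range binary search in compact form (the name median_by_value_search and the test inputs are made up for illustration) and asserts the expected result for each case. It assumes a non-empty rectangular matrix and returns the lower middle element when the count is even.

```python
from bisect import bisect_right


def median_by_value_search(matrix):
    """Compact restatement of the value-range binary search (lower median)."""
    low = min(row[0] for row in matrix)
    high = max(row[-1] for row in matrix)
    needed = (len(matrix) * len(matrix[0]) + 1) // 2

    while low < high:
        guess = low + (high - low) // 2
        # Count the elements <= guess across all rows.
        if sum(bisect_right(row, guess) for row in matrix) < needed:
            low = guess + 1
        else:
            high = guess
    return low


cases = {
    "single element": ([[7]], 7),
    "all equal": ([[4, 4], [4, 4], [4, 4]], 4),
    "negative values": ([[-5, -3, -1], [-4, -2, 0], [-6, 1, 2]], -2),
    "even count, lower middle": ([[1, 2], [3, 4]], 2),
}
for name, (matrix, expected) in cases.items():
    assert median_by_value_search(matrix) == expected, name
print("all edge cases passed")
```

Python integers do not overflow, so the overflow rows of the table matter only when porting this to a fixed-width language.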