
Sparse Matrix Multiplication

Medium
Pinterest
Topics:
Arrays

Given two sparse matrices mat1 of size m x k and mat2 of size k x n, return the result of mat1 x mat2. You may assume that multiplication is always possible.

Example 1:

Input: mat1 = [[1,0,0],[-1,0,3]], mat2 = [[7,0,0],[0,0,0],[0,0,1]]
Output: [[7,0,0],[-7,0,3]]

Example 2:

Input: mat1 = [[0]], mat2 = [[0]]
Output: [[0]]

Constraints:

  • m == mat1.length
  • n == mat2[i].length
  • k == mat1[i].length == mat2.length
  • 1 <= m, n, k <= 200
  • -100 <= mat1[i][j], mat2[i][j] <= 100

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the dimensions of `mat1` and `mat2`, and what are the maximum possible values for these dimensions? Specifically, what is the maximum row/column count?
  2. What data type will the elements of `mat1` and `mat2` be (e.g., integers, floating-point numbers)? What is the range of possible values for these elements, and can they be negative, zero, or null?
  3. Are there any constraints on the sparsity of the matrices? Should I expect a specific ratio of zero to non-zero elements?
  4. If the matrices are incompatible for multiplication (i.e., the number of columns in `mat1` does not equal the number of rows in `mat2`), what should the function return? Should it return an empty matrix, null, or throw an exception?
  5. Can I assume the input matrices are valid (e.g., rectangular, properly formatted)? Or should I include error handling for malformed input matrices?

Brute Force Solution

Approach

The brute force approach computes every element of the result matrix directly: for each entry, we take the dot product of the corresponding row of the first matrix and the corresponding column of the second matrix, performing every multiplication even when one of the operands is zero.

Here's how the algorithm would work step-by-step:

  1. For each position in the result matrix, compute its value as follows.
  2. Walk through every element in the corresponding row of the first matrix.
  3. Pair each of those elements with the element at the matching position in the corresponding column of the second matrix.
  4. Multiply the two elements of each pair together.
  5. Add up all of those products to get the final value for that position.
  6. Repeat this process for every position in the result matrix.

Code Implementation

def sparse_matrix_multiplication_brute_force(first_matrix, second_matrix):
    first_matrix_row_count = len(first_matrix)
    first_matrix_column_count = len(first_matrix[0])
    second_matrix_column_count = len(second_matrix[0])

    # Result matrix dimensions are rows of first and cols of second
    result_matrix = [[0] * second_matrix_column_count for _ in range(first_matrix_row_count)]

    for row_index in range(first_matrix_row_count):
        for column_index in range(second_matrix_column_count):
            # Need to iterate through the common dimension to calculate result[row][col]
            for k_index in range(first_matrix_column_count):
                result_matrix[row_index][column_index] += first_matrix[row_index][k_index] * second_matrix[k_index][column_index]

    return result_matrix
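As a quick sanity check against Example 1 from the problem statement, a single entry of the product can be traced by hand. Entry result[1][0] pairs row 1 of mat1 with column 0 of mat2:

```python
mat1 = [[1, 0, 0], [-1, 0, 3]]
mat2 = [[7, 0, 0], [0, 0, 0], [0, 0, 1]]

# result[1][0] = (-1)*7 + 0*0 + 3*0 = -7
entry = sum(mat1[1][k] * mat2[k][0] for k in range(len(mat2)))
print(entry)  # -7
```

This matches the -7 in the expected output [[7,0,0],[-7,0,3]].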

Big(O) Analysis

Time Complexity
O(m * n * k)
The brute force approach iterates through each cell of the resulting matrix, which has dimensions m x n, where m is the number of rows in the first matrix and n is the number of columns in the second matrix. For each cell in the result, we iterate through the corresponding row in the first matrix and the corresponding column in the second matrix to perform element-wise multiplication and summation. The length of these rows and columns is determined by k, the number of columns in the first matrix (which must equal the number of rows in the second matrix). Thus, computing each element in the result matrix requires O(k) operations, and since we do this for every element in the m x n resulting matrix, the overall time complexity is O(m * n * k).
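Plugging in the maximum dimensions from the constraints gives a concrete sense of the worst-case work:

```python
# Maximum dimensions allowed by the constraints (1 <= m, n, k <= 200)
m = n = k = 200

# Each of the m * n result cells needs k multiply-adds
operations = m * n * k
print(operations)  # 8000000
```

Eight million multiply-adds is well within limits for this problem, which is why the brute force passes despite ignoring sparsity.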
Space Complexity
O(m * n)
The brute force approach stores every element of the resulting matrix, which has m rows (from the first matrix) and n columns (from the second matrix, assuming compatible sizes). It therefore stores m * n elements, so the auxiliary space is proportional to m * n. This space is used regardless of the sparsity of the input matrices.

Optimal Solution

Approach

When multiplying matrices where many values are zero, we want to skip unnecessary calculations. The key is to only perform calculations where both corresponding elements in the rows of the first matrix and the columns of the second matrix are non-zero.

Here's how the algorithm would work step-by-step:

  1. Think about matrix multiplication: each element in the result matrix is a sum of products.
  2. Notice that if any of the numbers being multiplied together is zero, that product is zero, and doesn't change the sum.
  3. Instead of calculating every possible product, focus only on positions where both values being multiplied are not zero.
  4. For each row in the first matrix and each column in the second matrix, find where the row has a non-zero number and the column also has a non-zero number in the same position.
  5. When you find such a match, calculate the product and add it to the correct spot in the result matrix.
  6. If you don't find any matches for a particular row and column, the value in the result matrix at that location remains zero.

Code Implementation

def multiply_sparse_matrices(first_matrix, second_matrix):
    number_of_rows_first = len(first_matrix)
    number_of_columns_first = len(first_matrix[0])
    number_of_columns_second = len(second_matrix[0])

    result_matrix = [[0] * number_of_columns_second for _ in range(number_of_rows_first)]

    # Iterate through rows of the first matrix
    for row_index in range(number_of_rows_first):
        # Iterate through columns of the second matrix
        for column_index in range(number_of_columns_second):
            # Walk the shared dimension k: column index of the first matrix, row index of the second
            for inner_dimension_index in range(number_of_columns_first):
                # Only proceed if both elements are non-zero to save computation
                if first_matrix[row_index][inner_dimension_index] != 0 and \
                   second_matrix[inner_dimension_index][column_index] != 0:

                    # Add to the result only when both are non-zero
                    product = first_matrix[row_index][inner_dimension_index] * second_matrix[inner_dimension_index][column_index]
                    result_matrix[row_index][column_index] += product

    return result_matrix
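A common refinement of the same idea (the function name here is mine) hoists the zero check on the first matrix out of the innermost loop, so an entire pass over a column of the second matrix is skipped whenever the row element is zero:

```python
def multiply_sparse_row_skip(first_matrix, second_matrix):
    row_count = len(first_matrix)
    inner_count = len(second_matrix)
    column_count = len(second_matrix[0])

    result = [[0] * column_count for _ in range(row_count)]

    for i in range(row_count):
        for k in range(inner_count):
            a_value = first_matrix[i][k]
            if a_value == 0:
                # Skip the whole inner loop: every product with this element is zero
                continue
            for j in range(column_count):
                b_value = second_matrix[k][j]
                if b_value != 0:
                    result[i][j] += a_value * b_value
    return result
```

For very sparse first matrices this avoids most of the innermost iterations entirely, while the worst-case bound stays O(m * n * k).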

Big(O) Analysis

Time Complexity
O(m * n * k)
Let m be the number of rows in the first matrix (A), n be the number of columns in the second matrix (B), and k be the number of columns in A (which is also the number of rows in B). The algorithm iterates through each cell of the result matrix (m * n). For each cell in the result matrix, it iterates through the 'k' common dimension to find matching non-zero elements in the corresponding row of A and the corresponding column of B. In the worst case, for each of the m * n elements of the result matrix, we potentially iterate through all k elements of the common dimension. Therefore, the total number of operations approximates m * n * k, which simplifies to O(m * n * k).
Space Complexity
O(m * n)
The algorithm allocates a result matrix with m rows (from the first matrix) and n columns (from the second matrix), so the auxiliary space required is proportional to m * n. In the worst case this result matrix is fully populated, and the bound holds regardless of how sparse the inputs are.

Edge Cases

  • mat1 or mat2 is null: return null or throw an IllegalArgumentException to indicate invalid input.
  • mat1 or mat2 is an empty matrix (0 rows or 0 columns): return an empty matrix with appropriate dimensions, or throw an IllegalArgumentException if the dimensions are incompatible.
  • Incompatible dimensions for multiplication (number of columns in mat1 != number of rows in mat2): return an empty matrix or throw an IllegalArgumentException to signal an invalid operation.
  • Large dimensions that could lead to integer overflow during multiplication: use a wider type (e.g., long) for intermediate products and check for narrowing-conversion issues. (Python integers are arbitrary precision, so this concern applies to fixed-width integer languages.)
  • A very large number of zero elements, where a dense representation may be inefficient: use a sparse matrix representation (e.g., a dictionary or list of coordinates and values) to store and compute the product efficiently.
  • One or both matrices contain negative numbers: the multiplication logic should handle negative numbers correctly, ensuring the sign of the result is accurate.
  • Matrices with extreme values (Integer.MAX_VALUE, Integer.MIN_VALUE): handle potential overflow during multiplication, similar to the large-dimensions case.
  • One of the matrices has only one row or one column: the algorithm should handle these shapes with correct indexing and multiplication.
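The edge cases above mention a dictionary-based sparse representation. As a sketch of that idea (the function name and structure are my own, not part of the problem's reference solution), one approach pre-indexes the non-zero entries of the second matrix by row, so each non-zero element of the first matrix only touches entries that can contribute:

```python
def multiply_sparse_compressed(first_matrix, second_matrix):
    row_count = len(first_matrix)
    column_count = len(second_matrix[0])

    # Index the non-zero entries of the second matrix by row:
    # b_rows[k] holds (column, value) pairs from row k.
    b_rows = {}
    for k, b_row in enumerate(second_matrix):
        entries = [(j, v) for j, v in enumerate(b_row) if v != 0]
        if entries:
            b_rows[k] = entries

    result = [[0] * column_count for _ in range(row_count)]
    for i, a_row in enumerate(first_matrix):
        for k, a_value in enumerate(a_row):
            if a_value != 0:
                # Only pair with the non-zero entries of row k of the second matrix
                for j, b_value in b_rows.get(k, ()):
                    result[i][j] += a_value * b_value
    return result
```

The work done is proportional to the number of non-zero pairings rather than the full m * n * k, which is the real payoff when both matrices are sparse.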