Taro Logo

Max Dot Product of Two Subsequences

Hard
Asked by:
Profile picture
Profile picture
Profile picture
10 views
Topics:
ArraysDynamic Programming

Given two arrays nums1 and nums2.

Return the maximum dot product between non-empty subsequences of nums1 and nums2 with the same length.

A subsequence of a array is a new array which is formed from the original array by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (ie, [2,3,5] is a subsequence of [1,2,3,4,5] while [1,5,3] is not).

Example 1:

Input: nums1 = [2,1,-2,5], nums2 = [3,0,-6]
Output: 18
Explanation: Take subsequence [2,-2] from nums1 and subsequence [3,-6] from nums2.
Their dot product is (2*3 + (-2)*(-6)) = 18.

Example 2:

Input: nums1 = [3,-2], nums2 = [2,-6,7]
Output: 21
Explanation: Take subsequence [3] from nums1 and subsequence [7] from nums2.
Their dot product is (3*7) = 21.

Example 3:

Input: nums1 = [-1,-1], nums2 = [1,1]
Output: -1
Explanation: Take subsequence [-1] from nums1 and subsequence [1] from nums2.
Their dot product is -1.

Constraints:

  • 1 <= nums1.length, nums2.length <= 500
  • -1000 <= nums1[i], nums2[i] <= 1000

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the input arrays contain negative numbers?
  2. What are the size constraints for the input arrays `nums1` and `nums2`?
  3. Is an empty subsequence a valid subsequence? If so, what should the dot product be in that case?
  4. If there are multiple subsequences with the same maximum dot product, can I return any one of them?
  5. Can the input arrays be empty or null? If so, what should I return?

Brute Force Solution

Approach

The brute force method to find the maximum dot product between subsequences involves checking every possible combination. We need to explore all possible subsequences from both input sequences and calculate their dot product. The best (maximum) dot product is then selected from all these options.

Here's how the algorithm would work step-by-step:

  1. First, consider all possible subsequences from the first input sequence. A subsequence can be made by including or excluding each number in the sequence.
  2. Next, do the exact same thing and consider all possible subsequences from the second input sequence.
  3. Now, take one subsequence from the first input sequence and one subsequence from the second input sequence.
  4. If the subsequences have different lengths, discard this combination because their dot product cannot be calculated.
  5. If the subsequences have the same length, calculate their dot product by multiplying corresponding numbers and adding them up.
  6. Keep track of the maximum dot product found so far.
  7. Repeat steps 3-5 for all possible pairs of subsequences you've created.
  8. Finally, the maximum dot product you've kept track of is the answer.

Code Implementation

def max_dot_product_brute_force(first_array, second_array):
    max_dot_product = float('-inf')

    def get_all_subsequences(input_array):
        subsequences = [[]]
        for number in input_array:
            new_subsequences = [subsequence + [number] for subsequence in subsequences]
            subsequences.extend(new_subsequences)
        return subsequences

    first_array_subsequences = get_all_subsequences(first_array)
    second_array_subsequences = get_all_subsequences(second_array)

    for first_subsequence in first_array_subsequences:
        for second_subsequence in second_array_subsequences:
            # We only calculate dot products for subsequences of equal length.
            if len(first_subsequence) == len(second_subsequence):

                if not first_subsequence or not second_subsequence:
                  dot_product = 0
                else:
                  dot_product = sum(first_subsequence[i] * second_subsequence[i] for i in range(len(first_subsequence)))

                # Update the maximum dot product if the current dot product is greater
                max_dot_product = max(max_dot_product, dot_product)

    # Need to handle edge case if both arrays contain only negative numbers.
    if max_dot_product == 0 and (all(number < 0 for number in first_array) or all(number < 0 for number in second_array)):
        max_first = max(first_array) if first_array else float('-inf')
        max_second = max(second_array) if second_array else float('-inf')
        if max_first != float('-inf') and max_second != float('-inf'):
           max_dot_product = max(max_dot_product, max_first * max_second)
        elif max_first != float('-inf'):
            max_dot_product = max(max_dot_product, max_first)
        elif max_second != float('-inf'):
            max_dot_product = max(max_dot_product, max_second)
        else:
            max_dot_product = float('-inf')

    return max_dot_product

Big(O) Analysis

Time Complexity
O(4^n)Generating all subsequences of an array of size n takes O(2^n) time since each element can either be present or absent in a subsequence. Since we have two arrays, generating all subsequence pairs requires O(2^n * 2^n) = O(4^n) operations. Calculating the dot product for each valid subsequence pair takes O(n) time in the worst case, but this factor is dominated by the subsequence generation. Therefore, the overall time complexity is O(4^n).
Space Complexity
O(2^(N+M))The algorithm generates all possible subsequences from the first sequence (size N) and the second sequence (size M). Generating all subsequences involves creating temporary lists to store them, and in the worst case, each element is either present or absent, leading to 2^N subsequences for the first input and 2^M for the second. The space required to store these subsequences is proportional to their number, creating a complexity of O(2^N + 2^M). We also need to store intermediate maximum dot products which is O(1) and negligible. Therefore, the total auxiliary space used is O(2^N + 2^M) which can be simplified to O(2^(N+M)) since the subsequences are generated and stored.

Optimal Solution

Approach

We want to find the largest possible score from multiplying matching parts of two sequences. The clever approach is to build up the solution step by step, remembering the best score we can achieve at each point. This avoids recomputing the same things over and over again.

Here's how the algorithm would work step-by-step:

  1. Imagine building a table where each spot represents the maximum dot product we can achieve by considering only the first few elements of both sequences.
  2. Start filling the table from the top left. For each spot, we have three choices.
  3. First, we can choose to ignore the current element in the first sequence. The best score is then the best score from the cell directly above.
  4. Second, we can choose to ignore the current element in the second sequence. The best score is then the best score from the cell to the left.
  5. Third, we can choose to multiply the current elements of both sequences and add it to the best score we had before considering these elements (the cell diagonally above and to the left).
  6. Pick the best of these three choices and store it in the current spot in the table.
  7. If both current elements are negative, we need to consider the possibility that the optimal subsequence consists of only these two negative numbers. Thus, the score could also be these numbers multiplied. Consider this as yet another choice and take the best of all choices.
  8. Continue filling the table row by row until you reach the bottom right. The value in the bottom right cell is the answer: the maximum dot product of any two subsequences.

Code Implementation

def max_dot_product(first_sequence, second_sequence):
    first_sequence_length = len(first_sequence)
    second_sequence_length = len(second_sequence)

    # Initialize the DP table with zeros.
    dot_product_table = [([0] * (second_sequence_length + 1)) for _ in range(first_sequence_length + 1)]

    for i in range(1, first_sequence_length + 1):
        for j in range(1, second_sequence_length + 1):
            #Consider excluding nums1[i-1] or nums2[j-1].
            exclude_first_sequence = dot_product_table[i - 1][j]

            exclude_second_sequence = dot_product_table[i][j - 1]

            #Consider including both nums1[i-1] and nums2[j-1].
            include_both_sequences = dot_product_table[i - 1][j - 1] + (first_sequence[i - 1] * second_sequence[j - 1])

            #If both elements are negative.
            only_these_elements = first_sequence[i - 1] * second_sequence[j - 1]

            dot_product_table[i][j] = max(exclude_first_sequence, exclude_second_sequence, include_both_sequences, only_these_elements)

    return dot_product_table[first_sequence_length][second_sequence_length]

Big(O) Analysis

Time Complexity
O(m*n)The core of the algorithm involves filling a table of size m x n, where m is the length of the first sequence (nums1) and n is the length of the second sequence (nums2). To calculate each cell in the table, we perform a constant number of operations: comparing three values and potentially multiplying two numbers. Since we need to compute the value for each of the m * n cells, the total number of operations is directly proportional to m * n. Thus, the time complexity is O(m*n).
Space Complexity
O(M*N)The algorithm uses a table (essentially a 2D array) to store intermediate results, where M is the length of the first sequence and N is the length of the second sequence. Each cell in the table stores the maximum dot product achieved so far, requiring constant space per cell. Therefore, the auxiliary space used is proportional to the product of the lengths of the two input sequences, M and N, resulting in O(M*N) space complexity.

Edge Cases

Both input arrays are empty
How to Handle:
Return 0 as the dot product of two empty subsequences is defined as 0.
One array is empty, the other is not
How to Handle:
Return negative infinity because the dot product must be computed using sub-sequences from each array.
Arrays with a single element each, and both elements are negative
How to Handle:
Return the product of the two negative single elements (which will be positive or zero).
Arrays with a single element each, one positive and the other negative
How to Handle:
Return the product of the two elements which would be negative.
Arrays with all positive numbers
How to Handle:
The optimal subsequence will likely be the entire array, so the solution should effectively handle the full array length.
Arrays with all negative numbers
How to Handle:
The optimal subsequence will be the two largest negative numbers in each array, therefore consider single elements.
Integer overflow during multiplication of large numbers
How to Handle:
Use long data type to store intermediate results to avoid overflow when calculating the dot product.
Arrays with a mix of positive, negative, and zero values
How to Handle:
The algorithm must correctly account for zero values and negative values when determining the maximum dot product.