Taro Logo

Maximum Length of Repeated Subarray

Medium
Citadel logo
Citadel
0 views
Topics:
ArraysDynamic Programming

Given two integer arrays nums1 and nums2, return the maximum length of a subarray that appears in both arrays.

Example 1:

Input: nums1 = [1,2,3,2,1], nums2 = [3,2,1,4,7]
Output: 3
Explanation: The repeated subarray with maximum length is [3,2,1].

Example 2:

Input: nums1 = [0,0,0,0,0], nums2 = [0,0,0,0,0]
Output: 5
Explanation: The repeated subarray with maximum length is [0,0,0,0,0].

Constraints:

  • 1 <= nums1.length, nums2.length <= 1000
  • 0 <= nums1[i], nums2[i] <= 100

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the possible ranges for the values within the input arrays A and B? Can they be negative?
  2. What is the maximum possible length of the input arrays A and B? Should I be concerned about memory usage for very large arrays?
  3. If no common subarray exists between A and B, what value should I return?
  4. If there are multiple subarrays with the same maximum length, is it sufficient to return just one of the maximum lengths, or do you require all such lengths?
  5. Are the input arrays guaranteed to be non-null and non-empty?

Brute Force Solution

Approach

The brute force strategy for finding the maximum length of a repeated subarray involves checking all possible subarrays in the two given number sequences. We will look at all possible starting positions and lengths for subarrays within the first number sequence and then check if these are present in the second number sequence. Finally, we pick the longest common subarray we found.

Here's how the algorithm would work step-by-step:

  1. Start by considering all possible subarrays from the first number sequence. A subarray is a continuous section of the sequence.
  2. For each of these subarrays, search within the second number sequence to see if an identical subarray exists.
  3. If you find a matching subarray in the second sequence, take note of its length.
  4. Keep track of the length of the longest matching subarray found so far.
  5. After checking all possible subarrays from the first sequence, the recorded length will be the maximum length of the repeated subarray.

Code Implementation

def find_maximum_length_of_repeated_subarray_brute_force(first_number_sequence, second_number_sequence):
    maximum_length = 0

    for first_sequence_start_index in range(len(first_number_sequence)):
        for first_sequence_end_index in range(first_sequence_start_index, len(first_number_sequence)):
            # Extract subarray from the first number sequence.

            first_sequence_subarray = first_number_sequence[first_sequence_start_index:first_sequence_end_index + 1]
            first_sequence_subarray_length = len(first_sequence_subarray)

            for second_sequence_start_index in range(len(second_number_sequence)):
                for second_sequence_end_index in range(second_sequence_start_index, len(second_number_sequence)):

                    second_sequence_subarray = second_number_sequence[second_sequence_start_index:second_sequence_end_index + 1]
                    
                    # Check for a matching subarray.

                    if first_sequence_subarray == second_sequence_subarray:
                        # We update the maximum length here.

                        if first_sequence_subarray_length > maximum_length:
                            maximum_length = first_sequence_subarray_length

    return maximum_length

Big(O) Analysis

Time Complexity
O(n^3)The outer loops iterate through all possible starting positions and lengths of subarrays in the first array of size n, resulting in approximately n^2 subarrays. For each of these n^2 subarrays, we search for a matching subarray in the second array, which takes O(n) time in the worst case (e.g., by comparing each subarray to the second array). Therefore, the overall time complexity is O(n^2 * n), which simplifies to O(n^3).
Space Complexity
O(1)The provided brute force algorithm iterates through all possible subarrays of the first number sequence and compares them to the second sequence, keeping track of the longest matching length found so far. It only requires a constant amount of extra space to store variables like the starting positions and lengths of the subarrays being compared, and the length of the longest matching subarray found. These variables consume a fixed amount of memory irrespective of the size of the input number sequences. Therefore, the space complexity is O(1).

Optimal Solution

Approach

The best way to find the longest matching piece between two sequences is to build up a table of matches piece by piece. Each spot in the table represents the length of the longest common piece ending at those specific positions in the sequences. We can use this table to find the overall longest common piece.

Here's how the algorithm would work step-by-step:

  1. Imagine building a grid where each row represents a position in the first sequence and each column represents a position in the second sequence.
  2. Go through the grid one cell at a time, starting from the top left.
  3. For each cell, check if the elements at the corresponding positions in the two sequences match.
  4. If they match, the value of that cell is one more than the value of the cell diagonally above and to the left (because we're extending a common piece). If they don't match, the value is zero (because the common piece is broken).
  5. As you fill in the grid, keep track of the largest value you've seen so far. This value represents the length of the longest common piece found.
  6. Once the entire grid is filled in, the largest value you tracked is the length of the longest common piece between the two sequences.

Code Implementation

def find_length(first_array, second_array):
    first_array_length = len(first_array)
    second_array_length = len(second_array)

    # Initialize the dynamic programming table
    dp_table = [[0] * (second_array_length + 1) for _ in range(first_array_length + 1)]
    maximum_length = 0

    for first_index in range(1, first_array_length + 1):
        for second_index in range(1, second_array_length + 1):
            # If the elements match, extend the length of the common subarray
            if first_array[first_index - 1] == second_array[second_index - 1]:

                dp_table[first_index][second_index] = dp_table[first_index - 1][second_index - 1] + 1

                #Keep track of longest length found so far
                maximum_length = max(maximum_length, dp_table[first_index][second_index])

            #If elements do not match, reset the common subarray length to 0
            else:
                dp_table[first_index][second_index] = 0

    # Return the maximum length of repeated subarray
    return maximum_length

Big(O) Analysis

Time Complexity
O(m*n)The algorithm involves constructing a grid (conceptually a matrix) where the dimensions are determined by the lengths of the two input arrays. Let 'm' be the length of the first array and 'n' be the length of the second array. Filling each cell of the grid requires a constant amount of work: comparing two elements and, potentially, referencing a neighboring cell. Therefore, we perform a constant time operation for each of the m * n cells in the grid. Thus, the overall time complexity is O(m*n).
Space Complexity
O(m*n)The algorithm constructs a grid (represented as a 2D array) where 'm' is the length of the first sequence and 'n' is the length of the second sequence. Each cell in the grid stores an integer representing the length of the longest common subarray ending at that corresponding position. Therefore, the auxiliary space required is proportional to the product of the lengths of the two input sequences, resulting in O(m*n) space complexity.

Edge Cases

CaseHow to Handle
One or both input arrays are null or empty.Return 0 if either array is null or has a length of 0 since there's no possible subarray.
Arrays with only one element each.Compare the single elements and return 1 if they are equal, 0 otherwise.
Arrays of significantly different sizes.The dynamic programming approach should handle size differences efficiently, iterating through each cell of the DP table.
Arrays with all identical values (e.g., [1, 1, 1, 1] and [1, 1, 1]).The DP approach correctly identifies the length of the repeating subarray, which will be the length of the shorter array in this case.
Arrays with no common elements.The DP table will remain filled with 0s, and the function should return 0.
Integer overflow when calculating the length of the subarray in extremely large arrays.Ensure the data type used to store the maximum length is large enough (e.g., long) to prevent overflow.
Arrays with negative numbers.The dynamic programming approach works correctly with negative numbers as array values.
Very large arrays (close to memory limits).The O(m*n) space complexity of the DP solution might lead to memory issues; consider optimizing space complexity if necessary using rolling array technique.