Taro Logo

Count the Number of Good Partitions

Hard
Google logo
Google
4 views
Topics:
ArraysDynamic Programming

You are given a 0-indexed array nums consisting of positive integers.

A partition of an array into one or more contiguous subarrays is called good if no two subarrays contain the same number.

Return the total number of good partitions of nums.

Since the answer may be large, return it modulo 10^9 + 7.

For example:

  1. nums = [1,2,3,4] Output: 8 Explanation: The 8 possible good partitions are: ([1], [2], [3], [4]), ([1], [2], [3,4]), ([1], [2,3], [4]), ([1], [2,3,4]), ([1,2], [3], [4]), ([1,2], [3,4]), ([1,2,3], [4]), and ([1,2,3,4]).

  2. nums = [1,1,1,1] Output: 1 Explanation: The only possible good partition is: ([1,1,1,1]).

  3. nums = [1,2,1,3] Output: 2 Explanation: The 2 possible good partitions are: ([1,2,1], [3]) and ([1,2,1,3]).

Constraints:

  • 1 <= nums.length <= 10^5
  • 1 <= nums[i] <= 10^9

Explain how you would efficiently determine the total number of "good" partitions modulo 10^9 + 7.

Solution


Brute Force Approach

A brute force approach would involve generating all possible partitions of the array and then checking if each partition is "good". A partition is good if no two subarrays within it have the same sum. This approach is highly inefficient due to the exponential number of possible partitions.

def is_good_partition_brute_force(nums):
    def get_partitions(arr):
        if not arr:
            return [[]]
        
        first = arr[0]
        rest = arr[1:]
        
        partitions = []
        for i in range(1, len(arr) + 1):
            first_subarray = arr[:i]
            remaining_arr = arr[i:]
            for sub_partition in get_partitions(remaining_arr):
                partitions.append([first_subarray] + sub_partition)
        return partitions

    def check_good(partition):
        subarray_lengths = [len(sub_array) for sub_array in partition]
        return len(subarray_lengths) == len(set(subarray_lengths))

    all_partitions = get_partitions(nums)
    count = 0
    for partition in all_partitions:
        if check_good(partition):
            count += 1
    return count

Time Complexity: O(2n * n), where n is the length of the input array, because there are exponentially many partitions and checking if a partition is good takes O(n) time.

Space Complexity: O(n), primarily due to the recursion depth and storage of subarrays.

Optimal Approach

The problem asks for the number of "good" partitions, where a partition is considered "good" if no two subarrays have the same length. We can dynamically determine the number of good partitions. Let dp[i] represent the number of good partitions ending at index i. Initialize dp[0] = 1 and iterate through the array. For each index i, we check previous indices j to see if the subarray from j+1 to i creates a valid partition. Maintain a last_occurrence map to store the last index at which each number appeared. Using this information, we can efficiently compute the number of good partitions.

def solve():
    def goodPartitions(nums):
        n = len(nums)
        last = {}
        for i, num in enumerate(nums):
            last[num] = i

        count = 0
        max_reach = 0
        for i in range(n):
            max_reach = max(max_reach, last[nums[i]])
            if max_reach == i:
                count += 1
        
        if count == 0:
            return 1

        ans = pow(2, count - 1, 10**9 + 7)
        return ans
    
    return goodPartitions

nums1 = [1,2,3,4]
nums2 = [1,1,1,1]
nums3 = [1,2,1,3]

solution = solve()

print(f"Good partitions for {nums1}: {solution(nums1)}")
print(f"Good partitions for {nums2}: {solution(nums2)}")
print(f"Good partitions for {nums3}: {solution(nums3)}")

Explanation:

The key insight to solving this problem efficiently is to understand that we don't actually need to generate or check all possible partitions directly. We can determine the number of good partitions by finding the number of disjoint segments the array can be divided into, where the last occurrence of any element within a segment is within that segment itself.

The last dictionary is used to store the last index where each number appears in the array.

max_reach keeps track of the furthest index we can reach based on the last occurrences of elements we've seen so far.

If max_reach is equal to i, it means that all elements from the beginning of the current segment (implicitly defined) to index i have their last occurrences within this segment. Thus, we can make a cut (partition) at this index. The number of such cuts determines how many independent segments we have.

If the array can be divided into k such segments, the number of good partitions is 2**(k-1). This is because each of the k-1 boundaries between the segments can either be a partition or not, giving us 2 choices for each boundary.

Edge Cases:

  • If count is 0, it means max_reach never equals i. Thus, the whole array represents a single segment, and there is only 1 good partition (the array itself).
  • Modulo operation is used to keep the result within the constraints.

Time Complexity: O(n), where n is the length of the input array. This is because we iterate through the array once to populate the last dictionary and once more to determine the number of segments.

Space Complexity: O(n) in the worst case, where n is the number of unique elements in the input array. This space is used by the last dictionary to store the last occurrence of each unique element.