Subarrays Distinct Element Sum of Squares I Interview Question for Amazon

You are given a 0-indexed integer array nums. The distinct count of a subarray of nums is defined as: Let nums[i..j] be a subarray of nums consisting of all the indices from i to j such that 0 <= i <= j < nums.length. Then the number of distinct values in nums[i..j] is called the distinct count of nums[i..j]. Return the sum of the squares of distinct counts of all subarrays of nums. A subarray is a contiguous non-empty sequence of elements within an array.

For example: nums = [1,2,1] Output: 15 Explanation: Six possible subarrays are: [1]: 1 distinct value [2]: 1 distinct value [1]: 1 distinct value [1,2]: 2 distinct values [2,1]: 2 distinct values [1,2,1]: 2 distinct values The sum of the squares of the distinct counts in all subarrays is equal to 1^2 + 1^2 + 1^2 + 2^2 + 2^2 + 2^2 = 15.

nums = [1,1] Output: 3 Explanation: Three possible subarrays are: [1]: 1 distinct value [1]: 1 distinct value [1,1]: 1 distinct value The sum of the squares of the distinct counts in all subarrays is equal to 1^2 + 1^2 + 1^2 = 3.

def sum_of_squares_of_distinct_counts(nums):
    n = len(nums)
    total_sum = 0

    for i in range(n):
        for j in range(i, n):
            sub_array = nums[i:j+1]
            distinct_count = len(set(sub_array))
            total_sum += distinct_count ** 2

    return total_sum

# Example usage:
nums1 = [1, 2, 1]
print(sum_of_squares_of_distinct_counts(nums1))  # Output: 15

nums2 = [1, 1]
print(sum_of_squares_of_distinct_counts(nums2))  # Output: 3

Explanation:

Initialization:
- n is initialized to the length of the input array nums. This will be used for iterating through subarrays.
- total_sum is initialized to 0. This variable will accumulate the sum of the squares of the distinct counts of each subarray.
Outer Loop (Starting Index):
- The outer loop iterates from i = 0 to n - 1. i represents the starting index of the subarray.
Inner Loop (Ending Index):
- The inner loop iterates from j = i to n - 1. j represents the ending index of the subarray.
Subarray Extraction:
- sub_array = nums[i:j+1] extracts the subarray starting from index i and ending at index j (inclusive).
Distinct Count Calculation:
- distinct_count = len(set(sub_array)) calculates the number of distinct elements in the sub_array. By converting the subarray to a set, duplicate elements are automatically removed, and len() gives the count of the remaining unique elements.
Sum Update:
- total_sum += distinct_count ** 2 updates the total_sum by adding the square of the distinct_count.
Return Value:
- After iterating through all possible subarrays, the function returns the total_sum, which is the sum of the squares of the distinct counts of all subarrays.

Big(O) Runtime Analysis:

The runtime complexity is O(n^3). Here's why:

Outer Loop: The outer loop runs n times.
Inner Loop: The inner loop, nested inside the outer loop, also runs approximately n times on average.
Subarray Extraction: nums[i:j+1] takes O(n) time in the worst case because it can potentially create a copy of almost the entire input array.
set() operation: Creating a set from the subarray takes O(n) time in the worst case (when all elements in the subarray are distinct).

Therefore, the dominant operations inside the loops (subarray extraction and set creation) both contribute O(n) complexity, resulting in O(n * n * n) = O(n^3) time complexity.

Big(O) Space Usage Analysis:

The space complexity is O(n). Here's why:

set() operation: The set(sub_array) operation creates a new set to store the distinct elements of the subarray. In the worst case (when all elements in the subarray are distinct), the set can grow to the size of the subarray, which is at most n.
Auxiliary Variables: The other variables (n, total_sum, i, j, sub_array, distinct_count) take up constant space, O(1).

Thus, the dominant space usage comes from the set() operation, leading to O(n) space complexity.

Edge Cases:

Empty Input Array:
- If the input array nums is empty, the loops won't execute, and the function will return 0, which is the correct behavior since there are no subarrays.
Array with Single Element:
- If the input array nums has only one element, there will be only one subarray (the array itself). The distinct count will be 1, and the function will return 1.
Array with Duplicate Elements:
- The code handles duplicate elements correctly because the set() operation automatically removes duplicates when calculating the distinct count.
Array with All Same Elements:
- If the input array nums contains all the same elements (e.g., [2, 2, 2]), all subarrays will have a distinct count of 1. The function will correctly calculate the sum of squares of these distinct counts.
Large Input Array:
- For very large input arrays, the O(n^3) time complexity might become a performance bottleneck. Optimization techniques like using a sliding window approach with a hashmap to track distinct counts could be employed to improve performance, although the constraints specify a small input size.