Given an integer array nums
and an integer k
, return the number of good subarrays of nums
.
A good array is an array where the number of different integers in that array is exactly k
.
[1,2,3,1,2]
has 3
different integers: 1
, 2
, and 3
.A subarray is a contiguous part of an array.
Example 1:
Input: nums = [1,2,1,2,3], k = 2 Output: 7 Explanation: Subarrays formed with exactly 2 different integers: [1,2], [2,1], [1,2], [2,3], [1,2,1], [2,1,2], [1,2,1,2]
Example 2:
Input: nums = [1,2,1,3,4], k = 3 Output: 3 Explanation: Subarrays formed with exactly 3 different integers: [1,2,1,3], [2,1,3], [1,3,4].
Constraints:
1 <= nums.length <= 2 * 104
1 <= nums[i], k <= nums.length
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force strategy for this problem is all about checking every possible group of numbers within our list. We'll examine each group to see if it meets our criteria of having exactly K different numbers. If it does, we count it.
Here's how the algorithm would work step-by-step:
def subarrays_with_k_different_integers_brute_force(numbers, k_different):
number_of_valid_subarrays = 0
for start_index in range(len(numbers)):
for end_index in range(start_index, len(numbers)):
subarray = numbers[start_index:end_index + 1]
# Create a set of unique integers
unique_integers = set()
for number in subarray:
unique_integers.add(number)
# Check if the number of unique integers is equal to k_different
if len(unique_integers) == k_different:
number_of_valid_subarrays += 1
return number_of_valid_subarrays
The optimal approach cleverly leverages the principle of inclusion-exclusion. It cleverly transforms the problem into finding the difference between the number of subarrays with *at most* K distinct elements and the number of subarrays with *at most* K-1 distinct elements.
Here's how the algorithm would work step-by-step:
def subarraysWithKDistinct(numbers, k):
def atMostKDistinct(numbers, k):
window_start = 0
distinct_count = 0
frequency_map = {}
result = 0
for window_end in range(len(numbers)):
right_number = numbers[window_end]
if right_number not in frequency_map:
frequency_map[right_number] = 0
frequency_map[right_number] += 1
if frequency_map[right_number] == 1:
distinct_count += 1
# Shrink the window if distinct count exceeds k.
while distinct_count > k:
left_number = numbers[window_start]
frequency_map[left_number] -= 1
if frequency_map[left_number] == 0:
distinct_count -= 1
window_start += 1
result += window_end - window_start + 1
return result
# Inclusion-exclusion principle: K - (K-1)
return atMostKDistinct(numbers, k) - atMostKDistinct(numbers, k - 1)
Case | How to Handle |
---|---|
Empty input array (nums is null or has length 0) | Return 0 if the input array is null or empty since no subarrays can be formed. |
k is zero or negative | Return 0 if k is zero or negative because a subarray cannot have a non-positive number of distinct integers. |
k is greater than the length of the input array | Return 0 if k is greater than the length of nums since no subarray can contain more distinct integers than the array's size. |
Array contains only one element and k is 1 | Return 1, which is the number of subarrays if the array has only one element and k is equal to 1. |
Array contains duplicate numbers, and k is 1 | The sliding window logic correctly handles duplicate numbers and ensures the distinct count is accurate when k is 1. |
Array contains all identical numbers and k is greater than 1 | Return 0 because an array with all identical numbers cannot have more than 1 distinct number. |
Large input array with many duplicate values and k is a small number | The sliding window approach with frequency counting efficiently handles large input sizes and duplicate values as it only needs to iterate through the array once. |
Integer overflow in frequency counting or calculating window sizes | Use appropriate data types (e.g., long) for frequency counts and window sizes to avoid integer overflow issues. |