Taro Logo

Length of Longest Subarray With at Most K Frequency

Medium
Asked by:
Profile picture
Profile picture
Profile picture
Profile picture
+3
More companies
Profile picture
Profile picture
Profile picture
59 views
Topics:
ArraysSliding WindowsTwo Pointers

You are given an integer array nums and an integer k.

The frequency of an element x is the number of times it occurs in an array.

An array is called good if the frequency of each element in this array is less than or equal to k.

Return the length of the longest good subarray of nums.

A subarray is a contiguous non-empty sequence of elements within an array.

Example 1:

Input: nums = [1,2,3,1,2,3,1,2], k = 2
Output: 6
Explanation: The longest possible good subarray is [1,2,3,1,2,3] since the values 1, 2, and 3 occur at most twice in this subarray. Note that the subarrays [2,3,1,2,3,1] and [3,1,2,3,1,2] are also good.
It can be shown that there are no good subarrays with length more than 6.

Example 2:

Input: nums = [1,2,1,2,1,2,1,2], k = 1
Output: 2
Explanation: The longest possible good subarray is [1,2] since the values 1 and 2 occur at most once in this subarray. Note that the subarray [2,1] is also good.
It can be shown that there are no good subarrays with length more than 2.

Example 3:

Input: nums = [5,5,5,5,5,5,5], k = 4
Output: 4
Explanation: The longest possible good subarray is [5,5,5,5] since the value 5 occurs 4 times in this subarray.
It can be shown that there are no good subarrays with length more than 4.

Constraints:

  • 1 <= nums.length <= 105
  • 1 <= nums[i] <= 109
  • 1 <= k <= nums.length

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the possible ranges for the values in the input array, and can the array contain negative numbers or zeros?
  2. What is the expected behavior if the input array is null or empty? Should I return 0 in those cases?
  3. If there are multiple subarrays of the same maximum length that satisfy the condition, should I return any one of them, or is there a specific one I should return (e.g., the first one encountered)?
  4. What are the constraints on the value of 'K'? Can it be negative, zero, or larger than the size of the array?
  5. Are the array elements integers, and if so, what is the range of possible integer values to consider potential integer overflow issues?

Brute Force Solution

Approach

To find the longest section where no value appears too many times, the brute force method looks at every possible section. It checks each section to see if it meets the criteria, keeping track of the longest one that does.

Here's how the algorithm would work step-by-step:

  1. Consider every possible starting point for a section of the input.
  2. For each starting point, consider every possible ending point (that comes after the start). This gives us all possible sections.
  3. For each section, count how many times each value appears.
  4. Check if any value appears more than the allowed number of times. If so, this section is not valid.
  5. If the section is valid (no value appears too many times), compare its length to the length of the longest valid section we've found so far.
  6. If the current section is longer, remember it as the new longest valid section.
  7. After checking all possible sections, the section we remembered as the longest is the answer.

Code Implementation

def find_longest_subarray_brute_force(numbers, max_frequency):

    longest_subarray_length = 0

    for start_index in range(len(numbers)):
        for end_index in range(start_index, len(numbers)):
            subarray = numbers[start_index:end_index + 1]

            # Count occurrences of each number in the current subarray
            number_counts = {}

            for number in subarray:
                if number in number_counts:
                    number_counts[number] += 1
                else:
                    number_counts[number] = 1

            is_valid = True

            # Check if any number appears more than max_frequency times
            for number in number_counts:
                if number_counts[number] > max_frequency:
                    is_valid = False
                    break

            # Update longest subarray length if needed
            if is_valid:
                current_subarray_length = len(subarray)

                # Keep track of the longest valid subarray
                if current_subarray_length > longest_subarray_length:

                    longest_subarray_length = current_subarray_length

    return longest_subarray_length

Big(O) Analysis

Time Complexity
O(n³)The brute force approach iterates through all possible subarrays. There are O(n²) possible subarrays, determined by choosing a start and end index within the array of size n. For each subarray, we count the frequency of each element, which takes O(n) time in the worst case since we might iterate through all elements of the subarray. Therefore, the total time complexity is O(n² * n) = O(n³).
Space Complexity
O(N)The brute force approach, as described, necessitates counting the frequency of each value within every possible subarray. This requires a data structure, typically a hash map or an array, to store these counts. In the worst-case scenario, the subarray can be nearly the size of the input array, so the frequency map's size scales linearly with the input size N, where N is the length of the input array. Thus, the auxiliary space required is proportional to N, and the space complexity is O(N).

Optimal Solution

Approach

The goal is to find the longest continuous section where no value appears too many times. The clever trick is to use a 'sliding window' approach: think of it like looking at the data through a rectangle that expands or contracts as needed.

Here's how the algorithm would work step-by-step:

  1. Imagine you have a movable window that initially covers just the first value.
  2. Expand the window to include the next value. Count how many times each value appears inside the window.
  3. Check if any value's count exceeds the allowed limit. If everything is okay, keep expanding the window.
  4. If some value's count is too high, shrink the window from the beginning until all values are within the limit.
  5. Keep track of the longest window you find that meets the requirements. This approach ensures you examine all possible continuous sections without repeating calculations unnecessarily.

Code Implementation

def longest_subarray_with_k_frequency(numbers, max_frequency):
    window_start = 0
    max_length = 0
    frequency_map = {}

    for window_end in range(len(numbers)):
        right_number = numbers[window_end]
        if right_number not in frequency_map:
            frequency_map[right_number] = 0
        frequency_map[right_number] += 1

        # Shrink the window if any element exceeds max_frequency
        while any(count > max_frequency for count in frequency_map.values()):
            left_number = numbers[window_start]

            # Decrement the frequency of the element leaving the window
            frequency_map[left_number] -= 1
            window_start += 1

            # Remove the element if its frequency becomes 0
            if frequency_map[left_number] == 0:
                del frequency_map[left_number]

        # Update the maximum length of the valid subarray
        max_length = max(max_length, window_end - window_start + 1)

    return max_length

Big(O) Analysis

Time Complexity
O(n)The algorithm uses a sliding window approach. The outer loop effectively iterates through the array once by expanding the window (right pointer). The inner loop, shrinking the window (left pointer), may iterate multiple times for a single expansion of the outer loop, but the key is that the left pointer can only move from 0 to n-1 across the *entire* process, making the number of shrinks across all iterations bounded by O(n). Since both the expansion and shrinking operations are at most proportional to n, the overall time complexity is O(n).
Space Complexity
O(N)The algorithm uses a sliding window approach, maintaining a count of each value's frequency within the current window. This requires a hash map (or dictionary) to store these frequencies. In the worst case, all N elements in the input array could be distinct, leading to a hash map storing N key-value pairs. Therefore, the auxiliary space used is proportional to N, making the space complexity O(N).

Edge Cases

Empty array input
How to Handle:
Return 0, as there's no subarray.
k = 0 and array has repeated elements
How to Handle:
The longest subarray will be of length 1, return 1.
Array with all identical elements and k is less than the frequency of those elements
How to Handle:
Return the length of the longest subarray with frequency limited by k, which equals k.
Array with all identical elements and k is greater than or equal to the frequency of those elements
How to Handle:
Return the length of the array, as all elements satisfy the condition.
Array with a single element.
How to Handle:
Return 1, as the subarray consisting of only that element is the longest.
Large array with k being a large number
How to Handle:
Solution should efficiently track frequencies, avoiding brute-force recalculations with each window slide.
Array with negative numbers.
How to Handle:
Hash map used to store element frequencies should correctly handle negative numbers as keys.
k is larger than array length
How to Handle:
Return the length of the array, as all elements can be included.