Taro Logo

Number of Sub-arrays of Size K and Average Greater than or Equal to Threshold

Medium
Asked by:
Profile picture
Profile picture
Profile picture
14 views
Topics:
ArraysSliding Windows

Given an array of integers arr and two integers k and threshold, return the number of sub-arrays of size k and average greater than or equal to threshold.

Example 1:

Input: arr = [2,2,2,2,5,5,5,8], k = 3, threshold = 4
Output: 3
Explanation: Sub-arrays [2,5,5],[5,5,5] and [5,5,8] have averages 4, 5 and 6 respectively. All other sub-arrays of size 3 have averages less than 4 (the threshold).

Example 2:

Input: arr = [11,13,17,23,29,31,7,5,2,3], k = 3, threshold = 5
Output: 6
Explanation: The first 6 sub-arrays of size 3 have averages greater than 5. Note that averages are not integers.

Constraints:

  • 1 <= arr.length <= 105
  • 1 <= arr[i] <= 104
  • 1 <= k <= arr.length
  • 0 <= threshold <= 104

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the possible ranges for the numbers in the input array, and can the numbers be negative, zero, or floating-point values?
  2. What are the constraints on the size of the input array (the number of elements) and the value of k (the sub-array size)?
  3. What are the possible values for the threshold, and can it also be a floating-point number?
  4. If no sub-arrays meet the criteria (average >= threshold), what value should I return?
  5. Is k always a valid value, meaning will it always be less than or equal to the size of the input array?

Brute Force Solution

Approach

The brute force approach to this problem is very straightforward. We're going to look at every possible group of numbers of the correct size and check each one to see if its average is good enough. If it is, we count it.

Here's how the algorithm would work step-by-step:

  1. Start by considering the first group of numbers of the specified size.
  2. Calculate the average of this group.
  3. Check if this average is greater than or equal to the given threshold.
  4. If it is, increase our count of good groups.
  5. Now, move to the next group of numbers, shifting one position to the right.
  6. Repeat the process of calculating the average and checking if it meets the threshold, updating our count accordingly.
  7. Continue this process until we have examined all possible groups of the specified size within the entire set of numbers.
  8. The final count will represent the number of groups that meet the required average threshold.

Code Implementation

def number_of_subarrays(
    numbers, subarray_length, threshold
):
    number_of_subarrays_meeting_criteria = 0
    # Iterate through the array, considering each possible start
    for start_index in range(len(numbers) - subarray_length + 1):

        current_subarray_sum = 0
        # Calculate the sum of the current sub-array.
        for index_in_subarray in range(subarray_length):
            current_subarray_sum += numbers[
                start_index + index_in_subarray
            ]

        # Calculate the average of the current sub-array.
        current_subarray_average =
        current_subarray_sum / subarray_length

        # Check if the average meets the specified threshold.
        if current_subarray_average >= threshold:
            # Increment the result if the criteria is met.
            number_of_subarrays_meeting_criteria += 1

    return number_of_subarrays_meeting_criteria

Big(O) Analysis

Time Complexity
O(n*k)The algorithm iterates through the array of size n, creating sub-arrays of size k. For each sub-array, the average is calculated which requires k operations. The process is repeated for all possible starting positions, resulting in (n-k+1) iterations. Therefore, the time complexity is (n-k+1)*k which simplifies to O(n*k) because k is considered as part of the input and cannot be considered constant.
Space Complexity
O(1)The provided solution only uses a few constant space variables like the loop counter or a variable to store the current average or a count of qualifying subarrays. It does not create any auxiliary data structures whose size depends on the input array's size (N) or the subarray size (K). Therefore, the space complexity remains constant regardless of the input size.

Optimal Solution

Approach

The goal is to efficiently count segments of a specific length within a set of numbers that meet a certain average requirement. The core idea is to avoid recomputing the sum of each segment from scratch, instead cleverly reusing previous calculations.

Here's how the algorithm would work step-by-step:

  1. First, calculate the sum of the very first segment of the required length.
  2. Check if the average of this initial segment meets the threshold requirement and if so, increment the counter.
  3. Then, to move to the next potential segment, subtract the first number of the previous segment from the sum, and add the next number in the entire set.
  4. With this new sum, we can easily calculate the average of this new segment and compare it to the threshold, incrementing the counter again if needed.
  5. Repeat the process of subtracting the leftmost number and adding the next number to efficiently update the sum as you slide through the set.
  6. Continue until you've considered all possible segments of the specified length.

Code Implementation

def number_of_subarrays(input_array, k_length, threshold):
    number_of_subarrays_result = 0
    current_sum = 0

    # Calculate the sum of the first k_length elements
    for i in range(k_length):
        current_sum += input_array[i]

    # Check if the average of the first subarray meets the threshold
    if current_sum / k_length >= threshold:
        number_of_subarrays_result += 1

    # Iterate through the rest of the array
    for i in range(k_length, len(input_array)): 

        # Efficiently update the sum by subtracting the leftmost and adding the next element
        current_sum -= input_array[i - k_length]
        current_sum += input_array[i]

        # Check if the average of the current subarray meets the threshold
        if current_sum / k_length >= threshold:
            number_of_subarrays_result += 1

    return number_of_subarrays_result

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the array of size n once to calculate the initial sum of the first sub-array of size k. Then, it slides the window of size k across the remaining elements of the array. Inside the loop, a constant number of operations (subtraction and addition) are performed for each element after the initial k elements. Therefore, the time complexity is directly proportional to the input size n, resulting in O(n).
Space Complexity
O(1)The algorithm calculates a running sum and maintains a count. It uses a single variable to store the current sum of the sub-array and another to count the number of sub-arrays that meet the criteria. No additional data structures that scale with the input size N (the number of elements in the input array) are used. Therefore, the auxiliary space required is constant, independent of N.

Edge Cases

Empty input array
How to Handle:
Return 0, since no subarrays can be formed from an empty array.
K is zero or negative
How to Handle:
Return 0, since a subarray of non-positive size is impossible.
K is larger than the size of the input array
How to Handle:
Return 0, since no subarrays of size K can be formed if K exceeds the array's size.
Input array contains negative numbers
How to Handle:
The algorithm should correctly handle negative numbers in the average calculation.
Input array contains very large numbers that could cause overflow during sum calculation
How to Handle:
Use a data type with sufficient range (e.g., long) to store the sum and consider dividing by K first to minimize overflow.
Threshold is very large or very small (close to infinity or negative infinity)
How to Handle:
The comparison `average >= threshold` should still hold true for extreme threshold values.
All elements are identical and less than the threshold
How to Handle:
Return 0 if the identical value is less than the threshold, as no subarray's average will reach the threshold.
K is equal to the length of the array
How to Handle:
The average of the entire array should be compared to the threshold.