Taro Logo

Moving Average from Data Stream

Easy
Nuro logo
Nuro
1 view
Topics:
ArraysSliding Windows

Given a stream of integers and a window size, calculate the moving average of all integers in the sliding window.

Implement the MovingAverage class:

  • MovingAverage(int size) Initializes the object with the size of the window size.
  • double next(int val) Returns the moving average of the last size values of the stream.

Example 1:

Input
["MovingAverage", "next", "next", "next", "next"]
[[3], [1], [10], [3], [5]]
Output
[null, 1.0, 5.5, 4.66667, 6.0]

Explanation
MovingAverage movingAverage = new MovingAverage(3);
movingAverage.next(1); // return 1.0 = 1 / 1
movingAverage.next(10); // return 5.5 = (1 + 10) / 2
movingAverage.next(3); // return 4.66667 = (1 + 10 + 3) / 3
movingAverage.next(5); // return 6.0 = (10 + 3 + 5) / 3

Constraints:

  • 1 <= size <= 1000
  • -105 <= val <= 105
  • At most 104 calls will be made to next.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the expected data type for the stream of integers, and can it include negative numbers, zeros, or floating-point numbers?
  2. What is the maximum size of the window, and what should the class return if the number of added data points is less than the window size?
  3. Can I assume that the window size will always be a positive integer?
  4. How should I handle potential overflow if the sum of the integers in the window becomes very large?
  5. Is there a maximum number of calls expected to the 'next' function, and what is the memory constraint I should consider?

Brute Force Solution

Approach

We need to calculate the average of a series of numbers as they arrive, but only for a specific number of recent entries. The brute force method keeps track of all the numbers seen so far and recalculates the average for the required recent values every time a new number comes in.

Here's how the algorithm would work step-by-step:

  1. When a new number appears, store it in a list of all numbers received so far.
  2. Keep adding numbers to the list as they arrive.
  3. When asked to calculate the average, look at the end of the list and consider only the specified number of most recent entries.
  4. Add up these recent numbers.
  5. Divide the sum by the number of recent entries that were summed, to obtain the average.
  6. Return this average as the result.

Code Implementation

class MovingAverage:
    def __init__(self, size):
        self.size = size
        self.all_numbers = []

    def next(self, value):
        self.all_numbers.append(value)

        # Ensure we only consider the 'size' most recent numbers
        window_size = min(self.size, len(self.all_numbers))

        # Calculate the sum of the recent numbers
        recent_numbers_sum = sum(self.all_numbers[-window_size:])

        # Calculate the average
        average = recent_numbers_sum / window_size
        return average

Big(O) Analysis

Time Complexity
O(n)The time complexity is determined by the `next` function, which is called for each incoming data point. While the list grows linearly with the number of elements (n) received, the `next` function only iterates through the last 'size' elements of the list to compute the average. Because 'size' is a fixed constant (the moving average window size), the summation step takes constant time, O(size), which simplifies to O(1). Therefore, the overall time complexity for processing n elements is O(n * 1) or O(n).
Space Complexity
O(N)The described algorithm stores all numbers received so far in a list. As new numbers arrive, they are appended to this list, causing it to grow linearly with the number of inputs. Thus, the auxiliary space required grows proportionally to N, where N is the total number of numbers received. Therefore, the space complexity is O(N).

Optimal Solution

Approach

The most efficient way to calculate the moving average is to keep a running sum of the numbers. Instead of recalculating the sum for each new average, we only update it by adding the new number and subtracting the oldest number.

Here's how the algorithm would work step-by-step:

  1. Keep track of the sum of the numbers we've seen so far.
  2. When a new number comes in, add it to the sum.
  3. If we have more numbers than the window size (the number of numbers we want to average), subtract the oldest number from the sum.
  4. Divide the current sum by the smaller value between the number of numbers we've seen, and the window size. This gives us the current moving average.
  5. Store the new number in a place where we can easily remember it when it becomes the oldest.

Code Implementation

class MovingAverage:

    def __init__(self, size: int):
        self.window_size = size
        self.numbers = []
        self.current_sum = 0

    def next(self, value: int) -> float:
        self.numbers.append(value)
        self.current_sum += value

        # If we exceed the window size, remove the oldest value.
        if len(self.numbers) > self.window_size:
            self.current_sum -= self.numbers.pop(0)

        # Take care of edge cases when calculating average.
        current_length = min(len(self.numbers), self.window_size)

        return self.current_sum / current_length

# Your MovingAverage object will be instantiated and called as such:
# movingAverage = MovingAverage(size)
# answer = movingAverage.next(value)

Big(O) Analysis

Time Complexity
O(1)The addNum function performs a fixed number of operations regardless of the size of the input stream. It adds the new value to the running sum, potentially removes the oldest element from the sum, and updates the queue. These are all constant-time operations. Therefore, the time complexity of the addNum function, which is the core operation, is O(1).
Space Complexity
O(K)The solution stores the incoming numbers in a data structure (implicitly a list or queue) to remember the oldest number. The size of this data structure is limited by the window size, K. Therefore, the auxiliary space used grows linearly with the window size K, to store these numbers. The space needed to store the sum and the moving average are constant and do not depend on K or the input data stream size.

Edge Cases

CaseHow to Handle
Window size is zero or negativeThrow an IllegalArgumentException, as a non-positive window size is invalid and doesn't make sense for a moving average.
Input stream is emptyReturn 0.0 as the moving average when the stream is empty, as there are no data points to average.
Window size is larger than the number of elements added to the streamCalculate the average using only the elements currently in the stream, effectively using a smaller window.
Integers in the stream are very large, potentially leading to integer overflow during summationUse long or double to store the sum to prevent potential integer overflows.
Adding a very large number of elements such that the data structure storing the elements becomes very large, potentially leading to memory issuesCircular queue implementation allows for constant memory usage regardless of number of calls to next().
Input contains a mix of very large positive and negative values which when summed will result in loss of precision when using floatsUsing double rather than float can mitigate precision loss problems.
Repeated calls to next() with identical values exceeding maximum integer sizes can lead to overflow.Use long or double to store intermediate calculations to prevent overflow, even if input values are integers.
Very small window size (e.g., 1) and rapidly changing input valuesThe moving average will closely reflect the latest value, which is the correct behavior.