Taro Logo

Moving Average from Data Stream

Easy
Amazon logo
Amazon
1 view
Topics:
ArraysSliding WindowsDatabase Problems

Design a data structure that calculates the moving average of a stream of numbers.

Details:

  • Implement the MovingAverage class:
    • MovingAverage(int size) Initializes the object with the size of the window size.
    • double next(int val) Returns the moving average of the last size values of the stream.

Example:

MovingAverage m = new MovingAverage(3);
m.next(1); // return 1.0 = 1 / 1
m.next(10); // return 5.5 = (1 + 10) / 2
m.next(3); // return 4.66667 = (1 + 10 + 3) / 3
m.next(5); // return 6.0 = (10 + 3 + 5) / 3

Follow-up:

  • Can you implement it such that each call to next() takes O(1) time?

Solution


Moving Average from Data Stream

Naive Solution

A simple approach is to store all the numbers in a list and, for each call to next(val), append the number to the list. Calculate the moving average by summing the last size numbers in the list and dividing by the minimum of size and the current number of elements in the list.

Code (Naive)

class MovingAverage:
    def __init__(self, size: int):
        self.size = size
        self.window = []

    def next(self, val: int) -> float:
        self.window.append(val)
        window_size = len(self.window)
        count = min(self.size, window_size)
        sum_values = sum(self.window[window_size - count:])
        return sum_values / count

Big O Analysis (Naive)

  • Time Complexity: O(k) for each call to next, where k is the size of the moving average window. This is due to the sum() operation.
  • Space Complexity: O(n), where n is the number of elements added to the data stream. This is due to storing all numbers in the window list.

Optimal Solution

A more efficient solution uses a queue (or a list used as a queue) to store the elements in the moving average window and a variable to keep track of the current sum of the elements in the window. When a new element comes in, we subtract the oldest element (if the queue is full) from the current sum and add the new element to the sum. We also update the queue by adding the new element and removing the oldest element if the queue is full.

Code (Optimal)

from collections import deque

class MovingAverage:
    def __init__(self, size: int):
        self.size = size
        self.window = deque()
        self.sum = 0

    def next(self, val: int) -> float:
        self.sum += val
        self.window.append(val)

        if len(self.window) > self.size:
            self.sum -= self.window.popleft()

        return self.sum / len(self.window)

Big O Analysis (Optimal)

  • Time Complexity: O(1) for each call to next. Adding and removing elements from a deque takes constant time.
  • Space Complexity: O(k), where k is the size of the moving average window, as we only store at most size elements in the queue.

Edge Cases

  • Empty Stream: The first size calls to next will return the average of the elements seen so far, which might be less than the full window size.
  • Zero Size: If the size is zero, the moving average is undefined. The problem statement should specify what happens in this case (e.g., return 0, throw an exception).
  • Negative Values: The algorithm should handle negative values correctly.
  • Large Input Stream: Make sure the data types used to store the sum and the values can handle a potentially large number of inputs to avoid overflow issues. It might be wise to use float to store the sum to prevent this, or use a larger data type.