Taro Logo

Cache With Time Limit

Medium
Netflix logo
Netflix
9 views

Write a class that allows getting and setting key-value pairs, however a time until expiration is associated with each key.

The class has three public methods:

set(key, value, duration): accepts an integer key, an integer value, and a duration in milliseconds. Once the duration has elapsed, the key should be inaccessible. The method should return true if the same un-expired key already exists and false otherwise. Both the value and duration should be overwritten if the key already exists.

get(key): if an un-expired key exists, it should return the associated value. Otherwise it should return -1.

count(): returns the count of un-expired keys.

Example 1:

Input: 
actions = ["TimeLimitedCache", "set", "get", "count", "get"]
values = [[], [1, 42, 100], [1], [], [1]]
timeDelays = [0, 0, 50, 50, 150]
Output: [null, false, 42, 1, -1]
Explanation:
At t=0, the cache is constructed.
At t=0, a key-value pair (1: 42) is added with a time limit of 100ms. The value doesn't exist so false is returned.
At t=50, key=1 is requested and the value of 42 is returned.
At t=50, count() is called and there is one active key in the cache.
At t=100, key=1 expires.
At t=150, get(1) is called but -1 is returned because the cache is empty.

Example 2:

Input: 
actions = ["TimeLimitedCache", "set", "set", "get", "get", "get", "count"]
values = [[], [1, 42, 50], [1, 50, 100], [1], [1], [1], []]
timeDelays = [0, 0, 40, 50, 120, 200, 250]
Output: [null, false, true, 50, 50, -1, 0]
Explanation:
At t=0, the cache is constructed.
At t=0, a key-value pair (1: 42) is added with a time limit of 50ms. The value doesn't exist so false is returned.
At t=40, a key-value pair (1: 50) is added with a time limit of 100ms. A non-expired value already existed so true is returned and the old value was overwritten.
At t=50, get(1) is called which returned 50.
At t=120, get(1) is called which returned 50.
At t=140, key=1 expires.
At t=200, get(1) is called but the cache is empty so -1 is returned.
At t=250, count() returns 0 because the cache is empty.

Constraints:

  • 0 <= key, value <= 109
  • 0 <= duration <= 1000
  • 1 <= actions.length <= 100
  • actions.length === values.length
  • actions.length === timeDelays.length
  • 0 <= timeDelays[i] <= 1450
  • actions[i] is one of "TimeLimitedCache", "set", "get" and "count"
  • First action is always "TimeLimitedCache" and must be executed immediately, with a 0-millisecond delay

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the possible data types for the `key` and `value`? Can `key` be null or an empty string?
  2. Can the `duration` be zero or negative? If zero, should the value be considered immediately expired?
  3. What should happen if the system time changes during the operation of the cache (e.g., clock is adjusted backwards)?
  4. Are there any memory constraints I should be aware of? How many key-value pairs should the cache be able to hold?
  5. Is thread safety a requirement? Will the cache be accessed concurrently from multiple threads?

Brute Force Solution

Approach

The goal is to create a system that stores information for a limited time. The brute force approach to this problem involves directly implementing the storing and retrieving functionality, without any clever optimizations.

Here's how the algorithm would work step-by-step:

  1. When you need to save some information (a key and its value), simply store the information along with the exact moment it was saved.
  2. When you need to retrieve information, first check if the time that has passed since you saved the information is longer than the allowed time limit.
  3. If the time limit has passed, the information is considered expired and is no longer valid. Return nothing.
  4. If the time limit has not passed, the information is still valid. Return the information.
  5. When you need to update information, replace the old information with the new information and record the current time again.

Code Implementation

import time

class TimeLimitedCache:
    def __init__(self):
        self.cache = {}

    def set(self, key, value, duration):
        # Store the key-value pair with its expiration time
        self.cache[key] = {'value': value, 'expiration_time': time.time() + duration}

    def get(self, key):
        # Check if the key exists in the cache
        if key not in self.cache:
            return -1

        item = self.cache[key]
        # Check if the item has expired
        if time.time() > item['expiration_time']:
            del self.cache[key]

            return -1

        return item['value']

    def count(self):
        # Remove expired entries before counting
        current_time = time.time()
        keys_to_delete = []
        for key, item in self.cache.items():

            if current_time > item['expiration_time']:
                keys_to_delete.append(key)

        for key in keys_to_delete:

            del self.cache[key]

        return len(self.cache)

Big(O) Analysis

Time Complexity
O(1)The cache operations (set, get, count) all involve direct access to a dictionary or hash map. Setting a key-value pair, retrieving a value by its key, and counting the number of elements all take constant time, regardless of the number of items stored. The time limit check involves a simple subtraction and comparison, which is also a constant-time operation. Therefore, each method executes in O(1) time.
Space Complexity
O(N)The described cache implementation stores each key-value pair along with its timestamp. In the worst-case scenario, we might store N such pairs, where N is the total number of put operations performed. This implies that the auxiliary space required would scale linearly with the number of cached items. Therefore, the auxiliary space complexity is O(N).

Optimal Solution

Approach

We will create a system that remembers data but only for a certain amount of time. The key is to store each piece of data along with when it expires so that we can easily check if it's still valid when someone asks for it.

Here's how the algorithm would work step-by-step:

  1. When something is added to the system, store the data and the time it should expire.
  2. When someone asks for some data, first check if it exists in our system.
  3. If it exists, then look at the expiration time we stored. If the current time is later than the expiration time, the data is too old, so we remove it from our system and say it's not there.
  4. If the expiration time hasn't passed yet, the data is still valid. Return the data to the person who asked.
  5. When setting new data, also remove the old data so it's not stored twice.

Code Implementation

import time

class TimeLimitedCache:

    def __init__(self):
        self.cache = {}

    def set(self, key, value, duration):
        # Remove existing key to avoid duplicates.
        if key in self.cache:
            del self.cache[key]

        self.cache[key] = {'value': value, 'expiration': time.time() + duration}

    def get(self, key):
        if key not in self.cache:
            return -1

        if self.cache[key]['expiration'] <= time.time():
            # Delete expired entry from cache.
            del self.cache[key]
            return -1

        # Return valid value.
        return self.cache[key]['value']

    def count(self):
        # Remove expired entries before counting.
        keys_to_delete = []
        for key, value in self.cache.items():
            if value['expiration'] <= time.time():
                keys_to_delete.append(key)

        for key in keys_to_delete:
            del self.cache[key]
        # Return current amount of valid, unexpired keys.
        return len(self.cache)

Big(O) Analysis

Time Complexity
O(1)The operations performed include accessing and updating a dictionary or hash map, as well as retrieving the current time. All of these operations (setting, getting, and deleting key-value pairs) are typically constant time operations, O(1), on average, assuming a good hash function and no collisions. The expiration time check is also a constant time operation. Thus, the overall time complexity for each operation (set, get) is O(1).
Space Complexity
O(N)The system stores data along with its expiration time. In the worst-case scenario, the system might store N key-value pairs, where N is the number of unique data items added to the cache. Therefore, the auxiliary space required grows linearly with the number of cached items. This is due to a data structure (likely a hash map or dictionary) that holds the cached data and their respective expiration timestamps.

Edge Cases

CaseHow to Handle
Null keyRaise an exception or return an error code, as null keys are generally invalid in hash maps.
Negative durationTreat it as an invalid input by raising an exception or interpreting it as zero duration, effectively expiring the value immediately.
Zero durationThe value should be stored but considered immediately expired upon setting.
Very large duration that can cause integer overflowUse a data type that can accommodate large values (e.g., long in Java or Python) or limit the maximum duration accepted.
Setting the same key multiple times with different durations before expirationThe most recent set operation should always overwrite the previous value and duration.
Calling get on a key that exists but has just expired (right at the time boundary)The get function should return -1 as the time limit has been reached/passed
Calling count when multiple keys have expired but not been explicitly removedThe count function should iterate through the keys and check if the expiration time has passed, only counting the valid ones.
Simultaneous set and get operations on the same key from different threadsImplement thread safety using locks or other synchronization mechanisms to prevent race conditions and ensure data consistency.