LRU Cache #1 Most Asked

Medium
Topics: Linked Lists

Design a data structure that follows the constraints of a Least Recently Used (LRU) cache.

Implement the LRUCache class:

  • LRUCache(int capacity) Initialize the LRU cache with positive size capacity.
  • int get(int key) Return the value of the key if the key exists, otherwise return -1.
  • void put(int key, int value) Update the value of the key if the key exists. Otherwise, add the key-value pair to the cache. If the number of keys exceeds the capacity from this operation, evict the least recently used key.

The functions get and put must each run in O(1) average time complexity.

Example 1:

Input
["LRUCache", "put", "put", "get", "put", "get", "put", "get", "get", "get"]
[[2], [1, 1], [2, 2], [1], [3, 3], [2], [4, 4], [1], [3], [4]]
Output
[null, null, null, 1, null, -1, null, -1, 3, 4]

Explanation
LRUCache lRUCache = new LRUCache(2);
lRUCache.put(1, 1); // cache is {1=1}
lRUCache.put(2, 2); // cache is {1=1, 2=2}
lRUCache.get(1);    // returns 1
lRUCache.put(3, 3); // LRU key was 2, evicts key 2, cache is {1=1, 3=3}
lRUCache.get(2);    // returns -1 (not found)
lRUCache.put(4, 4); // LRU key was 1, evicts key 1, cache is {4=4, 3=3}
lRUCache.get(1);    // returns -1 (not found)
lRUCache.get(3);    // returns 3
lRUCache.get(4);    // returns 4

Constraints:

  • 1 <= capacity <= 3000
  • 0 <= key <= 10^4
  • 0 <= value <= 10^5
  • At most 2 * 10^5 calls will be made to get and put.

Solution


Clarifying Questions

When you're asked this question in a real interview, it will often be ambiguous (especially at FAANG). In that case, make sure to ask questions like these:

  1. What are the expected data types for the keys and values? For instance, are they restricted to integers, or can they be other types like strings or custom objects?
  2. Is the capacity guaranteed to be a positive integer, or should I handle cases where the capacity is zero or negative?
  3. When a key is updated using `put` with a new value, should that key be considered the most recently used item?
  4. What is the expected behavior if a `put` operation is called with a key that already exists but the cache has a capacity of 1?
  5. Are there any constraints on the range or size of the keys and values that I should be aware of?

Brute Force Solution

Approach

Imagine we have a collection of items with a strict limit on how many we can hold. To solve this, we'll keep a simple list of all our items and manually track how recently each one was used every single time we add or access an item.

Here's how the algorithm would work step-by-step:

  1. First, we'll just store our key-value items in a simple, unordered list.
  2. We also need another list to keep track of the usage history of each item, noting down the order in which they were accessed.
  3. When we need to find an item, we'll look through our entire collection from beginning to end until we find it. If we find it, we'll update our usage history list to show that this item is now the most recently used.
  4. When we want to add a new item, we first check if it's already in our collection by searching through everything.
  5. If it's already there, we just update its value and mark it as the most recently used in our history list.
  6. If it's a completely new item and our collection is full, we have to make space. We'll look through our entire usage history to find the one item that was used the longest time ago.
  7. Once we identify the least recently used item, we remove it from our collection.
  8. Finally, we can add the new item to our collection and update our history list to show it's the newest, most recently used one.

Code Implementation

class LRUCacheBruteForce:

    def __init__(self, capacity):
        self.capacity = capacity
        # Use a list of tuples to store key-value pairs as described.
        self.item_collection = []
        # A separate list tracks the order of key usage.
        self.usage_history = []

    def get(self, key):
        found_index = -1
        for index, (item_key, item_value) in enumerate(self.item_collection):
            if item_key == key:
                found_index = index
                break

        if found_index != -1:
            # The key must be marked as most recently used by moving it in the history list.
            self.usage_history.remove(key)
            self.usage_history.append(key)
            return self.item_collection[found_index][1]
        
        return -1

    def put(self, key, value):
        found_index = -1
        for index, (item_key, item_value) in enumerate(self.item_collection):
            if item_key == key:
                found_index = index
                break

        if found_index != -1:
            self.item_collection[found_index] = (key, value)
            # An existing item's access also updates its position in the usage history.
            self.usage_history.remove(key)
            self.usage_history.append(key)
        else:
            if len(self.item_collection) >= self.capacity:
                # To make space, we must identify and remove the least recently used item.
                key_to_evict = self.usage_history[0]
                self.usage_history.pop(0)

                eviction_index = -1
                for index, (item_key, item_value) in enumerate(self.item_collection):
                    if item_key == key_to_evict:
                        eviction_index = index
                        break
                self.item_collection.pop(eviction_index)

            self.item_collection.append((key, value))
            self.usage_history.append(key)
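
To sanity-check the brute force class, here is a small driver (a quick sketch, not part of the interview answer) that replays Example 1 from the problem statement; the printed values should match the expected output.

cache = LRUCacheBruteForce(2)
cache.put(1, 1)        # cache is {1=1}
cache.put(2, 2)        # cache is {1=1, 2=2}
print(cache.get(1))    # 1
cache.put(3, 3)        # evicts key 2
print(cache.get(2))    # -1
cache.put(4, 4)        # evicts key 1
print(cache.get(1))    # -1
print(cache.get(3))    # 3
print(cache.get(4))    # 4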

Big(O) Analysis

Time Complexity
O(c)
The time complexity is driven by the need to scan through lists for every operation, where `c` is the cache capacity. For both `get` and `put` operations, we must search the entire collection of up to `c` items to find a key or determine the least recently used item. Each search requires iterating through the list, resulting in a linear number of steps proportional to the capacity. Therefore, the operations are approximately `c` steps, which simplifies to O(c).
Space Complexity
O(c)
The solution uses two primary data structures to store information. The first is an unordered list to hold the key-value pairs, and the second is a list to track the usage history. Both of these lists grow to a maximum size equal to the cache's capacity, which we'll call `c`. Therefore, the total auxiliary space required is directly proportional to the capacity, resulting in O(c) space complexity.

Optimal Solution

Approach

The core idea is to use two structures working together: a quick lookup table to find items instantly, and an ordered list to keep track of how recently items were used. This combination allows us to find, use, and remove items very quickly without searching through the entire collection each time.

Here's how the algorithm would work step-by-step:

  1. Imagine you have a line of people, where the person at the front is the most recently used and the person at the back is the least recently used.
  2. You also have a directory that tells you exactly where each person is standing in the line, so you can find them instantly without having to ask everyone.
  3. When you need to get an item, you first check your directory. If it's not there, it's a miss. If it is there, you find it in the line, and move it to the very front to show it's now the most recently used.
  4. When you add a new item, you place it at the very front of the line because it's the newest. You also add its location to your directory.
  5. If adding a new item makes the line too long (exceeds your capacity), you must kick out the person at the very back of the line, as they are the least recently used.
  6. After kicking someone out, you must also remove their entry from your directory so you don't look for them later.
  7. By always updating the line and the directory together, you can instantly find any item and always know which one is the least recently used and ready to be removed.

Code Implementation

class Node:
    # A doubly linked list node holding one key-value pair.
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.previous_node = None
        self.next_node = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        # Maps each key to its node in the doubly linked list.
        self.cache_directory = {}
        # Sentinel nodes remove edge cases when inserting or removing at either end.
        self.head_sentinel = Node(0, 0)
        self.tail_sentinel = Node(0, 0)
        self.head_sentinel.next_node = self.tail_sentinel
        self.tail_sentinel.previous_node = self.head_sentinel

    def _remove_node(self, node_to_remove):
        # Unlink a node from its current position in the list.
        previous_node = node_to_remove.previous_node
        next_node = node_to_remove.next_node
        previous_node.next_node = next_node
        next_node.previous_node = previous_node

    def _add_to_front(self, node_to_add):
        # Insert a node right after the head sentinel, the most recently used position.
        node_to_add.previous_node = self.head_sentinel
        node_to_add.next_node = self.head_sentinel.next_node
        self.head_sentinel.next_node.previous_node = node_to_add
        self.head_sentinel.next_node = node_to_add

    def get(self, key):
        if key in self.cache_directory:
            found_node = self.cache_directory[key]
            # Move the accessed node to the front to mark it as most recently used.
            self._remove_node(found_node)
            self._add_to_front(found_node)
            return found_node.value
        return -1

    def put(self, key, value):
        if key in self.cache_directory:
            node_to_update = self.cache_directory[key]
            node_to_update.value = value
            self._remove_node(node_to_update)
            self._add_to_front(node_to_update)
            return

        # If the cache is full, we must evict the least recently used item before adding a new one.
        if len(self.cache_directory) == self.capacity:
            least_recently_used_node = self.tail_sentinel.previous_node
            self._remove_node(least_recently_used_node)
            # Remove the evicted item's entry from the directory to maintain consistency.
            del self.cache_directory[least_recently_used_node.key]

        new_node = Node(key, value)
        self.cache_directory[key] = new_node
        # Place the new item at the front, as it is now the most recently used item.
        self._add_to_front(new_node)
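
As a quick sketch (not part of the interview answer itself), the same Example 1 sequence can be replayed against this optimal class; the printed values should match the expected output.

lru_cache = LRUCache(2)
lru_cache.put(1, 1)        # cache is {1=1}
lru_cache.put(2, 2)        # cache is {1=1, 2=2}
print(lru_cache.get(1))    # 1
lru_cache.put(3, 3)        # evicts key 2
print(lru_cache.get(2))    # -1
lru_cache.put(4, 4)        # evicts key 1
print(lru_cache.get(1))    # -1
print(lru_cache.get(3))    # 3
print(lru_cache.get(4))    # 4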

Big(O) Analysis

Time Complexity
O(1)
The solution uses a hash map for instant lookups and a doubly linked list to maintain the order of use. The `get` operation involves a hash map lookup and moving a node to the front of the list, both of which are constant time operations. The `put` operation involves a hash map insertion, adding a new node to the front of the list, and potentially removing the last node and its corresponding hash map entry, all of which are also constant time operations. Since each core operation takes a fixed amount of time regardless of the cache's capacity, the time complexity is O(1).
Space Complexity
O(capacity)
The space complexity is determined by the two data structures used to implement the cache. We use a 'directory', which is a hash map, and a 'line', which is a doubly linked list, to store the cache items. In the worst-case scenario, both of these structures will grow to store a number of elements equal to the cache's specified capacity. Therefore, the auxiliary space required is directly proportional to the capacity, resulting in O(capacity) space complexity.
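
As a design note, Python's standard library already pairs a hash map with an ordered structure in collections.OrderedDict, so the same O(1) behavior can be sketched without a hand-rolled linked list. This is an alternative to the solution above, not a replacement for it; the class name LRUCacheOrderedDict is just an illustrative label.

from collections import OrderedDict

class LRUCacheOrderedDict:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key):
        if key not in self.cache:
            return -1
        # Mark the key as most recently used by moving it to the end.
        self.cache.move_to_end(key)
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            # Evict the least recently used entry, which sits at the front.
            self.cache.popitem(last=False)

Interviewers often expect the explicit doubly linked list, so treat this as shorthand for validating behavior rather than as the primary answer.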

Edge Cases

Initializing the cache with a capacity of zero
How to Handle:
The cache can never store any items, so a put operation is effectively a no-op (the new item is immediately evicted) and get should always return -1. The stated constraints guarantee capacity >= 1, so confirm with your interviewer whether this case can actually occur.
Initializing the cache with a capacity of one
How to Handle:
The cache can only hold a single item, so every put operation either updates the existing item or replaces it with the new one.
Performing a get operation on an empty cache
How to Handle:
The get operation must return -1 as the key cannot exist in an empty cache.
Performing a put operation on a key that already exists
How to Handle:
The existing value should be updated, and the key should be marked as the most recently used item without changing the cache size.
Performing a put operation when the cache is full with a new key
How to Handle:
The least recently used item must be evicted from both the hash map and the doubly linked list before the new item is inserted.
Performing a get operation on a key that exists
How to Handle:
This operation should mark the accessed key as the most recently used item by moving its node to the head of the list.
A rapid sequence of put operations that exceeds the capacity
How to Handle:
The solution must efficiently evict the LRU item for each new insertion after the cache reaches its capacity limit.
Getting the least recently used item and then immediately putting a new item
How to Handle:
Getting the LRU item makes it the MRU item, so the subsequent put operation should evict the new LRU item, which was previously the second least recently used.
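
The following assertion-style checks (a sketch against the optimal LRUCache class above, not exhaustive tests) exercise a few of these edge cases: a capacity of one, a get on an empty cache, and the get-then-put eviction order.

# Capacity of one: every new key replaces the existing item.
tiny_cache = LRUCache(1)
tiny_cache.put(1, 1)
tiny_cache.put(2, 2)           # evicts key 1
assert tiny_cache.get(1) == -1
assert tiny_cache.get(2) == 2

# A get on an empty cache returns -1.
empty_cache = LRUCache(2)
assert empty_cache.get(42) == -1

# Getting the LRU item promotes it, so the next eviction removes the other key.
pair_cache = LRUCache(2)
pair_cache.put(1, 1)
pair_cache.put(2, 2)
pair_cache.get(1)              # key 1 becomes the most recently used
pair_cache.put(3, 3)           # evicts key 2, not key 1
assert pair_cache.get(2) == -1
assert pair_cache.get(1) == 1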