Taro Logo

LRU Cache

#7 Most AskedMedium
13 views
Topics:
Linked ListsDesign

Design a data structure that follows the constraints of a Least Recently Used (LRU) cache.

Implement the LRUCache class:

  • LRUCache(int capacity) Initialize the LRU cache with positive size capacity.
  • int get(int key) Return the value of the key if the key exists, otherwise return -1.
  • void put(int key, int value) Update the value of the key if the key exists. Otherwise, add the key-value pair to the cache. If the number of keys exceeds the capacity from this operation, evict the least recently used key.

The functions get and put must each run in O(1) average time complexity.

Example 1:

Input
["LRUCache", "put", "put", "get", "put", "get", "put", "get", "get", "get"]
[[2], [1, 1], [2, 2], [1], [3, 3], [2], [4, 4], [1], [3], [4]]
Output
[null, null, null, 1, null, -1, null, -1, 3, 4]

Explanation
LRUCache lRUCache = new LRUCache(2);
lRUCache.put(1, 1); // cache is {1=1}
lRUCache.put(2, 2); // cache is {1=1, 2=2}
lRUCache.get(1);    // return 1
lRUCache.put(3, 3); // LRU key was 2, evicts key 2, cache is {1=1, 3=3}
lRUCache.get(2);    // returns -1 (not found)
lRUCache.put(4, 4); // LRU key was 1, evicts key 1, cache is {4=4, 3=3}
lRUCache.get(1);    // return -1 (not found)
lRUCache.get(3);    // return 3
lRUCache.get(4);    // return 4

Constraints:

  • 1 <= capacity <= 3000
  • 0 <= key <= 104
  • 0 <= value <= 105
  • At most 2 * 105 calls will be made to get and put.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the data types for the keys and values? For instance, are they restricted to integers, or can they be any type?
  2. What should the `get` method return if a key is not found in the cache?
  3. Does the capacity have any constraints? For example, can the capacity be zero or negative, and how should the cache behave in those cases?
  4. If we `put` a key that already exists, should this be treated as an update operation, and does it count as the most recent use of that key?
  5. Are there any specific constraints on the range of values for keys, values, or the capacity itself that might affect the choice of data structures?

Brute Force Solution

Approach

To manage a collection of items with a fixed size limit, the brute force approach keeps a simple list of all the items. Every time an action happens, like adding or retrieving an item, we go through the entire list to update it and find the oldest item to remove when needed.

Here's how the algorithm would work step-by-step:

  1. First, think of our cache as a single, ordered line of items.
  2. When we need to find an item, we will look at every single item in the line, one by one, from the beginning until we find it.
  3. If we find the item, we take it out from its current spot and move it to the very front of the line to show it was just used.
  4. When we want to add a new item, we first check if it's already somewhere in our line by looking through every existing item.
  5. If it's already there, we just move it to the front of the line.
  6. If the new item is not in the line, we check if the line is full. If there's space, we simply place the new item at the front.
  7. If the line is full and we need to add a new item, we must make space. To do this, we search through the entire line to find the one that has been sitting there the longest, which will be the one at the very end. We remove that last one.
  8. After removing the oldest item, we then place the new item at the very front of the line.

Code Implementation

class LRUCacheBruteForce:

    def __init__(self, capacity_limit: int):
        self.capacity_limit = capacity_limit
        self.cache_list = []  # Represents a key-value pair as a list [key, value]

    def get(self, key_to_find: int) -> int:
        item_to_move = None
        item_index = -1

        for current_index, cache_item in enumerate(self.cache_list):
            if cache_item[0] == key_to_find:
                item_to_move = cache_item
                item_index = current_index
                break
        
        # If the item is found, it must be moved to the front to mark it as recently used.

        if item_to_move:
            self.cache_list.pop(item_index)
            self.cache_list.insert(0, item_to_move)
            return item_to_move[1]
        
        return -1

    def put(self, key_to_add: int, value_to_add: int) -> None:
        existing_item_index = -1
        for current_index, cache_item in enumerate(self.cache_list):
            if cache_item[0] == key_to_add:
                existing_item_index = current_index
                break

        if existing_item_index != -1:
            # If the key already exists, we remove it to re-insert it at the front later.

            self.cache_list.pop(existing_item_index)
        elif len(self.cache_list) >= self.capacity_limit:
            # If the cache is full, the least recently used item (at the end) must be removed.

            self.cache_list.pop()

        # The new or updated item is placed at the front, marking it as most recently used.

        self.cache_list.insert(0, [key_to_add, value_to_add])

Big(O) Analysis

Time Complexity
O(n)The time complexity is driven by the need to scan the entire list for each operation. In the worst-case scenario for both get and put operations, we must iterate through all n items in the cache to find a specific key or to locate the least recently used item at the end. For a put operation that finds the key, we scan the list to find it (O(n)) and then perform another O(n) operation to remove it and re-insert it at the front. Therefore, the number of operations is directly proportional to n, the capacity of the cache, which simplifies to O(n).
Space Complexity
O(capacity)The brute force solution maintains a single ordered list to store the cache items. The size of this list is determined by the cache's fixed capacity, which dictates the maximum number of items it can hold. Therefore, the auxiliary space required is directly proportional to the specified capacity of the cache. As the capacity increases, the memory used by this list grows linearly.

Optimal Solution

Approach

To build an efficient LRU Cache, we combine two simple ideas: a quick way to look things up and a way to keep track of the most recently used items. This lets us find items instantly and also know which one to remove when the cache is full.

Here's how the algorithm would work step-by-step:

  1. First, think of having a quick-access directory where we can look up any item by its name to find its value and its exact spot in our usage history.
  2. Next, imagine all the items are arranged in a single line, like people waiting for a bus. This line represents their usage order.
  3. Whenever we add a new item or use an existing one, we consider it the 'most recently used'. To show this, we move it to the very front of the line.
  4. If we need to use an item that's already in the cache, we first find it using our quick-access directory. Then, we take it from its current position and move it to the front of the line to mark it as the newest.
  5. If we need to add a brand-new item and the cache is full, we must make space.
  6. To make space, we remove the item at the very back of the line. This is the 'least recently used' one because it hasn't been touched for the longest time.
  7. After removing the oldest item from the back of the line, we also remove its entry from our quick-access directory.
  8. Finally, we can add the new item. We place it at the front of the line and add it to our directory so we can find it easily later.

Code Implementation

class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.previous_node = None
        self.next_node = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache_directory = {}
        self.head_node = Node(0, 0) 
        self.tail_node = Node(0, 0)
        self.head_node.next_node = self.tail_node
        self.tail_node.previous_node = self.head_node

    def _remove_node(self, node_to_remove):
        previous_one = node_to_remove.previous_node
        next_one = node_to_remove.next_node
        previous_one.next_node = next_one
        next_one.previous_node = previous_one

    def _add_to_front(self, node_to_add):
        node_to_add.previous_node = self.head_node
        node_to_add.next_node = self.head_node.next_node
        self.head_node.next_node.previous_node = node_to_add
        self.head_node.next_node = node_to_add

    def get(self, key_to_find):
        if key_to_find in self.cache_directory:
            node_to_move = self.cache_directory[key_to_find]
            # This item was just used, so it must be moved to the front to mark it as most recent.
            self._remove_node(node_to_move)
            self._add_to_front(node_to_move)
            return node_to_move.value
        return -1

    def put(self, key_to_add, value_to_add):
        if key_to_add in self.cache_directory:
            node_to_update = self.cache_directory[key_to_add]
            self._remove_node(node_to_update)
            node_to_update.value = value_to_add
            self._add_to_front(node_to_update)
            return

        # If the cache is full, we must remove the least recently used item to make space.
        if len(self.cache_directory) == self.capacity:

            least_recent_node = self.tail_node.previous_node
            self._remove_node(least_recent_node)
            # The evicted item's key must also be removed from the directory to maintain consistency.
            del self.cache_directory[least_recent_node.key]

        new_node = Node(key_to_add, value_to_add)
        self.cache_directory[key_to_add] = new_node
        self._add_to_front(new_node)

Big(O) Analysis

Time Complexity
O(1)The solution uses a hash map for the quick-access directory and a doubly linked list for the usage order. The `get` operation involves a hash map lookup and moving a node to the front of the list, both of which are constant time operations. The `put` operation also consists of constant time steps: a hash map lookup, potentially removing the last node (tail) from the list, adding a new node to the front (head), and updating the hash map. Since every action is performed in a fixed number of steps regardless of the cache's capacity, the time complexity for both `get` and `put` is O(1).
Space Complexity
O(C)The space complexity is determined by the two core data structures used to implement the cache. We use a 'quick-access directory', which is a hash map, and a 'single line' representing usage order, implemented as a doubly linked list. Both of these structures will store at most a number of elements equal to the cache's capacity, which we can denote as C. Therefore, the total auxiliary space required grows linearly with the capacity of the cache.

Edge Cases

Initializing the cache with a capacity of zero
How to Handle:
The implementation should handle this by immediately evicting any new item that is put into the cache.
Initializing the cache with a capacity of one
How to Handle:
The cache should correctly evict the single existing item whenever a new, different item is put.
Repeatedly calling 'put' with the same key but different values
How to Handle:
The value associated with the key should be updated, and the key should be marked as the most recently used.
Calling 'get' on a key that does not exist in the cache
How to Handle:
The method should return a specific value, typically -1, to indicate that the key was not found.
A 'put' operation that causes the cache to reach its capacity
How to Handle:
The new item should be added without evicting any existing items, as the cache is now exactly full.
A 'put' operation on a full cache with a new key
How to Handle:
The least recently used item must be evicted before the new key-value pair is inserted.
A 'get' or 'put' operation on a key that is currently the least recently used item
How to Handle:
This access promotes the item to be the most recently used, preventing its immediate eviction.
Using keys or values that are zero or negative numbers
How to Handle:
The data structure should handle any valid integer keys and values without special logic.
0/12 completed