Taro Logo

Insert Delete GetRandom O(1)

Medium
Grammarly logo
Grammarly
1 view
Topics:
ArraysHash Table

Implement the RandomizedSet class:

  • RandomizedSet() Initializes the RandomizedSet object.
  • bool insert(int val) Inserts an item val into the set if not present. Returns true if the item was not present, false otherwise.
  • bool remove(int val) Removes an item val from the set if present. Returns true if the item was present, false otherwise.
  • int getRandom() Returns a random element from the current set of elements (it's guaranteed that at least one element exists when this method is called). Each element must have the same probability of being returned.

You must implement the functions of the class such that each function works in average O(1) time complexity.

Example 1:

Input
["RandomizedSet", "insert", "remove", "insert", "getRandom", "remove", "insert", "getRandom"]
[[], [1], [2], [2], [], [1], [2], []]
Output
[null, true, false, true, 2, true, false, 2]

Explanation
RandomizedSet randomizedSet = new RandomizedSet();
randomizedSet.insert(1); // Inserts 1 to the set. Returns true as 1 was inserted successfully.
randomizedSet.remove(2); // Returns false as 2 does not exist in the set.
randomizedSet.insert(2); // Inserts 2 to the set, returns true. Set now contains [1,2].
randomizedSet.getRandom(); // getRandom() should return either 1 or 2 randomly.
randomizedSet.remove(1); // Removes 1 from the set, returns true. Set now contains [2].
randomizedSet.insert(2); // 2 was already in the set, so return false.
randomizedSet.getRandom(); // Since 2 is the only number in the set, getRandom() will always return 2.

Constraints:

  • -231 <= val <= 231 - 1
  • At most 2 * 105 calls will be made to insert, remove, and getRandom.
  • There will be at least one element in the data structure when getRandom is called.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of possible values for the elements being inserted and removed?
  2. If `remove(val)` is called with a `val` that doesn't exist, should it return an error or do nothing?
  3. Is it acceptable if `getRandom()` is not perfectly uniformly distributed, as long as it's probabilistically random?
  4. Can I assume that `getRandom()` will only be called when the data structure is not empty?
  5. How many insert, delete, and getRandom operations should I expect to perform in total?

Brute Force Solution

Approach

We want to build a special data structure that lets us add numbers, remove numbers, and pick a random number, all very quickly. The brute force way is simple: just keep all the numbers in one big pile. We will use the most straightforward tools to do the adding, removing and random picking.

Here's how the algorithm would work step-by-step:

  1. To add a number, simply put it at the end of the pile.
  2. To remove a number, look through the entire pile until you find it. Once you find it, take it out and shift all the numbers behind it to fill the empty spot.
  3. To pick a random number, count how many numbers are in the pile. Then, randomly choose a spot in the pile, and return the number you find there.

Code Implementation

import random

class RandomizedSet:

    def __init__(self):
        self.number_list = []

    def insert(self, value):
        self.number_list.append(value)
        return True

    def remove(self, value):
        list_length = len(self.number_list)

        for index in range(list_length):
            if self.number_list[index] == value:
                #Removing element; shift elements back
                for shift_index in range(index, list_length - 1):
                    self.number_list[shift_index] = self.number_list[shift_index + 1]

                self.number_list.pop()
                return True

        return False

    def getRandom(self):
        #Return random element from the list
        random_index = random.randint(0, len(self.number_list) - 1)

        return self.number_list[random_index]

Big(O) Analysis

Time Complexity
O(n)The insert operation adds an element to the end of a list, which takes O(1) time. The getRandom operation accesses a random element in the list, also taking O(1) time. The remove operation requires searching for the element to be removed, which in the worst case iterates through all n elements of the list, resulting in O(n) time complexity. Therefore, the dominant operation is remove, leading to an overall time complexity of O(n).
Space Complexity
O(1)The described brute force approach primarily manipulates the existing input pile in place. While conceptually a 'pile' of numbers exists, the plain English explanation doesn't describe creating any new data structures to store temporary results or track information. The add operation modifies the pile directly, the remove operation shifts elements within the pile, and random selection only requires counting elements and generating a random index. Thus, no significant auxiliary space is needed, resulting in constant space complexity, O(1).

Optimal Solution

Approach

The goal is to make adding, removing, and picking a random item super fast. The trick is to combine a list (where we keep the order) with a way to jump directly to each item (like a phone book).

Here's how the algorithm would work step-by-step:

  1. To add a number, put it at the end of our list and remember where we put it in our 'phone book'.
  2. To remove a number, find it in the 'phone book' so you know where it is in the list.
  3. Instead of actually removing it and creating a gap, take the last number in the list and move it into the spot of the number you want to remove.
  4. Update the 'phone book' to show the new location of the number you moved.
  5. Then, shorten the list by one (because the last spot is now empty after the swap).
  6. To pick a random number, just pick a random spot in the list and take whatever number is there. This is fast because you know exactly where each number is located in the list.
  7. Because you are always swapping in elements rather than shifting them, these actions are super quick.

Code Implementation

import random

class RandomizedSet:

    def __init__(self):
        self.value_list = []
        self.value_index_map = {}

    def insert(self, value):
        if value in self.value_index_map:
            return False
        
        self.value_list.append(value)
        self.value_index_map[value] = len(self.value_list) - 1
        return True

    def remove(self, value):
        if value not in self.value_index_map:
            return False

        # Get the index of the value to be removed
        index_to_remove = self.value_index_map[value]

        # Get the last element in the list
        last_element = self.value_list[-1]

        # Swap the element to be removed with the last element
        self.value_list[index_to_remove] = last_element

        # Update the index of the last element in the dictionary
        self.value_index_map[last_element] = index_to_remove

        # Remove the value from the list
        self.value_list.pop()

        # Remove the value from the dictionary
        del self.value_index_map[value]
        return True

    def getRandom(self):
        #Get a random element from the list
        return random.choice(self.value_list)

# Your RandomizedSet object will be instantiated and called as such:
# randomizedSet = RandomizedSet()
# randomizedSet.insert(1)
# randomizedSet.remove(2)
# randomizedSet.insert(2)
# randomizedSet.getRandom()
# randomizedSet.remove(1)
# randomizedSet.insert(2)
# randomizedSet.getRandom()

Big(O) Analysis

Time Complexity
O(1)The insert operation appends to a list and updates a hash map, both of which take constant time. The delete operation involves a hash map lookup, a swap operation within the list, and a hash map update, all constant time operations. The getRandom operation involves generating a random index and accessing an element in the list, also constant time. Thus, regardless of the number of elements n, each operation takes a fixed amount of time, resulting in O(1) time complexity.
Space Complexity
O(N)The algorithm utilizes a list to store the numbers and a dictionary (referred to as the 'phone book') to map each number to its index in the list. In the worst case, the data structure will need to store N distinct numbers where N is the number of insert operations performed. Therefore, the auxiliary space required for the list and the dictionary grows linearly with the number of elements inserted, resulting in O(N) space complexity.

Edge Cases

CaseHow to Handle
Inserting the same value multiple timesThe insert function should only insert the value once, returning false for subsequent identical inserts.
Deleting a value that doesn't existThe delete function should return false if the value to be deleted is not present.
Calling getRandom on an empty setThe getRandom function should ideally throw an exception, as the problem states it's guaranteed to have at least one element, but defensively throwing an exception is a good practice in case of unforeseen empty set.
Large number of insert and delete operations causing fragmentation in the underlying data structure.The implementation should be optimized to minimize memory fragmentation, potentially using techniques like swapping the element to be deleted with the last element of the list.
Integer overflow when generating random numbers (language specific)Ensure the random number generator produces values within the appropriate range to avoid integer overflow issues specific to the chosen programming language.
Extreme boundary values for the inserted numbers (e.g., Integer.MAX_VALUE, Integer.MIN_VALUE)Ensure that the data structures (e.g., hash map) can handle the full range of possible input values without issues like hashing collisions.
The data structure used to store the elements reaches its maximum capacityConsider using a dynamically resizing array or other suitable data structure that can handle an unbounded number of elements without exceeding memory limits or facing overflow.
Memory constraints for very large input datasetsThe solution should be memory-efficient, avoiding unnecessary memory allocations and utilizing data structures that minimize memory usage.