Taro Logo

Insert Delete GetRandom O(1) - Duplicates allowed

Hard
Affirm logo
Affirm
2 views
Topics:
ArraysHash TableDesign

RandomizedCollection is a data structure that contains a collection of numbers, possibly duplicates (i.e., a multiset). It should support inserting and removing specific elements and also reporting a random element.

Implement the RandomizedCollection class:

  • RandomizedCollection() Initializes the empty RandomizedCollection object.
  • bool insert(int val) Inserts an item val into the multiset, even if the item is already present. Returns true if the item is not present, false otherwise.
  • bool remove(int val) Removes an item val from the multiset if present. Returns true if the item is present, false otherwise. Note that if val has multiple occurrences in the multiset, we only remove one of them.
  • int getRandom() Returns a random element from the current multiset of elements. The probability of each element being returned is linearly related to the number of the same values the multiset contains.

You must implement the functions of the class such that each function works on average O(1) time complexity.

Note: The test cases are generated such that getRandom will only be called if there is at least one item in the RandomizedCollection.

Example 1:

Input
["RandomizedCollection", "insert", "insert", "insert", "getRandom", "remove", "getRandom"]
[[], [1], [1], [2], [], [1], []]
Output
[null, true, false, true, 2, true, 1]

Explanation
RandomizedCollection randomizedCollection = new RandomizedCollection();
randomizedCollection.insert(1);   // return true since the collection does not contain 1.
                                  // Inserts 1 into the collection.
randomizedCollection.insert(1);   // return false since the collection contains 1.
                                  // Inserts another 1 into the collection. Collection now contains [1,1].
randomizedCollection.insert(2);   // return true since the collection does not contain 2.
                                  // Inserts 2 into the collection. Collection now contains [1,1,2].
randomizedCollection.getRandom(); // getRandom should:
                                  // - return 1 with probability 2/3, or
                                  // - return 2 with probability 1/3.
randomizedCollection.remove(1);   // return true since the collection contains 1.
                                  // Removes 1 from the collection. Collection now contains [1,2].
randomizedCollection.getRandom(); // getRandom should return 1 or 2, both equally likely.

Constraints:

  • -231 <= val <= 231 - 1
  • At most 2 * 105 calls in total will be made to insert, remove, and getRandom.
  • There will be at least one element in the data structure when getRandom is called.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of possible values for the integers being inserted?
  2. If `remove(val)` is called and `val` appears multiple times, should only one instance be removed, and if so, which one?
  3. How often will `getRandom()` be called relative to `insert()` and `remove()`? Should I optimize for reads or writes?
  4. Can I assume that `getRandom()` will only be called when the data structure is non-empty?
  5. Can the input integers be non-integer (e.g. floating-point)?

Brute Force Solution

Approach

The brute force method for this problem involves directly mimicking the operations of insert, delete, and getRandom. We will use a simple list to store the numbers, and for each operation, perform the most straightforward action possible.

Here's how the algorithm would work step-by-step:

  1. To insert a number, just add it to the end of our list. No need to check if it's already there.
  2. To delete a number, we need to find it in the list. We go through the list, one by one, looking for the number we want to remove.
  3. If we find the number, we remove it from the list. If the same number shows up more than once, we only remove the first one we find.
  4. If we can't find the number in the list, we don't do anything.
  5. To get a random number, we pick any position in the list completely at random and return the number in that position. Every position has an equal chance of being chosen.

Code Implementation

import random

class RandomizedCollection:

    def __init__(self):
        self.number_list = []

    def insert(self, value):
        self.number_list.append(value)
        return True

    def remove(self, value):
        # We need to iterate and remove the FIRST occurence only
        for index, number in enumerate(self.number_list):
            if number == value:
                del self.number_list[index]

                # We only delete the first one
                return True

        return False

    def getRandom(self):
        # Pick a random element from the list
        random_index = random.randint(0, len(self.number_list) - 1)

        return self.number_list[random_index]

Big(O) Analysis

Time Complexity
O(n)The insert operation takes O(1) time as it simply appends the element to the end of the list. The delete operation requires iterating through the list to find the first occurrence of the number, which takes O(n) time in the worst case, where n is the number of elements in the list. The getRandom operation takes O(1) time since accessing an element by its index is a constant-time operation, and generating a random index also takes constant time. Therefore, the overall time complexity is dominated by the delete operation, which is O(n).
Space Complexity
O(1)The brute force method, as described, uses a simple list to store the numbers. The insert, delete, and getRandom operations are performed directly on this list. No additional data structures like hash maps or temporary lists are created beyond the original list used to store the input. Therefore, the auxiliary space used is constant, irrespective of the number of elements, N, inserted. This means the space complexity is O(1).

Optimal Solution

Approach

The key is to use two data structures working together. One remembers the exact position of each number, and the other lets us quickly swap and remove numbers when needed, even with duplicates.

Here's how the algorithm would work step-by-step:

  1. Keep a list of all the numbers that are present.
  2. Also, keep a way to find out where each number is located in the list. Think of it like a phone book that tells you the position of each person.
  3. When you insert a number, add it to the end of the list and update its location in your 'phone book'.
  4. When you remove a number, first find its location in the list using your 'phone book'.
  5. Then, swap this number with the last number in the list. This moves the number you want to remove to the end.
  6. Remove the last number from the list (which is the one you wanted to delete). Also, update the 'phone book' to reflect the swap.
  7. To pick a random number, choose a random position in the list and return the number at that position. Since it's a list, you can do this quickly.
  8. If there are duplicates in the data structure make sure that you remove the first occurence of that value.

Code Implementation

import random

class RandomizedCollection:

    def __init__(self):
        self.number_list = []
        self.value_to_indices = {}

    def insert(self, value):
        self.number_list.append(value)
        if value not in self.value_to_indices:
            self.value_to_indices[value] = set()
        self.value_to_indices[value].add(len(self.number_list) - 1)
        return len(self.value_to_indices[value]) == 1

    def remove(self, value):
        if value not in self.value_to_indices or not self.value_to_indices[value]:
            return False

        # Get an index of the value to remove.
        index_to_remove = self.value_to_indices[value].pop()

        # If the element to remove is not the last element
        if index_to_remove < len(self.number_list) - 1:
            last_element = self.number_list[-1]

            # Move the last element to the position of the element to remove.
            self.number_list[index_to_remove] = last_element

            # Update index of the moved last element
            self.value_to_indices[last_element].add(index_to_remove)

            # Remove index of the moved last element
            self.value_to_indices[last_element].remove(len(self.number_list) - 1)

        # Remove the last element from the list and update the indices.
        self.number_list.pop()

        # Clean up if the value is no longer present
        if not self.value_to_indices[value]:
            del self.value_to_indices[value]

        return True

    def getRandom(self):
        # Get a random element from the list.
        return random.choice(self.number_list)

Big(O) Analysis

Time Complexity
O(1)Insertion appends to a list and updates a hash map, both constant time operations. Deletion finds the index in the list using the hash map (O(1)), swaps it with the last element (O(1)), removes the last element (O(1)), and updates the hash map for the swapped element (O(1)). GetRandom accesses a random index in the list, which is O(1). Therefore, all operations have a constant time complexity regardless of the number of elements (n) in the data structure.
Space Complexity
O(N)The solution utilizes a list to store all inserted numbers and a hash map (the 'phone book') to store the indices of each number in the list. In the worst-case scenario, where all inserted numbers are unique, both the list and the hash map will store N elements, where N is the number of insert operations. Therefore, the auxiliary space used is proportional to N. Hence, the space complexity is O(N).

Edge Cases

CaseHow to Handle
Inserting null or invalid value (e.g., non-integer)Raise an exception or return an error code to indicate invalid input.
Inserting a very large number of duplicate valuesThe HashMap/ArrayList combination should scale linearly with the number of inserted elements, however very large numbers may cause memory constraints.
Removing a value that doesn't exist in the collectionReturn false or do nothing if the value is not present.
getRandom() called on an empty collection after multiple insertions and removalsThrow an exception, return null, or some other appropriate error handling mechanism, as the prompt stated this condition would not exist; this tests for correct remove logic.
Removing the last occurrence of a specific valueEnsure that both the list and the hashmap are updated correctly when the last instance is removed.
Integer overflow in the value being inserted or removedUse appropriate data types (e.g., long) or validate input to prevent overflows; alternatively consider using a language with built-in overflow protection.
Extreme boundary values (Integer.MAX_VALUE, Integer.MIN_VALUE)The data structures should handle these values correctly without causing unexpected behavior.
Negative numbers are insertedThe data structures should be able to store and retrieve negative numbers correctly.