
All O`one Data Structure

Hard
Topics: Strings, Greedy Algorithms, Linked Lists

Design a data structure that stores counts of strings and can return a string with the minimum or maximum count.

Implement the AllOne class:

  • AllOne() Initializes the object of the data structure.
  • inc(String key) Increments the count of the string key by 1. If key does not exist in the data structure, insert it with count 1.
  • dec(String key) Decrements the count of the string key by 1. If the count of key is 0 after the decrement, remove it from the data structure. It is guaranteed that key exists in the data structure before the decrement.
  • getMaxKey() Returns one of the keys with the maximal count. If no element exists, return an empty string "".
  • getMinKey() Returns one of the keys with the minimum count. If no element exists, return an empty string "".

Note that each function must run in O(1) average time complexity.

Example 1:

Input
["AllOne", "inc", "inc", "getMaxKey", "getMinKey", "inc", "getMaxKey", "getMinKey"]
[[], ["hello"], ["hello"], [], [], ["leet"], [], []]

Output
[null, null, null, "hello", "hello", null, "hello", "leet"]

Explanation
AllOne allOne = new AllOne();
allOne.inc("hello");
allOne.inc("hello");
allOne.getMaxKey(); // return "hello"
allOne.getMinKey(); // return "hello"
allOne.inc("leet");
allOne.getMaxKey(); // return "hello"
allOne.getMinKey(); // return "leet"

Constraints:

  • 1 <= key.length <= 10
  • key consists of lowercase English letters.
  • It is guaranteed that for each call to dec, key exists in the data structure.
  • At most 5 * 10^4 calls will be made to inc, dec, getMaxKey, and getMinKey.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. If multiple keys have the same maximum or minimum count, is it acceptable to return any one of them, or is there a specific tie-breaking rule to follow?
  2. The problem states that `key` exists for every `dec` call. Just to confirm, this means a key's count will always be at least 1 when `dec` is called on it, correct?
  3. Should the data structure be designed to be thread-safe, or can I assume it will only be used in a single-threaded context?
  4. The requirement is for O(1) average time complexity. Does this mean that using data structures like hash maps, which offer amortized O(1) time for insertions and deletions, is an acceptable approach?
  5. What is the expected maximum number of unique keys that could be stored simultaneously? Is it proportional to the total number of calls?

Brute Force Solution

Approach

The brute force method involves keeping a straightforward record of each item and its count. To find the item with the highest or lowest count, we simply look through the entire record every single time, comparing each item's count one by one until we find the one we need.

Here's how the algorithm would work step-by-step:

  1. First, imagine keeping a simple list of every item and a number next to it representing its count.
  2. When we are asked to increase an item's count, we find that item in our list and add one to its number. If it's a new item, we add it to the list with a count of one.
  3. When asked to decrease an item's count, we find it and subtract one from its number. If its count becomes zero, we remove the item from our list completely.
  4. To find the item that has appeared most often, we must perform an exhaustive search of our entire list.
  5. We start at the beginning of the list and go through every single item, one by one, keeping track of which one has the highest count we've seen so far.
  6. After we've checked every item, the one we remembered is our answer for the most frequent item.
  7. Similarly, to find the item that has appeared least often, we repeat the process of searching the entire list, but this time we remember the item with the lowest count.

Code Implementation

class AllOne:

    def __init__(self):
        # This list of tuples acts as our record book, directly matching the brute force strategy.

        self.items_and_their_counts = []

    def inc(self, key_string: str) -> None:
        key_was_found_in_list = False
        # To update an item, we must first search the entire list to find its entry.

        for index_in_list, (current_key, current_count) in enumerate(self.items_and_their_counts):
            if current_key == key_string:
                self.items_and_their_counts[index_in_list] = (current_key, current_count + 1)
                key_was_found_in_list = True
                break
        
        if not key_was_found_in_list:
            self.items_and_their_counts.append((key_string, 1))

    def dec(self, key_string: str) -> None:
        # Decrementing also requires searching the list to find the key to update or remove.

        for index_in_list, (current_key, current_count) in enumerate(self.items_and_their_counts):
            if current_key == key_string:
                if current_count == 1:
                    self.items_and_their_counts.pop(index_in_list)
                else:
                    self.items_and_their_counts[index_in_list] = (current_key, current_count - 1)
                break

    def getMaxKey(self) -> str:
        if not self.items_and_their_counts:
            return ""

        # Finding the maximum requires a full scan, comparing each item's count along the way.

        key_with_max_count = ""
        max_count_so_far = -1
        for current_key, current_count in self.items_and_their_counts:
            if current_count > max_count_so_far:
                max_count_so_far = current_count
                key_with_max_count = current_key
        return key_with_max_count

    def getMinKey(self) -> str:
        if not self.items_and_their_counts:
            return ""

        key_with_min_count = ""
        min_count_so_far = float('inf')
        for current_key, current_count in self.items_and_their_counts:
            if current_count < min_count_so_far:
                min_count_so_far = current_count
                key_with_min_count = current_key
        return key_with_min_count
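The same record-book idea can also be condensed with a plain dictionary in place of the list of tuples; key lookups become direct, but getMaxKey and getMinKey still scan every entry, so the complexity is unchanged. This is a minimal demo sketch (a rewrite for illustration, not the class above) replaying Example 1:

```python
class BruteForceAllOne:
    def __init__(self):
        # key -> count; min/max queries still scan every entry, so this stays O(n)
        self.counts = {}

    def inc(self, key: str) -> None:
        self.counts[key] = self.counts.get(key, 0) + 1

    def dec(self, key: str) -> None:
        # The problem guarantees the key exists whenever dec is called
        self.counts[key] -= 1
        if self.counts[key] == 0:
            del self.counts[key]

    def getMaxKey(self) -> str:
        # Full scan over all keys, comparing counts
        return max(self.counts, key=self.counts.get) if self.counts else ""

    def getMinKey(self) -> str:
        return min(self.counts, key=self.counts.get) if self.counts else ""

all_one = BruteForceAllOne()
all_one.inc("hello")
all_one.inc("hello")
print(all_one.getMaxKey())  # hello
print(all_one.getMinKey())  # hello
all_one.inc("leet")
print(all_one.getMaxKey())  # hello
print(all_one.getMinKey())  # leet
```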

Big(O) Analysis

Time Complexity
O(n)
Here n is the number of unique keys stored. Every operation must scan the list: inc and dec search for the key's entry one element at a time, and getMaxKey and getMinKey compare every key's count to find the highest or lowest seen so far. Each operation is therefore a single pass over up to n entries, which results in a time complexity of O(n) per call.
Space Complexity
O(N)
The space complexity is determined by the record used to store each item and its count. In the worst-case scenario, every operation could add a new, unique item to this record. If N represents the maximum number of unique items stored at any given time, the space required will grow linearly with N. This is because we need to maintain a key-value pair for each unique item currently in the data structure.

Optimal Solution

Approach

The core idea is to maintain groups of items that have the same score, or count. These groups are kept in a sorted line, from lowest score to highest. When an item's score changes, it simply moves from its current group to the next or previous one, making it easy to find the items with the highest and lowest scores.

Here's how the algorithm would work step-by-step:

  1. Imagine you have a series of buckets, arranged in a line. Each bucket is for items with a specific count, like a 'Count of 1' bucket, a 'Count of 2' bucket, and so on, all in numerical order.
  2. We also need a quick way to find any item and know which bucket it belongs to. Think of this as a separate directory or catalog.
  3. When an item's count increases, we find its current bucket using our directory. We then move the item to the next bucket up in the line, which represents the new, higher count.
  4. If the bucket for that new, higher count doesn't exist yet, we simply create it on the spot and place it in the correct position in our line of buckets.
  5. Similarly, when an item's count decreases, we move it to the previous bucket in the line. If that bucket doesn't exist, we create it right where it needs to go.
  6. An important cleanup step: if moving an item causes its old bucket to become empty, we remove that empty bucket. If an item's count drops to zero, we remove the item from our system entirely.
  7. With this organized line of buckets, finding an item with the highest count is simple: just look at the very last bucket. For the lowest count, just look at the very first bucket.

Code Implementation

class FrequencyBucket:
    def __init__(self, frequency):
        self.frequency = frequency
        self.key_set = set()
        self.previous_bucket = None
        self.next_bucket = None

class AllOne:

    def __init__(self):
        self.key_to_bucket_map = {}
        self.head = FrequencyBucket(0)
        self.tail = FrequencyBucket(0)
        self.head.next_bucket = self.tail
        self.tail.previous_bucket = self.head

    def _insert_after(self, existing_bucket, new_bucket):
        new_bucket.previous_bucket = existing_bucket
        new_bucket.next_bucket = existing_bucket.next_bucket
        existing_bucket.next_bucket.previous_bucket = new_bucket
        existing_bucket.next_bucket = new_bucket

    def _remove_bucket(self, bucket_to_remove):
        bucket_to_remove.previous_bucket.next_bucket = bucket_to_remove.next_bucket
        bucket_to_remove.next_bucket.previous_bucket = bucket_to_remove.previous_bucket

    def inc(self, key: str) -> None:
        if key not in self.key_to_bucket_map:
            current_bucket = self.head
        else:
            current_bucket = self.key_to_bucket_map[key]
            current_bucket.key_set.remove(key)
        
        new_frequency = current_bucket.frequency + 1
        
        # Find or create the next frequency bucket to move the key into, maintaining sorted order.
        if current_bucket.next_bucket.frequency != new_frequency:
            new_bucket = FrequencyBucket(new_frequency)
            self._insert_after(current_bucket, new_bucket)
        
        destination_bucket = current_bucket.next_bucket
        destination_bucket.key_set.add(key)
        self.key_to_bucket_map[key] = destination_bucket

        # An empty bucket no longer serves a purpose and must be removed to maintain the chain's integrity.
        if current_bucket != self.head and not current_bucket.key_set:
            self._remove_bucket(current_bucket)

    def dec(self, key: str) -> None:
        if key not in self.key_to_bucket_map:
            return
        
        current_bucket = self.key_to_bucket_map[key]
        current_bucket.key_set.remove(key)
        new_frequency = current_bucket.frequency - 1
        
        if new_frequency > 0:
            if current_bucket.previous_bucket.frequency != new_frequency:
                new_bucket = FrequencyBucket(new_frequency)
                self._insert_after(current_bucket.previous_bucket, new_bucket)
            
            destination_bucket = current_bucket.previous_bucket
            destination_bucket.key_set.add(key)
            self.key_to_bucket_map[key] = destination_bucket
        else:
            # If a key's count drops to zero, we completely discard it from our lookup directory.
            del self.key_to_bucket_map[key]

        if not current_bucket.key_set and current_bucket != self.head:
            self._remove_bucket(current_bucket)

    def getMaxKey(self) -> str:
        if self.tail.previous_bucket == self.head:
            return ""
        return next(iter(self.tail.previous_bucket.key_set))

    def getMinKey(self) -> str:
        if self.head.next_bucket == self.tail:
            return ""
        return next(iter(self.head.next_bucket.key_set))

Big(O) Analysis

Time Complexity
O(1)
The solution uses a hash map for instant key lookups and a doubly linked list to organize count buckets, making all core actions constant time operations. When a key's count changes, finding its current bucket via the map is O(1), and moving it to an adjacent bucket in the list is also O(1). Inserting or deleting a bucket involves a few pointer reassignments in the doubly linked list, which is a fixed-cost operation. Since finding the min and max keys only requires looking at the head and tail of the list, every function completes in a constant number of steps, independent of the total number of keys stored.
Space Complexity
O(N)
The space complexity is determined by the structures that store the keys, where N is the maximum number of unique keys in the system. A 'directory or catalog' is maintained, which is a hash map mapping each of the N keys to its respective count and bucket. Additionally, the 'series of buckets' itself must store each of the N keys within its appropriate group. Since both the hash map and the total number of items stored across all buckets grow linearly with N, the overall space complexity is O(N).

Edge Cases

Calling `getMaxKey` or `getMinKey` when the data structure is empty.
How to Handle:
The solution must return an empty string as specified, which is handled by checking if the internal list of counts is empty.
Decrementing a key whose count is 1.
How to Handle:
The key must be completely removed from all internal mappings, and its corresponding count node might also be deleted if it becomes empty.
The last key in a specific count bucket is moved due to an `inc` or `dec` operation.
How to Handle:
The now-empty count bucket node must be removed from the linked list to maintain a compact and correct structure.
A key's count changes to a value for which no count bucket currently exists.
How to Handle:
A new bucket node is created and inserted into the correct sorted position within the doubly linked list.
Operating on the single unique key present in the data structure.
How to Handle:
The key is moved to a new count bucket and the now-empty old bucket is removed, so getMinKey and getMaxKey both read the single remaining bucket.
Multiple keys exist with the same minimum or maximum count.
How to Handle:
The implementation correctly returns any one of the valid keys from the set of keys in the min or max count node.
A new key is inserted when all existing keys have counts greater than 1.
How to Handle:
The new key establishes a new minimum count of 1, requiring the creation and insertion of a new node at the front of the list.
Decrementing the only key that has the current minimum count.
How to Handle:
The key moves to a new or existing bucket with a smaller count, which becomes the new definitive source for the minimum key.
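These edge cases can be checked directly. The block below is a condensed, self-contained rewrite of the bucket-list solution (shortened names, same logic as the optimal code earlier), exercised against empty-structure queries, a tie at the maximum count, and a decrement that drops a key to zero:

```python
class Bucket:
    def __init__(self, freq):
        self.freq, self.keys = freq, set()
        self.prev = self.next = None

class AllOne:
    def __init__(self):
        self.key_bucket = {}                          # key -> its current bucket
        self.head, self.tail = Bucket(0), Bucket(0)   # sentinel buckets
        self.head.next, self.tail.prev = self.tail, self.head

    def _insert_after(self, node, new):
        new.prev, new.next = node, node.next
        node.next.prev = new
        node.next = new

    def _remove(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def inc(self, key):
        cur = self.key_bucket.get(key, self.head)
        if key in self.key_bucket:
            cur.keys.remove(key)
        if cur.next.freq != cur.freq + 1:             # create the next bucket if missing
            self._insert_after(cur, Bucket(cur.freq + 1))
        cur.next.keys.add(key)
        self.key_bucket[key] = cur.next
        if cur is not self.head and not cur.keys:     # drop the old bucket once empty
            self._remove(cur)

    def dec(self, key):
        cur = self.key_bucket[key]
        cur.keys.remove(key)
        if cur.freq > 1:
            if cur.prev.freq != cur.freq - 1:         # create the previous bucket if missing
                self._insert_after(cur.prev, Bucket(cur.freq - 1))
            cur.prev.keys.add(key)
            self.key_bucket[key] = cur.prev
        else:
            del self.key_bucket[key]                  # count hit zero: discard the key
        if not cur.keys:
            self._remove(cur)

    def getMaxKey(self):
        return "" if self.tail.prev is self.head else next(iter(self.tail.prev.keys))

    def getMinKey(self):
        return "" if self.head.next is self.tail else next(iter(self.head.next.keys))

s = AllOne()
assert s.getMaxKey() == "" and s.getMinKey() == ""   # empty structure returns ""
s.inc("a"); s.inc("a"); s.inc("b")                   # a: 2, b: 1
assert s.getMaxKey() == "a" and s.getMinKey() == "b"
s.inc("b")                                           # tie at count 2
assert s.getMaxKey() in {"a", "b"}                   # any tied key is acceptable
s.dec("b"); s.dec("b")                               # b drops to zero and is removed
assert s.getMinKey() == "a" == s.getMaxKey()
```

Since a tie at the max count may return either key, the tie check only asserts membership in the tied set, matching the problem's "return one of the keys" wording.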