Taro Logo

Design Skiplist

Hard
Google logo
Google
1 view
Topics:
Linked Lists

Design a Skiplist without using any built-in libraries.

A skiplist is a data structure that takes O(log(n)) time to add, erase, and search. Comparing with treap and red-black tree which has the same function and performance, the code length of Skiplist can be comparatively short and the idea behind Skiplists is just simple linked lists.

For example, we have a Skiplist containing [30,40,50,60,70,90] and we want to add 80 and 45 into it.

Implement the Skiplist class:

  • Skiplist() Initializes the object of the skiplist.
  • bool search(int target) Returns true if the integer target exists in the Skiplist or false otherwise.
  • void add(int num) Inserts the value num into the SkipList.
  • bool erase(int num) Removes the value num from the Skiplist and returns true. If num does not exist in the Skiplist, do nothing and return false. If there exist multiple num values, removing any one of them is fine.

Note that duplicates may exist in the Skiplist, your code needs to handle this situation.

Example:

Input
["Skiplist", "add", "add", "add", "search", "add", "search", "erase", "erase", "search"]
[[], [1], [2], [3], [0], [4], [1], [0], [1], [1]]
Output
[null, null, null, null, false, null, true, false, true, false]

Explanation
Skiplist skiplist = new Skiplist();
skiplist.add(1);
skiplist.add(2);
skiplist.add(3);
skiplist.search(0); // return False
skiplist.add(4);
skiplist.search(1); // return True
skiplist.erase(0);  // return False, 0 is not in skiplist.
skiplist.erase(1);  // return True
skiplist.search(1); // return False, 1 has already been erased.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of possible values that can be inserted into the SkipList?
  2. What is the maximum number of elements that the SkipList will be expected to hold?
  3. Are duplicate values allowed in the SkipList, and if so, how should they be handled by search, insertion, and deletion?
  4. What is the desired probability (p) to use for promoting nodes to higher levels? Should I assume a default value if not specified?
  5. If the search operation does not find the target value, what should be returned?

Brute Force Solution

Approach

Imagine a skiplist as a simple sorted list where we need to quickly find, add, or remove items. The most basic approach is to completely ignore any advanced data structures and just work with a regular sorted list. Every operation then involves going through the list to find the right spot.

Here's how the algorithm would work step-by-step:

  1. To add a new item, find its correct position in the sorted list by comparing it with each item one by one.
  2. Once the correct position is found, insert the new item into the list at that spot, shifting other items if necessary.
  3. To search for an item, go through the sorted list item by item and compare it with the item we are looking for.
  4. If a match is found, the search is successful. Otherwise, we've checked every item, and the item isn't in the list.
  5. To remove an item, go through the list item by item to find the item that needs to be removed.
  6. Once found, remove that item from the list, shifting the other items to fill the gap.

Code Implementation

class Skiplist:

    def __init__(self):
        self.sorted_list = []

    def search(self, target_value: int) -> bool:
        for element in self.sorted_list:
            if element == target_value:
                return True
        return False

    def add(self, num: int) -> None:
        # Find the correct position to insert the new number.
        insertion_index = 0
        while insertion_index < len(self.sorted_list) and self.sorted_list[insertion_index] < num:
            insertion_index += 1

        # Insert the new number at the found position.
        self.sorted_list.insert(insertion_index, num)

    def erase(self, num: int) -> bool:
        # Iterate through the list to find the element to remove.
        for index, element in enumerate(self.sorted_list):
            if element == num:
                # Remove the element if found.
                del self.sorted_list[index]
                return True

        # If the element is not found, return False.
        return False

Big(O) Analysis

Time Complexity
O(n)The skiplist operations (add, search, remove) are performed on a simple sorted list. Each operation involves iterating through the list to find the correct position or the element itself. In the worst-case scenario, we might need to traverse the entire list of n elements. Therefore, each operation takes O(n) time.
Space Complexity
O(1)The provided algorithm operates directly on a sorted list and does not create any auxiliary data structures like new lists, dictionaries, or sets. The only memory used is for variables to track the current position while searching, inserting, or deleting elements. These variables take up a constant amount of space, independent of the number of elements N in the sorted list. Therefore, the auxiliary space complexity is O(1).

Optimal Solution

Approach

Imagine creating express lanes on a road. This lets you skip over sections quickly. We'll use a similar idea to quickly find and modify elements in a sorted list, using multiple levels of 'express lanes'.

Here's how the algorithm would work step-by-step:

  1. Start with a regular sorted list of items, like numbers or words.
  2. Create a 'fast lane' above the main list, containing a smaller number of items from the main list, spaced out evenly. This lets you jump ahead faster.
  3. Make another 'even faster lane' above that, with even fewer items and bigger jumps.
  4. Continue creating these faster lanes, each with fewer items than the one below, until you reach a top lane with very few items for huge jumps.
  5. To find something, start at the top lane and jump as far as you can without going past the target item.
  6. When you can't jump any further in the current lane, drop down to the next slower lane and continue jumping.
  7. Repeat this process of jumping and dropping until you reach the bottom lane, where you can find the exact item you're looking for.
  8. To add or remove something, find the spot in the bottom list just like searching. Then, add or remove the element. After that, randomly decide if you want to add it to the levels above the bottom level.
  9. If you add to a higher level, repeat the random decision process. This keeps things balanced.

Code Implementation

import random

class SkiplistNode:
    def __init__(self, value, level):
        self.value = value
        self.forward = [None] * (level + 1)

class Skiplist:
    def __init__(self, max_level=16, probability=0.5):
        self.max_level = max_level
        self.probability = probability
        self.level = 0
        self.head = SkiplistNode(-1, max_level)

    def _random_level(self):
        level = 0
        while random.random() < self.probability and level < self.max_level:
            level += 1
        return level

    def search(self, target_value):
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].value < target_value:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.value == target_value

    def add(self, number):
        update = [None] * (self.max_level + 1)
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].value < number:
                node = node.forward[i]
            update[i] = node

        level = self._random_level()

        # Adjust the skiplist level if needed
        if level > self.level:
            for i in range(self.level + 1, level + 1):
                update[i] = self.head
            self.level = level

        new_node = SkiplistNode(number, level)

        # Update pointers for the new node
        for i in range(level + 1):
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node

    def erase(self, number):
        update = [None] * (self.max_level + 1)
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].value < number:
                node = node.forward[i]
            update[i] = node

        node = node.forward[0]

        # Only proceed if the node to delete exists
        if node is not None and node.value == number:
            # Update pointers to remove the node
            for i in range(self.level + 1):
                if update[i].forward[i] != node:
                    break
                update[i].forward[i] = node.forward[i]

            # Adjust skiplist level if topmost levels are empty
            while self.level > 0 and self.head.forward[self.level] is None:
                self.level -= 1

Big(O) Analysis

Time Complexity
O(log n)Searching in a Skip List involves traversing multiple levels, each acting as an express lane for the sorted data. At each level, we make jumps to the next node until we are about to overshoot the target. The number of levels is logarithmic with respect to the number of elements (n), roughly log(n). At each level, we perform a constant number of comparisons (at most) to determine the next jump. Therefore, the search, insertion, and deletion operations all have a time complexity of O(log n) on average, as the height of the skiplist is probabilistically balanced.
Space Complexity
O(N)The skiplist's space complexity is determined by the number of nodes across all its levels. In the worst-case scenario, each element in the sorted list (N elements) could potentially be promoted to higher levels, resulting in approximately O(N) nodes at each level and an average of log(N) levels. Therefore, the total space required to store all these nodes across all levels is proportional to N since the number of nodes in each level is related to the number of elements in the input data. This leads to a space complexity of O(N).

Edge Cases

CaseHow to Handle
Inserting the same element multiple timesThe skiplist should allow duplicate values, so multiple insertions of the same value should be handled correctly by inserting nodes in the appropriate positions.
Deleting a non-existent elementThe delete operation should gracefully handle the case where the element is not found in the skiplist, without throwing an error, and return false.
Maximum number of elements in skiplist approaching memory limitsThe skiplist should be designed to minimize memory footprint and be aware of potential memory constraints, potentially adjusting the probability to manage memory use.
Very large or very small values being inserted (Integer overflow or underflow during calculations)Use appropriate data types (e.g., long) and consider the potential for overflow during comparisons or level generation, potentially scaling down values.
Probability p is 0 or 1Handle these cases by throwing an exception or clamping p to a valid range (0 < p < 1).
Searching for element smaller than the minimum elementThe search operation should return null or a value indicating that the element is not found.
Deleting the minimum or maximum element in the skiplistEnsure the delete operation correctly updates the head pointers of the affected levels when removing the minimum or maximum element.
Concurrency: multiple threads inserting, deleting, or searching simultaneouslyIf multi-threading is considered, appropriate locking mechanisms or concurrent data structures should be used to ensure thread safety.