Taro Logo

Merge BSTs to Create Single BST

Hard
Asked by:
Profile picture
5 views
Topics:
TreesGraphs

You are given n BST (binary search tree) root nodes for n separate BSTs stored in an array trees (0-indexed). Each BST in trees has at most 3 nodes, and no two roots have the same value. In one operation, you can:

  • Select two distinct indices i and j such that the value stored at one of the leaves of trees[i] is equal to the root value of trees[j].
  • Replace the leaf node in trees[i] with trees[j].
  • Remove trees[j] from trees.

Return the root of the resulting BST if it is possible to form a valid BST after performing n - 1 operations, or null if it is impossible to create a valid BST.

A BST (binary search tree) is a binary tree where each node satisfies the following property:

  • Every node in the node's left subtree has a value strictly less than the node's value.
  • Every node in the node's right subtree has a value strictly greater than the node's value.

A leaf is a node that has no children.

Example 1:

Input: trees = [[2,1],[3,2,5],[5,4]]
Output: [3,2,5,1,null,4]
Explanation:
In the first operation, pick i=1 and j=0, and merge trees[0] into trees[1].
Delete trees[0], so trees = [[3,2,5,1],[5,4]].

In the second operation, pick i=0 and j=1, and merge trees[1] into trees[0].
Delete trees[1], so trees = [[3,2,5,1,null,4]].

The resulting tree, shown above, is a valid BST, so return its root.

Example 2:

Input: trees = [[5,3,8],[3,2,6]]
Output: []
Explanation:
Pick i=0 and j=1 and merge trees[1] into trees[0].
Delete trees[1], so trees = [[5,3,8,2,6]].

The resulting tree is shown above. This is the only valid operation that can be performed, but the resulting tree is not a valid BST, so return null.

Example 3:

Input: trees = [[5,4],[3]]
Output: []
Explanation: It is impossible to perform any operations.

Constraints:

  • n == trees.length
  • 1 <= n <= 5 * 104
  • The number of nodes in each tree is in the range [1, 3].
  • Each node in the input may have children but no grandchildren.
  • No two roots of trees have the same value.
  • All the trees in the input are valid BSTs.
  • 1 <= TreeNode.val <= 5 * 104.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the number of BSTs in the input list, and the number of nodes in each BST?
  2. What are the data types of the node values in the BSTs, and what are the min/max values of the allowed range?
  3. If it's impossible to merge the BSTs into a valid BST within the given range, should I return null, or is there a specific error code or exception I should raise?
  4. Is the given range inclusive or exclusive?
  5. Are the BSTs guaranteed to be valid binary search trees initially, or do I need to validate them before merging?

Brute Force Solution

Approach

The brute force method merges BSTs by trying every possible combination of nodes. It involves extracting all nodes from the given BSTs, shuffling them in every possible way, and then attempting to construct a single valid BST from each shuffled order.

Here's how the algorithm would work step-by-step:

  1. First, gather all the individual pieces (nodes) from all of the binary search trees.
  2. Next, imagine these pieces are like cards, and we need to arrange them in every single possible order. This means trying every combination imaginable.
  3. For each of these possible orders, try to build a new binary search tree using those pieces.
  4. As we're building each tree, check that it follows the rules of a binary search tree: smaller values on the left, larger values on the right. If it breaks these rules at any point, discard that particular order and move on.
  5. If, after putting all the pieces together in a specific order, we successfully create a valid binary search tree, remember that tree.
  6. Once we've tried all possible orders, we'll have a collection of potentially valid binary search trees. We then pick the 'best' one. In this scenario we'd stop after the first valid BST is found.

Code Implementation

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def merge_bsts_brute_force(root1, root2):

    all_values = []

    def inorder_traversal(root):
        if root:
            inorder_traversal(root.left)
            all_values.append(root.value)
            inorder_traversal(root.right)

    inorder_traversal(root1)
    inorder_traversal(root2)

    import itertools
    for permutation in itertools.permutations(all_values):
        
        # Attempt to construct BST from each permutation.
        root = construct_bst(permutation)

        if is_valid_bst(root):
            return root

    return None

def construct_bst(values):
    if not values:
        return None

    root = Node(values[0])

    for value in values[1:]:
        insert_node(root, value)

    return root

def insert_node(root, value):
    if value < root.value:
        if root.left is None:
            root.left = Node(value)
        else:
            insert_node(root.left, value)
    else:
        if root.right is None:
            root.right = Node(value)
        else:
            insert_node(root.right, value)

def is_valid_bst(root):
    def is_bst_helper(node, minimum_value, maximum_value):
        if not node:
            return True

        if node.value <= minimum_value or node.value >= maximum_value:
            return False

        return (is_bst_helper(node.left, minimum_value, node.value) and \
                is_bst_helper(node.right, node.value, maximum_value))

    # Use helper to define initial min/max boundaries.
    return is_bst_helper(root, float('-inf'), float('inf'))

Big(O) Analysis

Time Complexity
O(n! * n)The brute force approach first extracts all n nodes from the BSTs. Then, it generates all possible permutations of these nodes, which takes O(n!) time. For each permutation, it attempts to construct a BST, which involves inserting each node into the tree. Each insertion takes O(h) time in the best case for a balanced tree. In the worst case (skewed tree) h will be n. Because the creation of the BST could degenerate into a skewed tree, construction takes O(n). Therefore, building the tree for a single permutation costs O(n), and the overall time complexity becomes O(n! * n).
Space Complexity
O(N)The algorithm extracts all N nodes from the given BSTs into a temporary storage. Generating all possible permutations of these N nodes may require additional space depending on the implementation but at minimum we must store all nodes in a list, array, or similar data structure. Building a BST from each permutation would require space proportional to the number of nodes, worst case N, so we must account for the original copy of N nodes and one BST of up to N nodes. Therefore, the auxiliary space complexity is O(N).

Optimal Solution

Approach

The best way to merge Binary Search Trees (BSTs) into a single BST is to leverage their ordered property. First, we extract all the node values from the BSTs, then we rebuild a balanced BST using those values. This is faster than directly merging the tree structures.

Here's how the algorithm would work step-by-step:

  1. Gather all the values stored in each individual BST into a single, sorted list. Because the nodes of a BST are already in order, getting a sorted list is relatively simple.
  2. Sort the combined list of values to ensure everything is in ascending order. This is crucial for creating a valid BST.
  3. Use this sorted list to construct a new, balanced BST. A balanced BST ensures the fastest possible search times.
  4. The key is to pick the middle element of the sorted list as the root of your BST. Then, recursively build the left subtree with the values before the middle element and the right subtree with the values after the middle element.
  5. Keep dividing the sorted list in half and making the middle element the root of a (sub)tree until you run out of elements. This guarantees the new BST is properly structured and balanced.

Code Implementation

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def inorder_traversal(root, sorted_list):
    if root:
        inorder_traversal(root.left, sorted_list)
        sorted_list.append(root.data)
        inorder_traversal(root.right, sorted_list)

def sorted_list_to_bst(sorted_list):
    if not sorted_list:
        return None

    middle_index = len(sorted_list) // 2
    root = Node(sorted_list[middle_index])

    # Recursively construct left and right subtrees
    root.left = sorted_list_to_bst(sorted_list[:middle_index])

    root.right = sorted_list_to_bst(sorted_list[middle_index + 1:])

    return root

def merge_bsts(root1, root2):
    # Extract inorder traversal, resulting in sorted lists
    sorted_list_one = []
    inorder_traversal(root1, sorted_list_one)

    sorted_list_two = []
    inorder_traversal(root2, sorted_list_two)

    # Merge the two sorted lists into a single sorted list
    merged_sorted_list = sorted(sorted_list_one + sorted_list_two)

    # Build a balanced BST from the merged sorted list
    return sorted_list_to_bst(merged_sorted_list)

Big(O) Analysis

Time Complexity
O(n log n)First, extracting the n node values from the BSTs takes O(n) time in total, assuming n is the total number of nodes in both trees. Then, sorting the combined list of n values takes O(n log n) time using an efficient sorting algorithm like merge sort or quicksort. Finally, constructing a balanced BST from the sorted list takes O(n) time, as each node is visited and processed exactly once during the recursive construction process. Therefore, the overall time complexity is dominated by the sorting step, resulting in O(n log n).
Space Complexity
O(N)The algorithm's space complexity is dominated by the auxiliary space used to store the node values from the BSTs and to construct the new BST. Specifically, the sorted list of all node values requires O(N) space, where N is the total number of nodes across all input BSTs. The construction of the new BST also requires O(N) space to store all of the nodes. The recursive calls to build the balanced BST have a maximum depth of log(N), contributing O(logN) to the call stack space, but this is asymptotically smaller than O(N), so the overall space complexity remains O(N).

Edge Cases

Null or empty list of BSTs
How to Handle:
Return null immediately, as no BST can be created without input trees.
A single BST is provided, and it falls within the given range
How to Handle:
Return the root of that single BST directly if its values fall within the range.
The given range is invalid (min > max)
How to Handle:
Return null, as no BST can possibly fall within an invalid range.
The combined range of values in all BSTs is outside the given range
How to Handle:
Return null, as it is impossible to create a BST with values falling in the required range.
BSTs with overlapping ranges of values but no duplicates (e.g., [1, 5] and [3, 7])
How to Handle:
The in-order traversal and merge operation should handle overlapping ranges correctly, but validity checks must ensure proper BST structure.
Extreme boundary values for node data (Integer.MAX_VALUE, Integer.MIN_VALUE)
How to Handle:
Ensure that comparisons and tree construction do not result in integer overflows or other unexpected behavior; use long data type when required.
No valid solution exists (BSTs are incompatible and cannot be merged into a single valid BST within range)
How to Handle:
After merging, if the resulting tree is not a valid BST or if it falls outside the specified range, return null.
Very large number of BSTs leading to potential memory issues during merging
How to Handle:
Consider a divide-and-conquer approach, merging BSTs in batches to manage memory efficiently.