Taro Logo

Balance a Binary Search Tree

Medium
Google logo
Google
1 view
Topics:
TreesRecursion

Given the root of a binary search tree, return a balanced binary search tree with the same node values. If there is more than one answer, return any of them.

A binary search tree is balanced if the depth of the two subtrees of every node never differs by more than 1.

For example:

Input: [1,null,2,null,3,null,4,null,null] Output: [2,1,3,null,null,null,4] Explanation: This is not the only correct answer, [3,1,4,null,2] is also correct.

Input: [2,1,3] Output: [2,1,3]

Constraints:

  • The number of nodes in the tree is in the range [1, 10^4].Node.val <= 10^5

How would you approach this problem? Consider the efficiency of your solution in terms of both time and space complexity. Are there any edge cases that need to be considered?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the tree contain duplicate values, and if so, how should they be handled during balancing?
  2. What data type are the node values, and are there any restrictions on the range of possible values?
  3. If the input tree is already balanced, should I return the original tree or still perform a balancing operation?
  4. Is there a specific balancing algorithm (e.g., AVL, Red-Black) I should prioritize, or am I free to choose any valid approach?
  5. Should the balanced tree be a perfect binary search tree, or is an approximately balanced tree acceptable?

Brute Force Solution

Approach

The brute force method for balancing a binary search tree involves exploring every possible tree structure. We essentially try out every arrangement of the tree's elements. We then check if each arrangement is both a valid binary search tree and if it's balanced.

Here's how the algorithm would work step-by-step:

  1. First, gather all the values that are stored in the tree.
  2. Then, consider every possible way to arrange these values into a tree structure.
  3. For each possible tree structure, check if it follows the rules of a binary search tree: values on the left side must be smaller than the current value, and values on the right side must be larger.
  4. If it is a valid binary search tree, then check how balanced it is. We can measure 'balanced' by looking at how different the heights of the left and right sides of the tree are.
  5. Remember the most balanced tree structure we've found so far.
  6. After checking all possible tree structures, the one we remembered as being the most balanced is our answer.

Code Implementation

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def generate_trees(values):
    if not values:
        return [None]
    
    trees = []
    for i in range(len(values)):
        root_value = values[i]
        left_values = values[:i]
        right_values = values[i+1:]
        
        left_subtrees = generate_trees(left_values)
        right_subtrees = generate_trees(right_values)
        
        for left_subtree in left_subtrees:
            for right_subtree in right_subtrees:
                root = Node(root_value)
                root.left = left_subtree
                root.right = right_subtree
                trees.append(root)
                
    return trees

def is_bst(root):
    def is_bst_helper(node, min_value, max_value):
        if not node:
            return True
        
        if (min_value is not None and node.value <= min_value) or \
           (max_value is not None and node.value >= max_value):
            return False
        
        return (is_bst_helper(node.left, min_value, node.value) and \
                is_bst_helper(node.right, node.value, max_value))
                
    return is_bst_helper(root, None, None)

def get_height(root):
    if not root:
        return 0
    return 1 + max(get_height(root.left), get_height(root.right))

def is_balanced(root):
    if not root:
        return True
    
    left_height = get_height(root.left)
    right_height = get_height(root.right)
    
    # Check if heights of left and right subtrees differ by more than 1
    if abs(left_height - right_height) > 1:
        return False

    return is_balanced(root.left) and is_balanced(root.right)

def balance_bst_brute_force(root):
    if not root:
        return None

    # Extract all node values from the tree
    node_values = []
    def inorder_traversal(node):
        if node:
            inorder_traversal(node.left)
            node_values.append(node.value)
            inorder_traversal(node.right)
    inorder_traversal(root)

    # Generate all possible BSTs from the values
    possible_trees = generate_trees(node_values)

    best_tree = None
    best_balance_factor = float('inf')

    for tree in possible_trees:
        # Check if the tree is a valid BST
        if is_bst(tree):
            # Check if the tree is balanced
            if is_balanced(tree):

                left_height = get_height(tree.left)
                right_height = get_height(tree.right)
                balance_factor = abs(left_height - right_height)

                # Find a more balanced BST
                if balance_factor < best_balance_factor:
                    best_balance_factor = balance_factor
                    best_tree = tree

    # If no valid BST found, return original root
    if best_tree is None:
        return root
    return best_tree

Big(O) Analysis

Time Complexity
O(n! * n)The brute force approach explores all possible tree structures. Generating all permutations of n elements takes O(n!) time. For each permutation, we need to construct a binary search tree, which involves inserting n nodes. In the worst case, each insertion might take O(1) time, resulting in O(n) time for constructing the tree. Then, we need to validate the BST property and check its balance, which requires traversing the tree, taking O(n) time. Therefore, the overall time complexity is O(n! * n).
Space Complexity
O(N!)The algorithm first gathers all N values from the tree into a list, which takes O(N) space. The core of the algorithm involves considering every possible arrangement of these N values into a tree structure. Generating all permutations of N elements requires storing N! different arrangements. While the explanation doesn't explicitly state *storing* every arrangement simultaneously, the sheer scale of generating and checking N! structures dominates the space complexity. Therefore, the auxiliary space is dictated by the potential number of tree arrangements considered, leading to a space complexity of O(N!).

Optimal Solution

Approach

The best way to balance a binary search tree is to first extract all the elements in a sorted order. Then, using this sorted list, reconstruct a perfectly balanced tree, ensuring optimal height and distribution of elements.

Here's how the algorithm would work step-by-step:

  1. First, imagine taking all the values out of the tree and arranging them in order from smallest to largest.
  2. Next, think of rebuilding the tree from scratch using this sorted list of values.
  3. To make sure the tree is balanced, we'll pick the middle value from the sorted list to be the root of the tree.
  4. Then, we'll split the remaining sorted values into two groups: those smaller than the root (for the left side of the tree) and those larger than the root (for the right side).
  5. We'll repeat the process of picking the middle value for each of these smaller groups to create the left and right subtrees.
  6. Continue doing this until we've used all the values from the original sorted list to build the balanced tree.

Code Implementation

class TreeNode:
    def __init__(self, value=0, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def balance_bst(root):
    def inorder_traversal(node, sorted_list):
        if not node:
            return
        inorder_traversal(node.left, sorted_list)
        sorted_list.append(node.value)
        inorder_traversal(node.right, sorted_list)

    sorted_nodes = []
    inorder_traversal(root, sorted_nodes)

    # Build balanced BST from sorted list.
    def build_balanced_tree(sorted_list):
        if not sorted_list:
            return None

        middle_index = len(sorted_list) // 2

        # Root is middle element to ensure balance.
        root_node = TreeNode(sorted_list[middle_index])
        
        root_node.left = build_balanced_tree(sorted_list[:middle_index])
        # Right subtree contains elements greater than root
        root_node.right = build_balanced_tree(sorted_list[middle_index + 1:])
        return root_node

    return build_balanced_tree(sorted_nodes)

Big(O) Analysis

Time Complexity
O(n)The algorithm consists of two main phases: in-order traversal to extract sorted elements, and constructing the balanced tree from the sorted list. The in-order traversal visits each of the n nodes in the tree exactly once, contributing O(n). Building the balanced tree recursively also processes each of the n elements in the sorted array once. Specifically, each element is considered as the root of a subtree exactly once, thus this phase also contributes O(n). The overall complexity is therefore O(n) + O(n), which simplifies to O(n).
Space Complexity
O(N)The algorithm first extracts all N nodes from the binary search tree into a sorted list. This sorted list requires O(N) auxiliary space. While rebuilding the tree using recursion, the maximum depth of the recursion call stack can reach N in the worst-case (for a highly unbalanced tree). Therefore, considering both the sorted list and the recursive calls, the auxiliary space complexity is dominated by the sorted list's size, leading to O(N).

Edge Cases

CaseHow to Handle
Null or Empty TreeReturn null immediately as there's nothing to balance.
Single Node TreeReturn the original node as it's already balanced.
Tree with only left children (skewed left)The balancing algorithm should correctly re-arrange the nodes to reduce height.
Tree with only right children (skewed right)The balancing algorithm must rearrange nodes to distribute them evenly and balance the tree.
Tree with duplicate valuesEnsure balancing algorithm (e.g., AVL/Red-Black) handles duplicate keys correctly without violating BST properties, potentially by adjusting insertion logic.
Very large tree (approaching memory limits)Consider iterative inorder traversal to avoid stack overflow, and verify algorithm scales efficiently in terms of both time and memory.
Tree with negative and positive valuesThe balancing algorithm should handle both positive and negative values correctly maintaining the BST property during balancing.
Integer overflow during value calculations (if applicable for balancing criteria)Use appropriate data types or checks to prevent integer overflows when calculating heights or differences.