Taro Logo

Largest BST Subtree

Medium
TikTok logo
TikTok
0 views
Topics:
TreesRecursionDynamic Programming

Given a binary tree, find the size of the largest subtree which is a Binary Search Tree (BST).

A valid BST is defined as follows:

  • The left subtree of a node contains only nodes with keys less than the node's key.
  • The right subtree of a node contains only nodes with keys greater than the node's key.
  • Both the left and right subtrees must also be binary search trees.

Example 1:

Input: root = [10,5,15,1,8,null,7]Output: 3Explanation: The largest BST subtree is the one rooted at node 5. The nodes are 5, 1, and 8, giving a size of 3. The full tree is not a BST because node 7 in the right subtree of 15 is smaller than 10.

Example 2:

Input: root = [1,4,3,2]Output: 2Explanation:The largest BST subtree is the one rooted at node 3 with a left child of 2.

Example 3:

Input: root = [4,2,7,2,3,5,8]Output: 4Explanation:The largest BST subtree is the one rooted at node 7 with left child 5 and right child 8.

Constraints:

  • The number of nodes in the tree is in the range [0, 10⁴].
  • -10⁴ <= Node.val <= 10⁴

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the tree be empty? If so, what should I return?
  2. What is the range of values that the nodes in the tree can take? Are negative values possible?
  3. If multiple BST subtrees of the same maximum size exist, is any one of them acceptable?
  4. Are duplicate values allowed in the binary tree, and if so, how should they be handled when determining if a subtree is a valid BST?
  5. By 'largest', are we referring to the number of nodes in the subtree, or some other measure (like the sum of node values)?

Brute Force Solution

Approach

The brute force way to find the largest Binary Search Tree (BST) subtree involves checking every possible subtree within the larger tree. We consider each node as the potential root of a BST and verify if the subtree rooted at that node actually follows the BST properties. We keep track of the largest valid BST subtree found.

Here's how the algorithm would work step-by-step:

  1. Start with the root of the entire tree.
  2. Treat this root node as the potential root of a BST subtree and check if the entire tree rooted at this node is a valid BST.
  3. If it's a valid BST, remember its size (number of nodes).
  4. Now, move to the left child of the original root and repeat the same process: treat it as a potential BST root and check the subtree.
  5. Also, do the same for the right child of the original root.
  6. Continue this process, going deeper and deeper into the tree, checking every node as a potential BST root.
  7. For each potential BST, make sure that every node on the left is smaller than the root of that subtree and every node on the right is larger than the root.
  8. While checking all these subtrees, keep track of the size of the largest BST you have found so far.
  9. After checking all possible subtrees, the largest size you have remembered is the size of the largest BST subtree in the original tree.

Code Implementation

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def largest_bst_subtree_brute_force(root):

    def is_bst(node, minimum_value, maximum_value):
        if node is None:
            return True

        # Check BST property at current node
        if node.data <= minimum_value or node.data >= maximum_value:
            return False

        # Recursively check left and right subtrees
        return (is_bst(node.left, minimum_value, node.data) and\
                is_bst(node.right, node.data, maximum_value))

    def subtree_size(node):
        if node is None:
            return 0
        return 1 + subtree_size(node.left) + subtree_size(node.right)

    if root is None:
        return 0

    # Check if the subtree rooted at 'root' is a BST
    if is_bst(root, float('-inf'), float('inf')):
        return subtree_size(root)

    # Explore left and right subtrees to find larger BSTs
    left_size = largest_bst_subtree_brute_force(root.left)

    right_size = largest_bst_subtree_brute_force(root.right)

    # Return the maximum size among the subtrees
    return max(left_size, right_size)

Big(O) Analysis

Time Complexity
O(n^2)The algorithm considers each of the n nodes in the binary tree as a potential root of a BST subtree. For each of these n nodes, the `isValidBST` function (implicitly called when checking a subtree) could potentially visit all nodes within that subtree. In the worst-case scenario, a poorly balanced tree could mean each isValidBST check could touch up to n nodes. Therefore, we have n (nodes as potential roots) multiplied by a potential n (nodes to validate the BST rooted at that node) leading to O(n*n) or O(n^2) time complexity in the worst case.
Space Complexity
O(N)The brute force algorithm described uses recursion. In the worst case, the recursion may go as deep as the height of the tree. Since we are checking every possible subtree rooted at each node, in a skewed tree, this height can be N, where N is the number of nodes in the tree. Each recursive call adds a frame to the call stack. Therefore, the auxiliary space used by the recursion stack will be O(N) in the worst-case scenario due to the function call stack.

Optimal Solution

Approach

The most efficient way to solve this is to work from the bottom up. We check if each node is the root of a valid Binary Search Tree (BST) and keep track of the largest one we find so far. We do this without rechecking work, saving a lot of time.

Here's how the algorithm would work step-by-step:

  1. Start from the very bottom of the tree (the leaves).
  2. For each node, check if the subtree rooted at that node is a valid BST.
  3. To do this check, make sure the current node's value is greater than all values in its left subtree and smaller than all values in its right subtree.
  4. If it's a valid BST, also figure out the number of nodes in this BST.
  5. Keep track of the biggest BST you've found as you go. This means remembering the root of that BST and its size.
  6. As you move up the tree, you can reuse the information you've calculated from the children. If both children are valid BSTs, and the current node maintains the BST property, you know the whole subtree is a valid BST without recalculating everything.
  7. Finally, the largest BST you tracked along the way is the answer.

Code Implementation

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def largest_bst_subtree(root):
    largest_bst_size = 0
    largest_bst_node = None

    def is_bst(node, minimum_value, maximum_value):
        if not node:
            return True, 0

        is_bst_left, size_left = is_bst(node.left, minimum_value, node.val)
        is_bst_right, size_right = is_bst(node.right, node.val, maximum_value)

        # If any subtree is not a BST, the whole tree is not a BST
        if not is_bst_left or not is_bst_right or not (minimum_value <= node.val < maximum_value):
            return False, 0

        return True, size_left + size_right + 1

    def traverse(node):
        nonlocal largest_bst_size, largest_bst_node

        if not node:
            return

        traverse(node.left)
        traverse(node.right)

        is_bst_node, current_size = is_bst(node, float('-inf'), float('inf'))

        # Store the largest BST
        if is_bst_node and current_size > largest_bst_size:
            largest_bst_size = current_size
            largest_bst_node = node

    traverse(root)

    return largest_bst_size

Big(O) Analysis

Time Complexity
O(n)The algorithm visits each node in the tree once using a bottom-up approach. At each node, it performs a constant amount of work to check if the subtree rooted at that node is a valid BST by checking the BST property and the validity of its children, which were already computed. The size of the largest BST is updated during this single traversal. Therefore, the time complexity is directly proportional to the number of nodes, n, in the tree, resulting in O(n).
Space Complexity
O(H)The algorithm uses recursion to traverse the binary tree. The maximum depth of the recursion stack is determined by the height (H) of the tree. In the worst-case scenario, such as a skewed tree, the height H can be equal to N (number of nodes). Therefore, the maximum auxiliary space used by the call stack is proportional to the height of the tree, H. This excludes the space occupied by the tree itself, which is considered the input.

Edge Cases

CaseHow to Handle
Null root nodeReturn 0, as an empty tree has a BST size of 0.
Single node treeReturn 1, as a single node is a valid BST of size 1.
Skewed tree (left or right)The recursive calls should still traverse the entire tree, correctly identifying subtrees and comparing sizes, though the performance will degrade to O(n).
Tree with all identical valuesHandle this by treating identical values as violating the BST property during the BST validation step.
Tree with duplicate values that form valid BST subtreesThe BST validation check should correctly identify and count duplicate values if they satisfy the BST constraints of their respective subtrees.
Integer overflow in subtree size calculationsUse appropriate data types (e.g., long) to store subtree sizes to avoid potential integer overflow issues.
Very deep tree (potential stack overflow in recursive solutions)Consider iterative approaches or tail-call optimization (if the language supports it) to avoid stack overflow errors.
Root of the tree itself is a valid BST.The algorithm should correctly identify this case and return the total number of nodes in the tree.