Correct a Binary Tree

Medium

Google

1 view

Topics:

TreesGraphsRecursion

You are given the root of a binary tree that has been corrupted. In this corrupted binary tree, one node has two parents, while the rest of the tree's structure is valid. Your task is to identify the node with two parents and correct the tree by removing one of the incoming edges to this node, thus restoring the binary tree's original structure.

Clarifications:

Exactly one node has two parents.
All node values are unique.
The input is the root of the corrupted tree.
You should modify the tree in-place and return the root of the corrected tree.

Example:

Consider the following corrupted binary tree:

In this tree, node '3' has two parents: '1' and '5'. Your function should modify the tree to remove the edge '5 -> 3', resulting in the following corrected tree:

Or, alternatively, your function could remove the edge '1 -> 3', resulting in the following corrected tree:

Write a function to correct the binary tree by removing one of the duplicate parent-child relationships. You can choose arbitrarily which duplicate link to remove.

Correct a Binary Tree

Medium

Google

1 view

Topics:

TreesGraphsRecursion

Clarifications:

Exactly one node has two parents.
All node values are unique.
The input is the root of the corrupted tree.
You should modify the tree in-place and return the root of the corrected tree.

Example:

Consider the following corrupted binary tree:

In this tree, node '3' has two parents: '1' and '5'. Your function should modify the tree to remove the edge '5 -> 3', resulting in the following corrected tree:

Or, alternatively, your function could remove the edge '1 -> 3', resulting in the following corrected tree:

Write a function to correct the binary tree by removing one of the duplicate parent-child relationships. You can choose arbitrarily which duplicate link to remove.

Solution

Correct a Binary Tree

Naive Approach

A naive approach to correct a binary tree where exactly one node has two parents would involve traversing the tree to identify the problematic node and its duplicate parents, then severing the incorrect parent link.

Perform a Breadth-First Search (BFS) or Depth-First Search (DFS) to traverse the tree.
Keep track of visited nodes.
If a node is visited more than once, it indicates that this node has multiple parents.
Identify the incorrect parent.
Remove the edge from the incorrect parent to the node.

Example:

Consider a binary tree where node '3' has two parents: '2' and '4'. The naive approach would identify node '3' as having multiple parents and then determine whether the edge '2 -> 3' or '4 -> 3' should be removed based on some criteria (e.g., depth, value, or some other rule).

Code (Python):

from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def correct_binary_tree_naive(root):
    if not root: return None

    parent = {}
    q = deque([(root, None)])

    duplicate_node = None

    while q:
        node, par = q.popleft()

        if node in parent and parent[node] != par:
            duplicate_node = node
            break
        
        parent[node] = par

        if node.left:
            q.append((node.left, node))
        if node.right:
            q.append((node.right, node))

    # Find and remove the incorrect parent link
    q = deque([root])
    while q:
        node = q.popleft()
        if node.left == duplicate_node:
            node.left = None
            break
        if node.right == duplicate_node:
            node.right = None
            break
        if node.left:
            q.append(node.left)
        if node.right:
            q.append(node.right)

    return root

Optimal Approach

The key insight for an optimal solution is to realize that we can use a HashSet to detect the duplicate node during traversal and remove the incorrect edge on-the-fly.

Perform a Depth-First Search (DFS) traversal.
Use a HashSet to keep track of visited nodes.
If a node is found in the HashSet (implying a revisit), remove the connection from its parent.
The removal should occur before adding a node to the visited set.

Code (Python):

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def correct_binary_tree(root):
    visited = set()

    def dfs(node, parent):
        if not node: return None

        if node in visited:
            if parent.left == node:
                parent.left = None
            else:
                parent.right = None
            return

        visited.add(node)
        dfs(node.left, node)
        dfs(node.right, node)

    dfs(root, None)
    return root

Big O Analysis

Naive Approach:

Time Complexity: O(N), where N is the number of nodes in the tree. We traverse the tree twice in the worst case: once to find the duplicate node and once to remove the incorrect parent.
Space Complexity: O(N), primarily due to the queue used in BFS and the dictionary to store parents.

Optimal Approach:

Time Complexity: O(N), where N is the number of nodes in the tree. We visit each node at most once.
Space Complexity: O(N) in the worst-case scenario. This occurs when the tree is heavily skewed, and the recursion stack grows linearly with the number of nodes. The visited set also contributes O(N) in the worst case.

Edge Cases

Empty Tree: If the root is None, the function should return None immediately.
No Incorrect Node: The algorithm should still function correctly if there is no node with two parents (although this violates the problem constraints).
Root Node as Duplicate: Handle the case where the root itself is the duplicate node (though unlikely based on problem description), making sure the removal logic doesn't crash.
Large Trees: The solution should be efficient enough to handle large binary trees without running into memory or time limit issues. The O(N) complexities of both approaches are generally acceptable for most practical tree sizes.

Solution

Correct a Binary Tree

Naive Approach

Perform a Breadth-First Search (BFS) or Depth-First Search (DFS) to traverse the tree.
Keep track of visited nodes.
If a node is visited more than once, it indicates that this node has multiple parents.
Identify the incorrect parent.
Remove the edge from the incorrect parent to the node.

Example:

Code (Python):

from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def correct_binary_tree_naive(root):
    if not root: return None

    parent = {}
    q = deque([(root, None)])

    duplicate_node = None

    while q:
        node, par = q.popleft()

        if node in parent and parent[node] != par:
            duplicate_node = node
            break
        
        parent[node] = par

        if node.left:
            q.append((node.left, node))
        if node.right:
            q.append((node.right, node))

    # Find and remove the incorrect parent link
    q = deque([root])
    while q:
        node = q.popleft()
        if node.left == duplicate_node:
            node.left = None
            break
        if node.right == duplicate_node:
            node.right = None
            break
        if node.left:
            q.append(node.left)
        if node.right:
            q.append(node.right)

    return root

Optimal Approach

The key insight for an optimal solution is to realize that we can use a HashSet to detect the duplicate node during traversal and remove the incorrect edge on-the-fly.

Perform a Depth-First Search (DFS) traversal.
Use a HashSet to keep track of visited nodes.
If a node is found in the HashSet (implying a revisit), remove the connection from its parent.
The removal should occur before adding a node to the visited set.

Code (Python):

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def correct_binary_tree(root):
    visited = set()

    def dfs(node, parent):
        if not node: return None

        if node in visited:
            if parent.left == node:
                parent.left = None
            else:
                parent.right = None
            return

        visited.add(node)
        dfs(node.left, node)
        dfs(node.right, node)

    dfs(root, None)
    return root

Big O Analysis

Naive Approach:

Time Complexity: O(N), where N is the number of nodes in the tree. We traverse the tree twice in the worst case: once to find the duplicate node and once to remove the incorrect parent.
Space Complexity: O(N), primarily due to the queue used in BFS and the dictionary to store parents.

Optimal Approach:

Time Complexity: O(N), where N is the number of nodes in the tree. We visit each node at most once.
Space Complexity: O(N) in the worst-case scenario. This occurs when the tree is heavily skewed, and the recursion stack grows linearly with the number of nodes. The visited set also contributes O(N) in the worst case.

Edge Cases

Empty Tree: If the root is None, the function should return None immediately.
No Incorrect Node: The algorithm should still function correctly if there is no node with two parents (although this violates the problem constraints).
Root Node as Duplicate: Handle the case where the root itself is the duplicate node (though unlikely based on problem description), making sure the removal logic doesn't crash.
Large Trees: The solution should be efficient enough to handle large binary trees without running into memory or time limit issues. The O(N) complexities of both approaches are generally acceptable for most practical tree sizes.