Taro Logo

Closest Node to Path in Tree

Hard
Asked by:
Profile picture
3 views
Topics:
Graphs

You are given a tree (i.e. a connected, undirected graph that has no cycles) consisting of n nodes numbered from 0 to n - 1 and exactly n - 1 edges. The root of the tree is node 0, and each node of the tree has a distinct value in the range [1, n].

You are also given a 2D integer array queries of size k, where queries[i] = [nodei, vali].

For each query i, find the node that is closest to nodei and has the value vali. The answer to query i is the distance between the closest node and nodei. If multiple nodes have the same distance to nodei, choose the node with the smallest number. If no node has the value vali, the answer is -1.

Return an integer array of size k where answer[i] is the answer to the ith query.

Example 1:

Input: n = 7, edges = [[0,1],[0,2],[1,3],[1,4],[2,5],[2,6]], queries = [[3,4],[5,3],[2,4],[0,3]]
Output: [0,-1,1,2]
Explanation: The tree is shown above.
For the 1st query: The node of value 4 is 1. The distance between node 3 and node 1 is |3-1| = 2. The node of value 4 is 4. The distance between node 3 and node 4 is |3-4| = 1. The closest node of value 4 to node 3 is 4, so the answer to the 1st query is 1.
For the 2nd query: There is no node of value 3 in the tree, so the answer to the 2nd query is -1.
For the 3rd query: The node of value 4 is 1. The distance between node 2 and node 1 is |2-1| = 1. The node of value 4 is 4. The distance between node 2 and node 4 is |2-4| = 2. The closest node of value 4 to node 2 is 1, so the answer to the 3rd query is 1.
For the 4th query: There is no node of value 3 in the tree, so the answer to the 4th query is 2.

Example 2:

Input: n = 4, edges = [[0,1],[0,2],[2,3]], queries = [[0,3],[3,1],[2,3]]
Output: [3,-1,0]

Constraints:

  • 1 <= n <= 105
  • edges.length == n - 1
  • edges[i].length == 2
  • 0 <= ai, bi < n
  • queries[i].length == 2
  • 0 <= nodei < n
  • 1 <= vali <= n
  • The value of each node in the tree is distinct.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can you elaborate on the structure of the tree? Is it a binary tree, or can nodes have an arbitrary number of children?
  2. What defines the 'path'? Is it given as a list of node values, node objects, or something else?
  3. How is 'closest' defined? Is it based on distance in terms of the number of edges, or is there some other metric to consider?
  4. What is the range of values that the nodes can hold, and can they be negative or non-integer?
  5. If multiple nodes are equally 'close' to the path, which one should I return?

Brute Force Solution

Approach

The brute force method explores every possible node in the tree to find the closest one to the given path. We essentially look at each node individually and check how far away it is from the path. The node with the smallest distance is then selected.

Here's how the algorithm would work step-by-step:

  1. First, get all the nodes that are part of the given path in the tree. We need to know which nodes are considered to be on the 'target' path.
  2. Now, consider each node in the entire tree one at a time. For each of these 'test' nodes, we will calculate how far it is from the given path.
  3. To figure out the distance, find the shortest path from the current 'test' node to any node that is part of the 'target' path. This could involve tracing all possible routes from the 'test' node until we reach a node on the 'target' path.
  4. After calculating the distance from the current 'test' node to the path, store this distance. We also need to remember which node this distance belongs to.
  5. Repeat steps two through four for every single node in the tree. This ensures that we check every possible candidate.
  6. Once we have checked all the nodes, compare all the distances we stored. The node that has the smallest distance to the 'target' path is the closest node. If multiple nodes have the same minimum distance, we can pick any one of them.

Code Implementation

def closest_node_to_path_brute_force(tree, path, start_node): 
    path_nodes = set(path)
    min_distance = float('inf')
    closest_node = None

    # Iterate through each node in the tree
    for current_node in tree:
        # Calculate the distance to the closest node on the path
        current_min_distance = float('inf')
        for target_node in path_nodes:
            distance = find_distance(tree, current_node, target_node)
            current_min_distance = min(current_min_distance, distance)

        # Update the minimum distance and closest node if needed
        if current_min_distance < min_distance:

            #If we find a closer node update values
            min_distance = current_min_distance
            closest_node = current_node
        elif current_min_distance == min_distance:
            #if the same distance pick one node
            pass

    return closest_node

def find_distance(tree, start_node, end_node):
    if start_node == end_node:
        return 0

    visited = {start_node}
    queue = [(start_node, 0)]

    while queue:
        node, distance = queue.pop(0)
        for neighbor in get_neighbors(tree, node):
            if neighbor == end_node:
                return distance + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, distance + 1))

    return float('inf')

def get_neighbors(tree, node):
    neighbors = []
    for start, end in tree:
        if start == node:
            neighbors.append(end)
        if end == node:
            neighbors.append(start)
    return neighbors

Big(O) Analysis

Time Complexity
O(n²)Finding all nodes in the path takes O(n) in the worst case, where n is the number of nodes in the tree. Then, for each of the n nodes in the tree, we calculate the shortest path to any node in the target path. Finding the shortest path in the worst case can take O(n). Thus, in the worst-case scenario, the nested iterations result in O(n * n) operations. This simplifies to O(n²).
Space Complexity
O(N)The algorithm stores all nodes of the path in a data structure, requiring space proportional to the path's length, which can be at most N where N is the total number of nodes in the tree. Furthermore, when calculating the shortest distance from each node to the path, it implicitly uses a queue (or stack if implemented recursively) for traversal, potentially visiting all nodes in the worst case. Therefore, the space complexity is dominated by storing the path nodes and the traversal data structure, resulting in O(N) auxiliary space.

Optimal Solution

Approach

The most efficient approach involves first finding the path in the tree and then searching for the closest node. We can find the path using Depth-First Search (DFS). After finding the path, we use Breadth-First Search (BFS) starting from all nodes on the path to find the closest node to the path efficiently.

Here's how the algorithm would work step-by-step:

  1. First, find the specific route through the tree that we're interested in. We can do this by exploring the tree, branch by branch, until we find that route.
  2. Now that we have the route, consider all the locations along that route as your starting points.
  3. Imagine ripples spreading out from each of those starting points simultaneously. The first location a ripple reaches is the closest location to that route.
  4. In other words, explore the tree level by level starting from the nodes on the path until we encounter a node not already visited. That node is the closest.
  5. If multiple nodes are equally close, any of them would be a valid answer. Return the first one found.

Code Implementation

from collections import deque

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def closest_node_to_path(root_node, start_node, end_node):
    path_nodes = find_path(root_node, start_node, end_node)

    if not path_nodes:
        return None

    queue = deque([(node, 0) for node in path_nodes])
    visited_nodes = set(path_nodes)

    # Begin BFS to find the closest node.
    while queue:
        current_node, distance = queue.popleft()

        if current_node.left and current_node.left not in visited_nodes:
            return current_node.left

        if current_node.right and current_node.right not in visited_nodes:
            return current_node.right

        if current_node.left:
            queue.append((current_node.left, distance + 1))
            visited_nodes.add(current_node.left)

        if current_node.right:
            queue.append((current_node.right, distance + 1))
            visited_nodes.add(current_node.right)

    return None

def find_path(root_node, start_node, end_node):
    path_nodes = []
    path_found = find_path_recursive(root_node, start_node, end_node, path_nodes)

    if path_found:
        return path_nodes
    else:
        return []

def find_path_recursive(current_node, start_node, end_node, path_nodes):
    if not current_node:
        return False

    path_nodes.append(current_node)

    if current_node == end_node:
        return True

    # Prioritize left subtree exploration.
    if current_node.left and find_path_recursive(current_node.left, start_node, end_node, path_nodes):
        return True

    # Then explore the right subtree.
    if current_node.right and find_path_recursive(current_node.right, start_node, end_node, path_nodes):
        return True

    # Backtrack if neither subtree leads to the end node.
    path_nodes.pop()
    return False

Big(O) Analysis

Time Complexity
O(n)Finding the path using Depth-First Search (DFS) in a tree takes O(n) time in the worst case, where n is the number of nodes in the tree, because DFS might visit every node. The Breadth-First Search (BFS) starts from the nodes of the path and expands outwards until it finds a node not on the path. In the worst-case scenario, BFS might visit all nodes of the tree also, but because in a tree the number of edges and nodes have a linear relationship, the BFS will at most visit all n nodes in O(n). Thus, the time complexity is bounded by O(n) + O(n), which simplifies to O(n).
Space Complexity
O(N)The Depth-First Search (DFS) to find the path can use a recursion stack that, in the worst-case scenario (e.g., a skewed tree), grows linearly with the number of nodes N. The Breadth-First Search (BFS) uses a queue to store nodes to visit, which, in the worst case, might contain all nodes in the level closest to the path, potentially also growing to a size proportional to N. We also need to keep track of visited nodes, which in the worst case can be all N nodes. Therefore, the auxiliary space complexity is O(N).

Edge Cases

CaseHow to Handle
Null or empty treeReturn an empty list or null, depending on specification, as there is no path or node to compare against.
Null or empty pathReturn an empty list or null, depending on specification, as there's nothing to search for proximity to.
Tree with only one nodeCompare the single node to the path and return it if the path exists, otherwise return an empty list or null.
Path with only one nodeFind the closest node in the tree to this single node in the path.
All nodes in the tree are identicalReturn the first encountered node during tree traversal if it's closest to any node in the path, otherwise return null or empty list.
No path exists between the specified start and end nodesIf no path exists, the path variable will be empty, and an empty list or null should be returned.
Integer overflow during distance calculationsUse appropriate data types (e.g., long) or modular arithmetic to prevent overflows during distance calculations.
Deeply unbalanced tree leading to stack overflow during recursionConvert the recursive approach to an iterative approach using a stack or queue to avoid stack overflow errors.