Taro Logo

Tree Diameter

Medium
Asked by:
Profile picture
Profile picture
Profile picture
Profile picture
+1
More companies
Profile picture
46 views
Topics:
TreesGraphs

Given a tree (i.e., a connected, undirected graph that has no cycles) consisting of n nodes numbered from 0 to n - 1 and exactly n - 1 edges. The diameter of a tree is the length of the longest path between any two nodes in the tree.

Return the diameter of the tree.

Example 1:

Input: edges = [[0,1],[0,2]]
Output: 2
Explanation: 
A longest path of the tree is the path 1 - 0 - 2.

Example 2:

Input: edges = [[0,1],[1,2],[2,3],[1,4],[4,5]]
Output: 4
Explanation: 
A longest path of the tree is the path 3 - 2 - 1 - 4 - 5.

Constraints:

  • n == edges.length + 1
  • 1 <= n <= 104
  • 0 <= edges[i][0], edges[i][1] < n
  • There are no self-loops or repeated edges.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the format of the input? Will the tree be represented as an adjacency list, an array of edges, or another data structure?
  2. Can the tree be empty? What should I return if the tree is empty?
  3. Are the nodes in the tree weighted? If so, what are the possible values of the weights?
  4. Is the tree guaranteed to be connected? Or could there be multiple disconnected components?
  5. If there are multiple paths that constitute the diameter, is any one of them acceptable?

Brute Force Solution

Approach

To find the longest path in a tree using the brute force approach, we'll explore every possible path between every pair of nodes in the tree. This means checking each possible start and end point. We'll then see which path is the longest.

Here's how the algorithm would work step-by-step:

  1. Pick any two nodes in the tree.
  2. Find the path between these two nodes.
  3. Measure the length of this path (the number of connections/edges in the path).
  4. Repeat this process for every possible pair of nodes in the tree.
  5. Keep track of the longest path you've found so far.
  6. After checking all pairs of nodes, the longest path you've kept track of is the diameter of the tree.

Code Implementation

def tree_diameter_brute_force(graph):
    all_nodes = list(graph.keys())
    number_of_nodes = len(all_nodes)
    longest_path = 0

    for start_node_index in range(number_of_nodes):
        for end_node_index in range(start_node_index + 1, number_of_nodes):
            start_node = all_nodes[start_node_index]
            end_node = all_nodes[end_node_index]

            # Find the path between the two nodes.
            path = find_path(graph, start_node, end_node)

            # Measure the length of the path.
            if path:
                path_length = len(path) - 1

                # Keep track of the longest path.
                longest_path = max(longest_path, path_length)

    return longest_path

def find_path(graph, start_node, end_node):
    queue = [(start_node, [start_node])]
    visited = set()

    while queue:
        (current_node, path) = queue.pop(0)

        if current_node == end_node:
            return path

        visited.add(current_node)

        for neighbor in graph[current_node]:
            if neighbor not in visited:
                queue.append((neighbor, path + [neighbor]))

    return None

Big(O) Analysis

Time Complexity
O(n^3)The provided brute force approach involves iterating through all possible pairs of nodes in the tree, where n is the number of nodes. This already takes O(n^2) time. For each pair of nodes, we then need to find the path between them, which, in the worst case, could involve traversing most of the tree's edges and taking O(n) time. Therefore, the overall time complexity becomes O(n^2) * O(n) = O(n^3).
Space Complexity
O(N^2)The brute force approach, as described, calculates the path between every pair of nodes. To find each path, a depth-first search (DFS) or breadth-first search (BFS) is implicitly used. In the worst-case scenario, where the tree is a linked list, finding the path between two nodes may require storing a path of length N in the auxiliary space, where N is the number of nodes in the tree. Since we are doing this for all N * (N - 1) / 2 pairs of nodes, this can lead to storing paths of length N in the worst case for a majority of the node pairs, so the space complexity is O(N^2) because it requires storing information about a large number of paths between node pairs in the worst-case scenario (a tree that resembles a linked list). Therefore, the overall auxiliary space complexity is O(N^2).

Optimal Solution

Approach

The tree diameter problem aims to find the longest path between any two nodes in a tree. The optimal approach cleverly avoids checking every possible path by using a depth-first search strategy to find the farthest node from an arbitrary starting point, and then repeats the process from that farthest node.

Here's how the algorithm would work step-by-step:

  1. Pick any node in the tree and treat it as a starting point.
  2. Perform a search to find the node that is farthest away from the starting node.
  3. Now, take the node that was farthest away in the first search, and treat it as the new starting point.
  4. Perform another search to find the node that is farthest away from this new starting point.
  5. The path between these two farthest nodes that we've identified is the diameter of the tree.

Code Implementation

def tree_diameter(graph):
    def depth_first_search(start_node):
        visited = {}
        max_distance = 0
        farthest_node = start_node
        stack = [(start_node, 0)]
        visited[start_node] = True

        while stack:
            node, distance = stack.pop()

            if distance > max_distance:
                max_distance = distance
                farthest_node = node

            for neighbor in graph[node]:
                if neighbor not in visited:
                    visited[neighbor] = True
                    stack.append((neighbor, distance + 1))

        return farthest_node

    # Find the farthest node from an arbitrary starting node.
    arbitrary_node = next(iter(graph))
    farthest_node_one = depth_first_search(arbitrary_node)

    # Find the farthest node from the first farthest node.
    farthest_node_two = depth_first_search(farthest_node_one)

    # Calculating diameter using two dfs calls.
    visited = {}
    stack = [(farthest_node_one, 0)]
    visited[farthest_node_one] = True
    max_diameter = 0

    while stack:
        node, distance = stack.pop()

        if node == farthest_node_two:
            max_diameter = distance
            break

        for neighbor in graph[node]:
            if neighbor not in visited:
                visited[neighbor] = True
                stack.append((neighbor, distance + 1))

    return max_diameter

Big(O) Analysis

Time Complexity
O(n)The algorithm performs two depth-first searches (DFS). Each DFS explores all n nodes in the tree to find the farthest node from a given starting point. Since there are only two independent DFS traversals, the total time complexity is dominated by the sum of the cost of these two traversals, which is 2 * O(n). Therefore, the overall time complexity simplifies to O(n).
Space Complexity
O(N)The provided algorithm uses depth-first search (DFS) twice. DFS employs a recursion stack to keep track of the current path. In the worst-case scenario, the tree can be highly skewed, resembling a linked list, where each node has only one child. This leads to a maximum recursion depth of N, where N is the number of nodes in the tree. Thus, the space complexity is determined by the maximum depth of the recursion stack, which is O(N).

Edge Cases

Null or empty tree
How to Handle:
Return 0 as the diameter of an empty tree is defined as zero.
Tree with only one node
How to Handle:
Return 0 as the diameter of a single node tree is zero.
Tree with only two nodes
How to Handle:
Return 1 as the diameter is the single edge connecting the two nodes.
Skewed tree (all nodes in a single branch)
How to Handle:
The algorithm should correctly calculate the longest path through the skewed tree.
Complete binary tree
How to Handle:
Ensure the algorithm correctly handles balanced trees and calculates the diameter which goes through the root.
Large tree to check for stack overflow with recursion
How to Handle:
Consider iterative approach if the recursion depth becomes a bottleneck.
Tree with negative node values (although node values are irrelevant)
How to Handle:
The algorithm works correctly with negative node values as only the structure of the tree matters.
Integer overflow if the depth is too large.
How to Handle:
Use a data type with larger capacity (e.g. long) to store the diameter if required to prevent integer overflow.