Taro Logo

Count Unreachable Pairs of Nodes in an Undirected Graph

Medium
Asked by:
Profile picture
Profile picture
12 views
Topics:
Graphs

You are given an integer n. There is an undirected graph with n nodes, numbered from 0 to n - 1. You are given a 2D integer array edges where edges[i] = [ai, bi] denotes that there exists an undirected edge connecting nodes ai and bi.

Return the number of pairs of different nodes that are unreachable from each other.

Example 1:

Input: n = 3, edges = [[0,1],[0,2],[1,2]]
Output: 0
Explanation: There are no pairs of nodes that are unreachable from each other. Therefore, we return 0.

Example 2:

Input: n = 7, edges = [[0,2],[0,5],[2,4],[1,6],[5,4]]
Output: 14
Explanation: There are 14 pairs of nodes that are unreachable from each other:
[[0,1],[0,3],[0,6],[1,2],[1,3],[1,4],[1,5],[2,3],[2,6],[3,4],[3,5],[3,6],[4,6],[5,6]].
Therefore, we return 14.

Constraints:

  • 1 <= n <= 105
  • 0 <= edges.length <= 2 * 105
  • edges[i].length == 2
  • 0 <= ai, bi < n
  • ai != bi
  • There are no repeated edges.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the number of nodes 'n' and the number of edges? What is the maximum possible value?
  2. Are the node values labeled from 0 to n-1, or can they be any arbitrary integer?
  3. Can the input graph be disconnected, or is it guaranteed to be a single connected component to begin with?
  4. Can there be self-loops (an edge from a node to itself) or parallel edges (multiple edges between the same two nodes) in the input?
  5. If the graph is empty (no edges), should the output represent all possible pairs of unreachable nodes?

Brute Force Solution

Approach

The core idea is to check every possible pair of nodes in the graph. For each pair, we'll figure out if there's a path connecting them, and if there isn't, we'll count that pair as unreachable.

Here's how the algorithm would work step-by-step:

  1. Consider every possible combination of two nodes in the graph.
  2. For each pair of nodes, try to find a path between them.
  3. To find a path, start at one node and explore all its neighbors. Then explore the neighbors of those neighbors, and so on.
  4. Keep doing this until you either find the other node in the pair, or you've visited every node connected to the starting node.
  5. If you find a path between the two nodes, they are reachable, so don't count them.
  6. If you can't find a path after exploring every possible connection from the starting node, then the pair of nodes is unreachable. Count this pair.
  7. After checking all possible pairs of nodes, the final count of unreachable pairs is your answer.

Code Implementation

def count_unreachable_pairs(number_of_nodes, edges):
    graph = [[] for _ in range(number_of_nodes)]
    for node1, node2 in edges:
        graph[node1].append(node2)
        graph[node2].append(node1)

    unreachable_pairs_count = 0
    for first_node in range(number_of_nodes):
        for second_node in range(first_node + 1, number_of_nodes):

            # We start a breadth first search to find a path
            visited = [False] * number_of_nodes
            queue = [first_node]
            visited[first_node] = True

            while queue:
                current_node = queue.pop(0)

                if current_node == second_node:
                    break

                for neighbor in graph[current_node]:
                    if not visited[neighbor]:
                        queue.append(neighbor)
                        visited[neighbor] = True

            # If a path was not found, increment the counter
            if not visited[second_node]:
                unreachable_pairs_count += 1

    return unreachable_pairs_count

Big(O) Analysis

Time Complexity
O(n^3)The algorithm iterates through all possible pairs of nodes, which takes O(n^2) time where n is the number of nodes. For each pair of nodes, a pathfinding algorithm (likely Depth-First Search or Breadth-First Search based on the problem description) is executed to determine reachability. In the worst case, the pathfinding algorithm might visit all nodes and edges in the graph which would take O(n) time because we can assume number of edges is in the same order of magnitude of the number of nodes for a single component. Therefore, the overall time complexity becomes O(n^2 * n) = O(n^3).
Space Complexity
O(N)The algorithm uses a depth-first search (DFS) to find a path between node pairs. The DFS implementation implicitly utilizes a call stack that, in the worst case (e.g., a graph that is essentially a long chain), can grow to a depth of N, where N is the number of nodes in the graph. Additionally, to keep track of visited nodes during path finding, a boolean array of size N (visited array) is used to prevent cycles. Therefore, the space complexity is dominated by the recursion stack and the visited array, both of which require O(N) space.

Optimal Solution

Approach

The problem asks to count pairs of nodes that cannot reach each other in a graph. The key idea is to find groups of connected nodes and use the sizes of these groups to figure out unreachable pairs more efficiently than checking all pairs individually.

Here's how the algorithm would work step-by-step:

  1. First, identify all the separate groups of nodes that are connected to each other.
  2. For each node, find the group it belongs to. This tells us all the nodes it *can* reach.
  3. Count the size (number of nodes) of each group.
  4. Now, consider each group one by one. For each group, calculate how many nodes are *not* in that group. This is the number of nodes that the nodes in the current group cannot reach.
  5. Multiply the size of the current group by the number of nodes not in the group. This gives the number of unreachable pairs from the current group.
  6. Add up the number of unreachable pairs from all the groups. However, since each unreachable pair is counted twice (once for each node in the pair), divide the final result by two to get the correct number of unique unreachable node pairs.

Code Implementation

def count_unreachable_pairs(number_of_nodes, edges):
    graph = [[] for _ in range(number_of_nodes)]
    for node_a, node_b in edges:
        graph[node_a].append(node_b)
        graph[node_b].append(node_a)

    visited = [False] * number_of_nodes
    components = []

    def depth_first_search(node, current_component):
        visited[node] = True
        current_component.append(node)
        for neighbor in graph[node]:
            if not visited[neighbor]:
                depth_first_search(neighbor, current_component)

    # Find all connected components
    for node in range(number_of_nodes):
        if not visited[node]:
            current_component = []
            depth_first_search(node, current_component)
            components.append(current_component)

    component_sizes = [len(component) for component in components]
    total_unreachable_pairs = 0

    # Iterate through the sizes of connected components
    for component_size in component_sizes:
        # Count nodes not reachable from the current component.
        unreachable_nodes = number_of_nodes - component_size
        total_unreachable_pairs += component_size * unreachable_nodes

    # Each pair was counted twice so divide by 2
    return total_unreachable_pairs // 2

Big(O) Analysis

Time Complexity
O(n + m)The algorithm involves finding connected components using Depth First Search or Breadth First Search. Identifying connected components typically takes O(n + m) time, where n is the number of nodes and m is the number of edges. Counting the sizes of the groups is O(n). The subsequent calculations, involving iterating through groups and computing unreachable pairs, are also O(n), where n is the number of nodes. Since graph traversal (finding components) dominates, the overall time complexity is O(n + m). In a sparse graph where m is much smaller than n^2, this is more efficient than checking all possible pairs of nodes.
Space Complexity
O(N)The algorithm identifies connected components using a data structure like a Disjoint Set Union (DSU) or Depth-First Search (DFS), which requires space proportional to the number of nodes (N) to store parent pointers or visited status for each node. The sizes of these connected components are also stored, requiring another array of size N. Therefore, the dominant space complexity is determined by storing connected component information for each node, leading to O(N) auxiliary space.

Edge Cases

Empty graph (n = 0)
How to Handle:
Return 0, as there are no nodes and thus no pairs.
Single node graph (n = 1)
How to Handle:
Return 0, as there are no pairs of nodes.
Graph with no edges
How to Handle:
Each node is its own connected component; the answer is n * (n - 1) / 2.
Fully connected graph
How to Handle:
Return 0, as all nodes are reachable from each other.
Graph with a single connected component
How to Handle:
Return 0, as all nodes are reachable from each other.
Large graph (n close to limit) to check for integer overflow in calculations
How to Handle:
Use 64-bit integers (long) to store the number of nodes in each component and the total count of unreachable pairs.
Graph with many small connected components
How to Handle:
Ensure the algorithm efficiently iterates through all components and correctly calculates the unreachable pairs for each combination.
Disjoint graph with one very large component, and many very small components
How to Handle:
Ensure the algorithm doesn't time out calculating the large component's size using an efficient graph traversal method like Depth-First Search (DFS) or Breadth-First Search (BFS).