Taro Logo

Number of Connected Components in an Undirected Graph

Medium
Meta logo
Meta
2 views
Topics:
Graphs

You are given an undirected graph represented by n nodes (labeled from 0 to n - 1) and a list of edges, where each edge is a pair of node indices. The goal is to determine the number of connected components in the graph. A connected component is a subgraph in which any two vertices are connected to each other by a path.

Example 1:

n = 5
edges = [[0, 1], [1, 2], [3, 4]]

In this example, there are two connected components: {0, 1, 2} and {3, 4}. Therefore, the function should return 2.

Example 2:

n = 5
edges = [[0, 1], [1, 2], [2, 3], [3, 4]]

In this case, all nodes are connected, forming a single connected component. The function should return 1.

Example 3:

n = 6
edges = [[0, 1], [0, 2], [2, 5], [5, 4], [3, 3]]

Here, the connected components are {0, 1, 2, 4, 5} and {3}. The self-loop [3, 3] does not affect the number of components, so the function should return 2.

Constraints:

  • 1 <= n <= 2000
  • 0 <= edges.length <= 5000
  • edges[i].length == 2
  • 0 <= ai, bi < n
  • ai != bi

Write a function that takes the number of nodes n and the list of edges edges as input and returns the number of connected components in the graph. Discuss the time and space complexity of your solution. Can you think of multiple ways to solve this problem, and what are the trade-offs?

Solution


Number of Connected Components in an Undirected Graph

Problem Description

Given an undirected graph with n nodes and a list of edges, determine the number of connected components in the graph.

For example, consider a graph with 5 nodes (numbered 0 to 4) and the following edges:

edges = [[0, 1], [1, 2], [3, 4]]

In this case, there are two connected components: {0, 1, 2} and {3, 4}. Therefore, the expected output is 2.

Brute Force Solution: Depth-First Search (DFS)

A brute-force approach involves iterating through each node in the graph and performing a Depth-First Search (DFS) to explore its connected component. We can maintain a visited set to keep track of nodes we have already processed. For each unvisited node, we increment the component count and perform DFS to mark all nodes in its connected component as visited.

Algorithm:

  1. Initialize count = 0 and visited = set().
  2. Build the adjacency list representation of the graph from the given edges.
  3. Iterate through each node from 0 to n-1.
  4. If the node is not in visited:
    • Increment count.
    • Perform DFS starting from that node to mark all reachable nodes as visited.
  5. Return count.

Python Code:

def count_components_dfs(n, edges):
    adj_list = [[] for _ in range(n)]
    for u, v in edges:
        adj_list[u].append(v)
        adj_list[v].append(u)

    visited = set()
    count = 0

    def dfs(node):
        visited.add(node)
        for neighbor in adj_list[node]:
            if neighbor not in visited:
                dfs(neighbor)

    for i in range(n):
        if i not in visited:
            count += 1
            dfs(i)

    return count

Time and Space Complexity:

  • Time Complexity: O(V + E), where V is the number of vertices (nodes) and E is the number of edges. This is because we visit each node and edge at most once during the DFS traversal.
  • Space Complexity: O(V + E) because we store the graph in an adjacency list. In the worst case, the size of the adjacency list can be proportional to the number of edges. Also, the visited set and recursion stack in DFS can take up O(V) space.

Optimal Solution: Disjoint Set Union (DSU)

A more efficient approach to solve this problem is using the Disjoint Set Union (DSU) data structure, also known as Union-Find.

Algorithm:

  1. Initialize a DSU data structure with n nodes, where each node is initially in its own set.
  2. Iterate through each edge (u, v) in the edges list.
  3. Perform a union operation on u and v to merge their sets.
  4. After processing all edges, the number of connected components is equal to the number of disjoint sets remaining.

Python Code:

def count_components_dsu(n, edges):
    parent = list(range(n))
    rank = [0] * n

    def find(node):
        if parent[node] != node:
            parent[node] = find(parent[node])  # Path compression
        return parent[node]

    def union(node1, node2):
        root1 = find(node1)
        root2 = find(node2)
        if root1 != root2:
            if rank[root1] < rank[root2]:
                parent[root1] = root2
            elif rank[root1] > rank[root2]:
                parent[root2] = root1
            else:
                parent[root2] = root1
                rank[root1] += 1
            return 1  # Successfully unioned
        return 0  # Already in the same component

    count = n
    for u, v in edges:
        count -= union(u, v)

    return count

Time and Space Complexity:

  • Time Complexity: O(E * α(V)), where E is the number of edges, V is the number of vertices, and α(V) is the inverse Ackermann function, which grows extremely slowly. For practical purposes, α(V) can be considered to be constant (less than 5 for any conceivable input size). Therefore, the time complexity is nearly O(E).
  • Space Complexity: O(V), to store the parent and rank arrays in the DSU data structure.

Edge Cases

  • Empty Graph: If n = 0, the number of connected components is 0.
  • No Edges: If edges is empty, the number of connected components is n (each node is its own component).
  • Disconnected Graph: The general case of a graph with multiple connected components, as demonstrated in the examples.
  • Fully Connected Graph: If every node is connected to every other node, the number of connected components is 1.
  • Graph with Self-Loops: Self-loops do not affect the number of connected components. The algorithm should handle them correctly (DSU does inherently).
  • Duplicate Edges: Duplicate edges do not affect the number of connected components. The algorithm should handle them correctly (DSU does inherently).