Taro Logo

All Ancestors of a Node in a Directed Acyclic Graph

Medium
Google logo
Google
2 views
Topics:
GraphsDynamic Programming

You are given a positive integer n representing the number of nodes of a Directed Acyclic Graph (DAG). The nodes are numbered from 0 to n - 1 (inclusive).

You are also given a 2D integer array edges, where edges[i] = [from<sub>i</sub>, to<sub>i</sub>] denotes that there is a unidirectional edge from from<sub>i</sub> to to<sub>i</sub> in the graph.

Return a list answer, where answer[i] is the list of ancestors of the i<sup>th</sup> node, sorted in ascending order.

A node u is an ancestor of another node v if u can reach v via a set of edges.

Example 1:

n = 8, edgeList = [[0,3],[0,4],[1,3],[2,4],[2,7],[3,5],[3,6],[3,7],[4,6]] Output: [[],[],[],[0,1],[0,2],[0,1,3],[0,1,2,3,4],[0,1,2,3]]

Example 2:

n = 5, edgeList = [[0,1],[0,2],[0,3],[0,4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]] Output: [[],[0],[0,1],[0,1,2],[0,1,2,3]]

Could you provide an algorithm that efficiently computes the ancestors for each node in the given DAG? Consider the time and space complexity of your approach. How would you handle edge cases such as an empty graph, a disconnected graph, or a graph with a single node?

Solution


Naive Solution: Brute-Force Approach

For each node, we can perform a Depth-First Search (DFS) or Breadth-First Search (BFS) to find all the nodes that can reach it. These reachable nodes are the ancestors. We store these ancestors in a sorted list.

Algorithm:

  1. Initialize an empty list of lists called answer of size n. This list will store the ancestors for each node.
  2. Iterate through each node i from 0 to n-1.
  3. For each node i, perform a DFS or BFS starting from every other node j (from 0 to n-1).
  4. During the DFS/BFS, if we reach node i starting from node j, it means j is an ancestor of i.
  5. Add j to the list of ancestors for node i.
  6. After checking all possible starting nodes j, sort the list of ancestors for node i in ascending order.
  7. Append the sorted list to the answer list at index i.
  8. Return the answer list.

Code (Python):

def get_ancestors_brute_force(n, edges):
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)
    
    ancestors = [[] for _ in range(n)]
    
    for i in range(n):
        for j in range(n):
            if i == j: continue
            visited = [False] * n
            stack = [j]
            visited[j] = True
            while stack:
                node = stack.pop()
                if node == i:
                    ancestors[i].append(j)
                    break
                for neighbor in graph[node]:
                    if not visited[neighbor]:
                        visited[neighbor] = True
                        stack.append(neighbor)
        ancestors[i].sort()
    return ancestors

Time Complexity: O(n^3) in the worst case. For each of the n nodes, we potentially perform a DFS/BFS from all other n nodes, and each DFS/BFS can take O(n) time.

Space Complexity: O(n^2) in the worst case. The graph representation can take O(E) space, where E is the number of edges, but in the worst case, the ancestor list for each node could contain up to n-1 elements, and we store this list for n nodes, leading to O(n^2) space.

Optimal Solution: Using Topological Sort and Bit Manipulation or Sets

We can improve the time complexity by using topological sort and sets to store ancestors.

Algorithm:

  1. Build the Graph and In-degree Array: Create an adjacency list representation of the graph and compute the in-degree of each node.
  2. Topological Sort: Perform a topological sort of the graph. This can be done using a queue. Start by adding all nodes with an in-degree of 0 to the queue. While the queue is not empty:
    • Dequeue a node u.
    • For each neighbor v of u:
      • Add all ancestors of u to the ancestors of v.
      • Add u to the ancestors of v.
      • Decrement the in-degree of v.
      • If the in-degree of v becomes 0, add v to the queue.
  3. Sort the Ancestor Sets: After the topological sort, sort the ancestor set for each node in ascending order.
  4. Return the Result: Return the list of sorted ancestor sets.

Code (Python):

from collections import deque

def get_ancestors_optimal(n, edges):
    graph = [[] for _ in range(n)]
    in_degree = [0] * n
    ancestors = [set() for _ in range(n)]

    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

    queue = deque([i for i in range(n) if in_degree[i] == 0])

    while queue:
        u = queue.popleft()
        for v in graph[u]:
            ancestors[v].add(u)
            ancestors[v].update(ancestors[u])
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)

    result = [sorted(list(ancestors[i])) for i in range(n)]
    return result

Time Complexity: O(n + E), where n is the number of nodes and E is the number of edges. Building the graph takes O(E) time. The topological sort visits each node and edge once, taking O(n + E) time. Sorting the ancestors lists takes O(n * k * log k) time, where k is the average number of ancestors per node. However, in most cases, n+E will dominate n * k * log k.

Space Complexity: O(n + E). The graph representation takes O(E) space. The in_degree array takes O(n) space. The ancestors sets take O(n^2) space in the worst case, but on average, they will take less space. The queue takes O(n) space in the worst case.

Edge Cases:

  • Empty Graph (n = 0): Should return an empty list.
  • No Edges: Each node has no ancestors, so the result should be a list of empty lists.
  • Single Node Graph: The result should be a list containing an empty list (since the node has no ancestors).
  • Disconnected Graph: The algorithm should correctly identify ancestors within each connected component.
  • Large Graph: The algorithm should be efficient enough to handle graphs with up to 1000 nodes and 2000 edges, as specified in the constraints.