Taro Logo

Find Eventual Safe States

Medium
Google logo
Google
2 views
Topics:
GraphsDynamic Programming

There is a directed graph of n nodes with each node labeled from 0 to n - 1. The graph is represented by a 0-indexed 2D integer array graph where graph[i] is an integer array of nodes adjacent to node i, meaning there is an edge from node i to each node in graph[i].

A node is a terminal node if there are no outgoing edges. A node is a safe node if every possible path starting from that node leads to a terminal node (or another safe node).

Return an array containing all the safe nodes of the graph. The answer should be sorted in ascending order.

Example 1:

graph = [[1,2],[2,3],[5],[0],[5],[],[]]

Output: [2,4,5,6]

Example 2:

graph = [[1,2,3,4],[1,2],[3,4],[0,4],[]]

Output: [4]

Could you provide a solution to identify and return all the safe nodes in a given directed graph?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of values for the nodes in the graph, and what is the maximum number of nodes?
  2. Can there be self-loops (an edge from a node to itself) or multiple edges between two nodes?
  3. If a node has no outgoing edges, is it considered a safe node?
  4. Is the graph guaranteed to be a directed acyclic graph (DAG), or could there be cycles?
  5. What should I return if the input graph is empty or null?

Brute Force Solution

Approach

The goal is to find which locations are 'safe', meaning there's no way to get stuck in a cycle. The brute force approach is basically to explore every possible path from each location to see where it leads.

Here's how the algorithm would work step-by-step:

  1. Pick a starting location.
  2. Follow one of the paths leading from that location.
  3. Keep following the path. If you ever come back to a location you've already visited on *this* path, you're in a cycle, so this starting location is unsafe.
  4. If the path eventually leads to a location with *no* outgoing paths, that's a safe end point. The original starting location is safe too.
  5. If there's another unexplored path from your starting location, try that one instead.
  6. Do this for every path from every starting location. If, no matter which path you take, a starting location only leads to safe end points, it's a safe location.
  7. Repeat the whole process for every possible starting location, marking safe and unsafe locations along the way.
  8. In the end, report all the locations we found that are 'safe'.

Code Implementation

def find_eventual_safe_states_brute_force(graph):
    number_of_nodes = len(graph)
    safe_nodes = []

    for start_node in range(number_of_nodes):
        if is_node_safe(graph, start_node, set()):
            safe_nodes.append(start_node)

    return sorted(safe_nodes)

def is_node_safe(graph, current_node, visited_nodes):
    if current_node in visited_nodes:
        return False

    if not graph[current_node]:
        return True

    visited_nodes.add(current_node)

    # Iterate through all outgoing edges
    for neighbor_node in graph[current_node]:
        # If the neighbor is not safe, current is also not safe
        if not is_node_safe(graph, neighbor_node, visited_nodes.copy()):
            return False

    # If all outgoing edges are safe, current is also safe
    return True

Big(O) Analysis

Time Complexity
O(n * 2^n)For each of the n nodes, we explore every possible path. In the worst case, a graph can have exponential number of paths. Specifically, in a graph where each node points to every other node, exploring all paths from one node can lead to traversing nearly every node multiple times, resulting in a runtime proportional to 2^n for each starting node. Since we repeat this process for all n starting nodes, the total time complexity becomes O(n * 2^n). This happens because each node might lead to an exponential number of paths before determining whether it is safe or not.
Space Complexity
O(N)The algorithm uses recursion to explore paths from each location. In the worst-case scenario, the recursion depth could reach N, where N is the number of nodes in the graph, leading to N stack frames. Additionally, to detect cycles, the algorithm keeps track of visited locations along the current path. In the worst case, this 'visited' set could also contain up to N nodes. Therefore, the auxiliary space complexity is O(N) due to the recursion stack and the visited set.

Optimal Solution

Approach

The core idea is to find the 'bad' nodes first: the ones that will definitely lead to an endless loop. Then, we mark all nodes that can reach these 'bad' nodes as also 'bad'. Finally, any node not marked as 'bad' is 'safe'.

Here's how the algorithm would work step-by-step:

  1. Think of the connections between nodes as roads. Some roads lead to a dead end (safe node), and some lead to a cycle (unsafe node).
  2. First, find all nodes that are part of a cycle. Any node in a cycle is inherently unsafe because you can keep going around and around forever.
  3. Next, consider nodes that can reach these cycle nodes. If you can get to a cycle, you are also unsafe because you might get stuck in it.
  4. Keep checking for nodes that can reach the unsafe nodes until you can't find any more. This is like tracking the 'unsafe' influence.
  5. Finally, any node that hasn't been labeled as unsafe is safe. These are the nodes that only lead to dead ends, not cycles.

Code Implementation

def eventualSafeNodes(graph):
    number_of_nodes = len(graph)
    node_states = [0] * number_of_nodes  # 0: unvisited, 1: visiting, 2: safe

    safe_nodes = []

    def depthFirstSearch(node):
        if node_states[node] == 1:
            return False  # Cycle detected

        if node_states[node] == 2:
            return True  # Already marked as safe

        node_states[node] = 1  # Mark as currently visiting

        for neighbor in graph[node]:
            if not depthFirstSearch(neighbor):
                return False

        node_states[node] = 2  # Mark as safe after visiting all neighbors
        return True

    # Iterate through all nodes to find safe nodes.
    for node in range(number_of_nodes):
        if depthFirstSearch(node):
            safe_nodes.append(node)

    # Return a sorted list of safe nodes.
    return sorted(safe_nodes)

Big(O) Analysis

Time Complexity
O(n + e)The algorithm first identifies nodes in cycles using a Depth-First Search (DFS) approach. In the worst case, the DFS visits all nodes (n) and edges (e) of the graph. Subsequently, it iterates to find nodes that can reach these cycle nodes. This process may also involve traversing a significant portion of the graph, potentially visiting nodes and edges again. Therefore, the overall time complexity is dominated by the graph traversal, which is O(n + e), where n represents the number of nodes and e represents the number of edges.
Space Complexity
O(N)The algorithm uses auxiliary space to keep track of 'bad' (unsafe) nodes, which can be stored in a boolean array or similar data structure of size N, where N is the number of nodes in the graph. Furthermore, during the cycle detection and reachability analysis, a recursion stack (in DFS implementations) or a queue (in BFS implementations) might grow up to the size of N in the worst-case scenario (e.g., when the entire graph is a single path or needs to be visited). Thus, the algorithm requires O(N) space to store information about each node and track the traversal. Therefore, the overall auxiliary space complexity is O(N).

Edge Cases

CaseHow to Handle
Empty graph (no nodes)Return an empty list as there are no states to evaluate.
Graph with only one node and no outgoing edgesThe single node is a safe state, add it to the result.
Graph with a single node and a self-loopThis node is not a safe state and should not be included in the result.
Graph with a cycle of nodes that cannot reach a terminal nodeThese nodes are not safe and the algorithm must correctly identify cycles.
Graph where all nodes are safe statesThe algorithm should return all nodes in sorted order.
Graph where no nodes are safe states (all part of cycles leading to non-terminal nodes)The algorithm should return an empty list.
Graph with a large number of nodes and edges, potentially leading to stack overflow with recursive DFS implementationsImplement iteratively with explicit stack, or ensure recursion depth is managed to prevent overflow.
Graph contains nodes with very high out-degreesThe solution should efficiently iterate over neighbors to determine safety.