Taro Logo

Longest Cycle in a Graph

Medium
23 days ago

You are given a directed graph of n nodes numbered from 0 to n - 1, where each node has at most one outgoing edge. The graph is represented with a given 0-indexed array edges of size n, indicating that there is a directed edge from node i to node edges[i]. If there is no outgoing edge from node i, then edges[i] == -1. Return the length of the longest cycle in the graph. If no cycle exists, return -1. A cycle is a path that starts and ends at the same node.

For example, consider the following graph represented by edges = [3,3,4,2,3]. The longest cycle in the graph is the cycle: 2 -> 4 -> 3 -> 2. The length of this cycle is 3, so 3 is returned.

As another example, consider edges = [2,-1,3,1]. In this case, there are no cycles in this graph, so you would return -1.

Write a function to find the longest cycle in a directed graph. Your solution should be efficient for graphs with up to 10^5 nodes. Describe the time and space complexity of your solution.

Sample Answer
## Naive Approach

A brute-force approach would involve iterating through each node and performing a Depth-First Search (DFS) to detect cycles. For each node, we start a DFS, keeping track of the nodes visited in the current path. If we encounter a node already visited in the current path, we have found a cycle. We record the length of the cycle and continue the process for all nodes.

```python
def longestCycle_naive(edges):
    n = len(edges)
    max_cycle_length = -1

    for start_node in range(n):
        visited = [False] * n
        recursion_stack = [False] * n
        path_length = 0
        current_node = start_node

        while current_node != -1 and not visited[current_node]:
            visited[current_node] = True
            recursion_stack[current_node] = True
            path_length += 1

            next_node = edges[current_node]

            if next_node != -1:
                if recursion_stack[next_node]:
                    cycle_length = path_length
                    temp_node = current_node
                    count = 1
                    while temp_node != next_node:
                        temp_node = edges[temp_node]
                        count += 1
                    max_cycle_length = max(max_cycle_length, count)
                    break

            current_node = next_node

        # Reset recursion stack for the next DFS
        for i in range(n):
            recursion_stack[i] = False

    return max_cycle_length

Optimal Approach

To optimize the solution, we can use a modified DFS that keeps track of the path length during traversal. We maintain a visited array and a path_length dictionary. When we revisit a node that's already in the current path, we've found a cycle. The length of the cycle is the difference between the current path length and the path length of the revisited node.

def longestCycle(edges):
    n = len(edges)
    visited = [False] * n
    max_cycle_length = -1

    for start_node in range(n):
        if not visited[start_node]:
            path_length = {}
            stack = [(start_node, 0)]  # (node, distance)

            while stack:
                node, distance = stack.pop()

                if visited[node]:
                    continue
                
                visited[node] = True
                path_length[node] = distance
                
                neighbor = edges[node]

                if neighbor != -1:
                    if neighbor in path_length:
                        cycle_length = distance + 1 - path_length[neighbor]
                        max_cycle_length = max(max_cycle_length, cycle_length)
                    elif not visited[neighbor]:
                        stack.append((neighbor, distance + 1))

    return max_cycle_length

Example

Consider the input edges = [3,3,4,2,3]. The algorithm will iterate through each node.

  • Starting from node 0, the path is 0 -> 3. No cycle found.
  • Starting from node 1, the path is 1 -> 3. No cycle found.
  • Starting from node 2, the path is 2 -> 4 -> 3 -> 2. A cycle is found (2 -> 4 -> 3 -> 2) with length 3.
  • Starting from node 3, the path is 3 -> 3. A cycle is found (3 -> 3) with length 1.
  • Starting from node 4, the path is 4 -> 3. No cycle found.

The longest cycle has length 3.

Big(O) Time Complexity Analysis

The time complexity of the optimal solution is O(N), where N is the number of nodes in the graph. This is because each node is visited at most once during the DFS traversal. The visited array ensures that we don't process a node more than once. Although there's a loop inside the DFS to construct the cycle length, the total number of iterations across all calls to DFS remains proportional to N.

Big(O) Space Complexity Analysis

The space complexity of the optimal solution is O(N). This is due to the visited array, path_length dictionary, and the stack used in DFS. In the worst-case scenario, the stack can contain all the nodes if the graph is a single path, and the path_length dictionary can store the distances for each node visited.

Edge Cases and Considerations

  1. Disconnected graph: The graph may consist of multiple disconnected components. The algorithm handles this by iterating through all nodes and starting a DFS from each unvisited node.
  2. No cycles: If the graph contains no cycles, the algorithm correctly returns -1.
  3. Self-loops: If a node points to itself (e.g., edges[i] == i), it represents a cycle of length 1. The algorithm correctly identifies this.
  4. Cycles of different lengths: The graph may contain multiple cycles of different lengths. The algorithm finds the longest cycle among them.
  5. Large graph: For very large graphs (e.g., N > 10^5), the algorithm's space complexity should be considered, as the visited array and path_length dictionary consume O(N) space.