Taro Logo

Critical Connections in a Network

Hard
Amazon logo
Amazon
4 views
Topics:
Graphs

You are given n servers numbered from 0 to n - 1 connected by undirected server-to-server connections forming a network where connections[i] = [a<sub>i</sub>, b<sub>i</sub>] represents a connection between servers a<sub>i</sub> and b<sub>i</sub>. Any server can reach other servers directly or indirectly through the network.

A critical connection is a connection that, if removed, will make some servers unable to reach some other server.

Return all critical connections in the network in any order.

Example 1:

Consider a network of 4 servers with the following connections: n = 4, connections = [[0,1],[1,2],[2,0],[1,3]]

In this case, the critical connection is [[1,3]]. Removing this connection would disconnect server 3 from the rest of the network. Note that [[3,1]] is also an acceptable answer.

Example 2:

Consider a network of 2 servers with a single connection: n = 2, connections = [[0,1]]

In this case, the critical connection is [[0,1]]. Removing this connection would disconnect server 0 from server 1.

Write a function that takes the number of servers n and a list of connections connections as input, and returns a list of all critical connections in the network.

Your solution should be efficient and handle various network configurations. Consider edge cases such as empty networks, single connections, and fully connected networks.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of values for 'n' (number of servers), and for the server IDs 'u' and 'v' in connections[i]?
  2. Can the input 'connections' contain duplicate connections (e.g., [1, 2] and [2, 1] both present), and if so, how should I handle them?
  3. Is the graph guaranteed to be connected initially, or might there be disconnected components?
  4. If there are no critical connections in the graph, what should I return (e.g., an empty list, null)?
  5. Is there a specific ordering required for the list of critical connections in the output (e.g., sorted by 'u', then 'v')?

Brute Force Solution

Approach

Imagine a network of computers, and we want to find the critical connections. The brute force approach involves systematically breaking each connection one at a time and checking if the network remains fully connected. If removing a connection disconnects the network, that connection is critical.

Here's how the algorithm would work step-by-step:

  1. Start with the original intact network.
  2. Consider each connection in the network, one at a time.
  3. For the current connection being considered, temporarily remove it from the network.
  4. Check if all the computers in the network can still communicate with each other after removing that connection. Essentially, see if the network is still connected.
  5. If the network is no longer connected after removing that connection, then that connection is a critical one. Mark it down.
  6. Put the connection back into the network to restore it to its original state.
  7. Repeat the process for every connection in the network.

Code Implementation

def find_critical_connections_brute_force(number_of_nodes, connections):
    critical_connections = []

    for current_connection_index in range(len(connections)):
        # Iterate through each connection to test if it is critical.
        connection_copy = connections[:]
        removed_connection = connection_copy.pop(current_connection_index)

        # Determine if the graph is still connected after removal.
        if not is_graph_connected(number_of_nodes, connection_copy):
            # If not connected, mark the connection as critical.
            critical_connections.append(removed_connection)

    return critical_connections

def is_graph_connected(number_of_nodes, connections):
    if not connections:
        return number_of_nodes <= 1

    graph = build_graph(number_of_nodes, connections)
    start_node = connections[0][0]
    visited = set()
    dfs(start_node, graph, visited)

    # Check if all nodes are visited after DFS.
    return len(visited) == number_of_nodes

def build_graph(number_of_nodes, connections):
    graph = {node: [] for node in range(number_of_nodes)}
    for node1, node2 in connections:
        graph[node1].append(node2)
        graph[node2].append(node1)
    return graph

def dfs(node, graph, visited):
    visited.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
			# Recursively explore unvisited neighbors.
            dfs(neighbor, graph, visited)

Big(O) Analysis

Time Complexity
O(E * (V + E))The algorithm iterates through each edge (E) in the network. For each edge, it removes the edge and performs a connectivity check. A connectivity check can be done using Depth-First Search (DFS) or Breadth-First Search (BFS), which takes O(V + E) time, where V is the number of vertices (computers) and E is the number of edges (connections). Since we perform this connectivity check for each of the E edges, the overall time complexity becomes O(E * (V + E)).
Space Complexity
O(N + E)The space complexity is determined by the auxiliary space used in the connectivity check after temporarily removing each edge. A common approach to check connectivity is using Depth-First Search (DFS) or Breadth-First Search (BFS), which require storing the adjacency list representation of the graph, taking O(N + E) space, where N is the number of nodes (computers) and E is the number of edges (connections). The DFS/BFS algorithm also needs a visited set/array of size N to keep track of visited nodes during the connectivity check. Therefore, the overall space complexity is O(N + E) due to the adjacency list and the visited array.

Optimal Solution

Approach

The goal is to find the essential connections in a network that, if removed, would disconnect the network. The approach involves exploring the network to understand its structure, finding those links which are the only way to get between certain parts.

Here's how the algorithm would work step-by-step:

  1. Imagine exploring the network, going from one point to another, keeping track of when you first visit each point.
  2. As you explore, keep track of the earliest point you can reach from each point, including going back through the path you just took.
  3. A connection is critical if the earliest point you can reach from one end is the point you started from on that end. This means there's no other way to reach an earlier point, making it essential.
  4. Go through each connection. If removing it would disconnect the network, then that is a critical connection.

Code Implementation

def find_critical_connections(number_of_nodes, connections):
    graph = [[] for _ in range(number_of_nodes)]
    for node_a, node_b in connections:
        graph[node_a].append(node_b)
        graph[node_b].append(node_a)

    discovery_times = [-1] * number_of_nodes
    lowest_reachable_times = [-1] * number_of_nodes
    critical_connections = []
    time = 0

    def depth_first_search(node, parent):
        nonlocal time
        discovery_times[node] = time
        lowest_reachable_times[node] = time
        time += 1

        for neighbor in graph[node]:
            if neighbor == parent:
                continue

            if discovery_times[neighbor] == -1:
                # Explore the neighbor if it hasn't been visited
                depth_first_search(neighbor, node)
                lowest_reachable_times[node] = min(
                    lowest_reachable_times[node],
                    lowest_reachable_times[neighbor],
                )

                # Check for critical connection
                if lowest_reachable_times[neighbor] > discovery_times[node]:

                    critical_connections.append([node, neighbor])

            else:
                # Update lowest reachable time if neighbor was already visited
                lowest_reachable_times[node] = min(
                    lowest_reachable_times[node],
                    discovery_times[neighbor],
                )

    # Start DFS from node 0
    depth_first_search(0, -1)

    return critical_connections

Big(O) Analysis

Time Complexity
O(V + E)The algorithm performs a Depth-First Search (DFS) traversal of the graph. The DFS explores each vertex (node) once and each edge (connection) once in order to determine the critical connections. Visiting each vertex takes O(V) time, and examining each edge during the DFS takes O(E) time. Therefore, the overall time complexity is determined by the combined cost of visiting vertices and edges, resulting in O(V + E), where V is the number of vertices and E is the number of edges in the graph.
Space Complexity
O(N)The algorithm uses auxiliary space for several data structures. First, to keep track of when each point is first visited, an array of size N (where N is the number of nodes in the network) is required. Similarly, to track the earliest reachable point from each point, another array of size N is needed. The depth-first search recursion stack can also grow up to N in the worst-case scenario, where the graph resembles a linked list. Therefore, the total auxiliary space complexity is O(N).

Edge Cases

CaseHow to Handle
Empty connections listReturn an empty list of critical connections as there are no edges to analyze.
Single server (n=1) with no connectionsReturn an empty list as there are no connections, hence no critical connections.
Fully connected graph (clique)Return an empty list as removing any single edge won't disconnect the graph.
Graph with parallel edges (multiple connections between two servers)The algorithm should handle parallel edges correctly, treating them as multiple connections between the same two nodes and only identifying bridges.
Disconnected graph (multiple components)The algorithm should identify critical connections within each connected component separately, returning all such connections.
Large number of servers and connections (scalability)Use an efficient algorithm like Tarjan's algorithm with O(V+E) time complexity to ensure it scales well.
Graph with self-loops (connection from a server to itself)Self-loops are not critical connections and should be ignored by the algorithm.
Input graph contains duplicate connections.The algorithm should remove duplicate connections to avoid processing the same edge multiple times, improving efficiency.