Taro Logo

Redundant Connection

Medium
Asked by:
Profile picture
Profile picture
Profile picture
Profile picture
+3
More companies
Profile picture
Profile picture
Profile picture
16 views
Topics:
Graphs

In this problem, a tree is an undirected graph that is connected and has no cycles.

You are given a graph that started as a tree with n nodes labeled from 1 to n, with one additional edge added. The added edge has two different vertices chosen from 1 to n, and was not an edge that already existed. The graph is represented as an array edges of length n where edges[i] = [ai, bi] indicates that there is an edge between nodes ai and bi in the graph.

Return an edge that can be removed so that the resulting graph is a tree of n nodes. If there are multiple answers, return the answer that occurs last in the input.

Example 1:

Input: edges = [[1,2],[1,3],[2,3]]
Output: [2,3]

Example 2:

Input: edges = [[1,2],[2,3],[3,4],[1,4],[1,5]]
Output: [1,4]

Constraints:

  • n == edges.length
  • 3 <= n <= 1000
  • edges[i].length == 2
  • 1 <= ai < bi <= edges.length
  • ai != bi
  • There are no repeated edges.
  • The given graph is connected.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of values for the nodes in the graph? Can they be negative or zero?
  2. If there are multiple redundant connections, should I return the one that appears last in the input array?
  3. Will the input graph always be fully connected (i.e., will removing the redundant edge disconnect the graph)?
  4. Is the input guaranteed to represent a valid graph structure (e.g., no self-loops or parallel edges besides the redundant one)?
  5. If the input is empty or null, what should I return?

Brute Force Solution

Approach

The core idea is to go through all possibilities. We remove each connection one by one and check if removing it creates a graph that's still fully connected. If it is, then that connection is the redundant one.

Here's how the algorithm would work step-by-step:

  1. Start with the last connection in the list.
  2. Temporarily remove that connection from the graph.
  3. Check if the graph is still fully connected after removing the connection. This means can you still reach every part of the graph from any starting point.
  4. If the graph is no longer fully connected, put the connection back.
  5. Move to the next-to-last connection in the list and repeat steps two through four.
  6. Keep doing this, moving backwards through the list of connections, until you find a connection that, when removed, leaves the graph fully connected.
  7. The first such connection that you find (going backwards through the list) is the redundant connection you're looking for.

Code Implementation

def find_redundant_connection(edges):
    number_of_edges = len(edges)
    for edge_to_remove_index in range(number_of_edges - 1, -1, -1):
        # Create a copy of the edges to avoid modifying the original
        temp_edges = edges[:edge_to_remove_index] + edges[edge_to_remove_index + 1:]
        
        number_of_nodes = len(edges) + 1
        adjacent_list = [[] for _ in range(number_of_nodes)]
        for node1, node2 in temp_edges:
            adjacent_list[node1].append(node2)
            adjacent_list[node2].append(node1)
        
        # Check connectivity using Depth First Search
        visited = [False] * number_of_nodes
        def depth_first_search(node):
            visited[node] = True
            for neighbor in adjacent_list[node]:
                if not visited[neighbor]:
                    depth_first_search(neighbor)
        
        # Start DFS from node 1
        depth_first_search(1)

        # Check if all nodes are visited
        is_connected = all(visited[1:])
        if is_connected:
            # Found the redundant edge
            return edges[edge_to_remove_index]
    
    return None

Big(O) Analysis

Time Complexity
O(n*(n+m))The algorithm iterates through the list of edges in reverse order which is O(m) where m is the number of edges. For each edge removed, it performs a connectivity check. The connectivity check can be done using Depth First Search (DFS) or Breadth First Search (BFS). In the worst case, DFS or BFS can visit all the nodes in the graph, so this takes O(n) time, where n is the number of nodes. Each of these n nodes can have at most n neighbors, and each of these m edges can be checked for connectivity, which gives us O(m*n). Therefore, the total time complexity is O(m * n) * O(1) because we check a maximum of 'm' edges. This simplifies to O(n*(n+m)) because 'm' can at most be 'n'. Therefore, for a fully connected graph, m is nearly n squared, but the algorithm uses the simpler connectivity check approach for each possible redundant edge.
Space Complexity
O(N)The provided solution's space complexity stems primarily from the 'check if the graph is still fully connected' step, which necessitates tracking visited nodes. A common approach for this, such as using Depth First Search or Breadth First Search, employs a visited set or array to prevent cycles, and this structure can grow up to the number of nodes in the graph. Since each connection in the input represents nodes, in the worst case where all nodes are connected, the number of nodes is related to the number of connections, which we can represent as N. Thus, the space used for the visited set/array is proportional to N, leading to a space complexity of O(N).

Optimal Solution

Approach

We can use a structure that keeps track of different groups or sets, where each item in the problem initially starts in its own group. When we encounter a line, we check if the items connected by that line are already in the same group. If they are, that line is redundant and can be removed.

Here's how the algorithm would work step-by-step:

  1. Start by considering each item as being in its own separate group.
  2. Go through the lines one by one.
  3. For each line, check if the two items connected by that line are already in the same group.
  4. If the two items are already in the same group, it means adding this line would create a cycle and is therefore redundant; save this line.
  5. If the two items are in different groups, merge their groups together to show they are now connected.
  6. Continue this process until all lines have been examined.
  7. The last redundant line you saved is the one you should return.

Code Implementation

def find_redundant_connection(edges):
    number_of_nodes = len(edges)
    parent = list(range(number_of_nodes + 1))

    def find(node):
        if parent[node] != node:
            parent[node] = find(parent[node])
        return parent[node]

    def union(node_one, node_two):
        root_one = find(node_one)
        root_two = find(node_two)
        if root_one != root_two:
            parent[root_one] = root_two
            return True
        return False

    redundant_edge = None
    for edge in edges:
        node_one, node_two = edge[0], edge[1]
        # If nodes are already in same set,
        # then this edge is redundant.
        if not union(node_one, node_two):
            redundant_edge = edge

    # Return the last redundant edge found.
    return redundant_edge

Big(O) Analysis

Time Complexity
O(nα(n))The algorithm iterates through each of the n edges in the input array. For each edge, it performs two find operations and potentially a union operation using a disjoint set data structure (also known as union-find). The find and union operations using path compression and union by rank have an amortized time complexity of α(n), where α(n) is the inverse Ackermann function, which grows extremely slowly and is practically constant for all realistic input sizes. Therefore, the overall time complexity is O(nα(n)), which is very close to O(n).
Space Complexity
O(N)The algorithm uses a data structure (group/set representation) to keep track of which items belong to the same group. In the worst case, each item initially belongs to its own group, so we might need to store group information for each of the N nodes in the graph, where N is the number of nodes represented in the edges. Merging groups involves updating this group information, which operates on a data structure proportional to N. Therefore, the auxiliary space used is O(N).

Edge Cases

CaseHow to Handle
Empty input graph (no edges)Return an empty list or null since there's no redundant connection.
Graph with only two nodes and one edgeReturn the given edge as there can be no cycle with only one edge.
Graph with no redundant connection (already a tree)If the algorithm doesn't find a cycle, return an empty list or null indicating no redundant edge.
Large number of nodes, approaching memory limitsEnsure the chosen data structure (e.g., Disjoint Set Union) scales well and doesn't cause memory overflow by considering alternative representations for very large graphs.
Nodes are not numbered sequentially starting from 1Map node names to sequential indices within the algorithm, then reverse the mapping in the result.
Multiple possible redundant edges; which one to return?The problem statement defines the order of edges in the input determines which edge is redundant.
Integer overflow when calculating union or find operations if node values are excessively largeConsider using a different data type (e.g., long) to store node values or implement a hash-based solution if node values are arbitrarily large and sparse.
Input contains self-loops (edge from a node to itself)The Disjoint Set Union solution handles this case correctly as the find operation will detect it as a cycle.