Taro Logo

Graph Connectivity With Threshold

Hard
Asked by:
Profile picture
12 views
Topics:
ArraysGraphs

We have n cities labeled from 1 to n. Two different cities with labels x and y are directly connected by a bidirectional road if and only if x and y share a common divisor strictly greater than some threshold. More formally, cities with labels x and y have a road between them if there exists an integer z such that all of the following are true:

  • x % z == 0,
  • y % z == 0, and
  • z > threshold.

Given the two integers, n and threshold, and an array of queries, you must determine for each queries[i] = [ai, bi] if cities ai and bi are connected directly or indirectly. (i.e. there is some path between them).

Return an array answer, where answer.length == queries.length and answer[i] is true if for the ith query, there is a path between ai and bi, or answer[i] is false if there is no path.

Example 1:

Input: n = 6, threshold = 2, queries = [[1,4],[2,5],[3,6]]
Output: [false,false,true]
Explanation: The divisors for each number:
1:   1
2:   1, 2
3:   1, 3
4:   1, 2, 4
5:   1, 5
6:   1, 2, 3, 6
Using the underlined divisors above the threshold, only cities 3 and 6 share a common divisor, so they are the
only ones directly connected. The result of each query:
[1,4]   1 is not connected to 4
[2,5]   2 is not connected to 5
[3,6]   3 is connected to 6 through path 3--6

Example 2:

Input: n = 6, threshold = 0, queries = [[4,5],[3,4],[3,2],[2,6],[1,3]]
Output: [true,true,true,true,true]
Explanation: The divisors for each number are the same as the previous example. However, since the threshold is 0,
all divisors can be used. Since all numbers share 1 as a divisor, all cities are connected.

Example 3:

Input: n = 5, threshold = 1, queries = [[4,5],[4,5],[3,2],[2,3],[3,4]]
Output: [false,false,false,false,false]
Explanation: Only cities 2 and 4 share a common divisor 2 which is strictly greater than the threshold 1, so they are the only ones directly connected.
Please notice that there can be multiple queries for the same pair of nodes [x, y], and that the query [x, y] is equivalent to the query [y, x].

Constraints:

  • 2 <= n <= 104
  • 0 <= threshold <= n
  • 1 <= queries.length <= 105
  • queries[i].length == 2
  • 1 <= ai, bi <= cities
  • ai != bi

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of values for `n` and `threshold`? Can `threshold` ever be greater than or equal to `n`?
  2. Are the node labels guaranteed to be in the range of 1 to `n` inclusive, or could they be outside that range?
  3. Is the graph undirected, meaning that if city `i` and city `j` are connected, then city `j` and city `i` are also connected?
  4. If all cities are isolated (no connections with the given threshold), what should the return value be?
  5. Is the threshold inclusive or exclusive? Meaning, should I include the greatest common divisor being equal to the threshold or strictly greater than the threshold?

Brute Force Solution

Approach

To solve this graph connectivity problem, the brute force method is like checking if every pair of cities are connected via roads with acceptable speed limits. We will simply check every possible path between cities to see if a connection exists that meets the requirements.

Here's how the algorithm would work step-by-step:

  1. Consider each pair of cities as the start and end points you want to connect.
  2. For each pair of cities, explore every single possible route between them.
  3. When exploring a route, only consider roads that have a speed limit above the given minimum threshold.
  4. If a route between two cities consists only of roads that meet the speed limit, then the two cities are connected.
  5. After checking every possible route between every pair of cities, you will know which cities are connected and which are not.

Code Implementation

def graph_connectivity_brute_force(number_of_cities, speed_limits, required_speed): 
    connected = [[False] * number_of_cities for _ in range(number_of_cities)]

    # Iterate through all possible pairs of cities
    for start_city in range(number_of_cities):
        for end_city in range(start_city + 1, number_of_cities):
            # Explore paths only if start and end are different
            if start_city != end_city:
                if find_path(start_city, end_city, number_of_cities, speed_limits, required_speed, [start_city]):
                    connected[start_city][end_city] = True
                    connected[end_city][start_city] = True

    return connected

def find_path(current_city, end_city, number_of_cities, speed_limits, required_speed, visited):
    # If current city is the destination, path exists
    if current_city == end_city:
        return True

    # Explore neighbors with acceptable speed limits
    for neighbor_city in range(number_of_cities):
        #Check that the neighbor city is not in our current path
        if neighbor_city not in visited:

            speed = speed_limits[current_city][neighbor_city]
            
            # Road must exist and satisfy the speed limit
            if speed > required_speed:
                # Recursive DFS to check the other neighbor cities
                if find_path(neighbor_city, end_city, number_of_cities, speed_limits, required_speed, visited + [neighbor_city]):
                    return True
    return False

Big(O) Analysis

Time Complexity
O(n² * n!)The brute force approach considers all possible pairs of cities, which is O(n²). For each pair of cities, it explores every possible route between them. In the worst case, this could involve checking all permutations of cities as intermediate nodes in a path. Finding all permutations of n cities takes n! time. Therefore, for each of the n² pairs, we have to explore n! possible paths. The total time complexity is O(n² * n!).
Space Complexity
O(N!)The brute force approach explores every possible route between each pair of cities. In the worst-case scenario, to explore all possible routes between two cities, the algorithm might generate all permutations of the cities. If there are N cities, the number of possible routes (permutations) can grow up to N factorial (N!). Thus, a recursion stack might be needed to keep track of the current path being explored. Additionally, a set or list to store visited cities during path exploration might be necessary to avoid cycles, which in the worst case could store all N cities. Therefore, the space complexity would be dominated by the factorial nature of exploring every route, leading to O(N!).

Optimal Solution

Approach

The core idea is to efficiently determine if two cities are connected in a hidden graph. We use a technique that groups cities together into connected regions and then quickly checks if two cities belong to the same region.

Here's how the algorithm would work step-by-step:

  1. Imagine each city starts in its own separate group.
  2. Consider all possible thresholds (the minimum connection strength) between cities.
  3. For a given threshold, connect all pairs of cities whose greatest common divisor is greater than the threshold. This creates bigger and bigger groups of connected cities.
  4. Keep track of which cities belong to which group as you consider different thresholds. An efficient way to do this is by having each city point to a 'representative' city within its group. Initially, each city points to itself.
  5. When connecting cities, make sure to combine the groups they belong to. You can do this by making the representative of one group point to the representative of the other group. To find the representative of a city, keep following the pointers until you reach a city that points to itself.
  6. After processing all possible thresholds, to check if two cities are connected given a specific threshold, find the representative of each city. If their representatives are the same, they belong to the same connected region and are therefore connected.

Code Implementation

def solve():
    def are_connected(number_of_cities, threshold, city_a, city_b):
        parent = list(range(number_of_cities + 1))

        def find(city):
            if parent[city] == city:
                return city
            parent[city] = find(parent[city])
            return parent[city]

        def union(city_x, city_y):
            root_x = find(city_x)
            root_y = find(city_y)
            if root_x != root_y:
                parent[root_x] = root_y

        # Iterate through all city pairs.
        for first_city in range(1, number_of_cities + 1):
            for second_city in range(first_city + 1, number_of_cities + 1):
                # Connect cities if their GCD exceeds threshold.
                if gcd(first_city, second_city) > threshold:
                    union(first_city, second_city)

        # Check connectivity after union operations.
        return find(city_a) == find(city_b)

    def gcd(first_number, second_number):
        while second_number:
            first_number, second_number = second_number, first_number % second_number
        return first_number

    number_of_cities_input = 6
    threshold_input = 2
    queries_input = [[1, 4], [2, 5], [3, 6]]
    
    results = []
    for query in queries_input:
        city_a_query, city_b_query = query
        results.append(are_connected(number_of_cities_input, threshold_input, city_a_query, city_b_query))

    return results

def graph_connectivity_with_threshold(number_of_cities, threshold, queries):
    parent = list(range(number_of_cities + 1))

    def find(city):
        if parent[city] == city:
            return city
        parent[city] = find(parent[city])
        return parent[city]

    def union(city_x, city_y):
        root_x = find(city_x)
        root_y = find(city_y)
        if root_x != root_y:
            parent[root_x] = root_y

    # Iterate through all city pairs.
    for first_city in range(1, number_of_cities + 1):
        for second_city in range(first_city + 1, number_of_cities + 1):
            # Connect cities if their GCD exceeds threshold.
            if gcd(first_city, second_city) > threshold:
                union(first_city, second_city)

    results = []
    for city_a, city_b in queries:
        # Check connectivity after union operations.
        results.append(find(city_a) == find(city_b))

    return results

def gcd(first_number, second_number):
    while second_number:
        first_number, second_number = second_number, first_number % second_number
    return first_number

Big(O) Analysis

Time Complexity
O(n log n + q * alpha(n))The algorithm's complexity is driven by two main parts: finding the greatest common divisor (GCD) for each pair of cities with a threshold greater than 0, and then performing the union-find operations for the connectivity queries. The GCD computation takes O(log n) time per pair of cities, and since we consider all pairs initially, this contributes O(n * n * log n) without optimization. However, the code provided likely does not compute GCDs of *all* city pairs -- it only computes GCDs necessary for establishing connections at or above the threshold. Also, we optimize GCD computations to only happen once during Union operations. The Union-Find operations involve finding the representative of each city and merging the groups. Using path compression and union by rank, the amortized time complexity for each Union-Find operation is O(alpha(n)), where alpha(n) is the inverse Ackermann function, which grows extremely slowly and is practically constant. For 'q' connectivity queries, the Union-Find part takes O(q * alpha(n)). The GCD calculation to initiate the Union is done over all numbers, where GCD has time complexity log(n), so with some optimizations, the construction of connected components is closer to O(n log n) since we consider divisors. Therefore, the overall time complexity is dominated by initial setup O(n log n) + O(q * alpha(n)).
Space Complexity
O(N)The algorithm uses a data structure to represent the connected components of cities. In the worst case, each city starts in its own separate group and a parent array of size N is used, where N is the number of cities, to point to the representative city of each group. Thus the algorithm requires O(N) auxiliary space to store this information. The representative of each city is tracked, needing space proportional to the number of cities, N.

Edge Cases

n is 0 or 1
How to Handle:
Return an empty list if n is 0 or 1, as no edges can be formed.
threshold is greater than n-1
How to Handle:
Return an empty list if the threshold is greater than or equal to n, as no edges can satisfy the condition.
threshold is 0
How to Handle:
Return all possible edges, as all pairs will satisfy the threshold condition.
n is a large number (scalability)
How to Handle:
Union-Find is efficient and suitable for large n, but the number of edges could be n^2 in worst case (threshold = 0), potentially causing memory issues depending on output format.
Integer overflow when calculating gcd
How to Handle:
Use a library gcd function that handles potential overflow or ensure the inputs to the gcd function are appropriately sized to avoid overflow.
All numbers from 1 to n have a gcd greater than the threshold.
How to Handle:
Return all possible edges because all pairs are connected.
No two numbers share a gcd greater than the threshold.
How to Handle:
Return an empty list, as no edges can be formed.
n and threshold are close, only a few edges exist
How to Handle:
The Union-Find data structure should efficiently determine connected components even with a sparse graph.