Taro Logo

Largest Component Size by Common Factor

Hard
Google logo
Google
2 views
Topics:
GraphsArraysGreedy AlgorithmsDynamic Programming

You are given an integer array of unique positive integers nums. Consider the following graph:

  • There are nums.length nodes, labeled nums[0] to nums[nums.length - 1],
  • There is an undirected edge between nums[i] and nums[j] if nums[i] and nums[j] share a common factor greater than 1.

Return the size of the largest connected component in the graph.

For example:

nums = [4,6,15,35] should return 4 because all numbers share a common factor. The numbers are 4, 6, 15, and 35. 4 and 6 share a common factor of 2. 4 and 15 do not. 4 and 35 do not. 6 and 15 share a common factor of 3. 6 and 35 do not. 15 and 35 share a common factor of 5. Thus all nodes connect via common factors.

nums = [20,50,9,63] should return 2 because 20 and 50 share a common factor of 10. 9 and 63 share a common factor of 9. The largest connected component is of size 2.

nums = [2,3,6,7,4,12,21,39] should return 8 because all of them connect.

How would you approach this problem? What is the most efficient way to solve this?

Solution


Brute Force Solution

A naive approach would be to iterate through all pairs of numbers in the input array nums and check if they share a common factor greater than 1. If they do, we consider them connected and build a graph. Then, we traverse the graph to find the largest connected component.

This brute-force approach has a high time complexity because, for each pair of numbers, we need to find their greatest common divisor (GCD). Finding GCD can take O(min(a, b)), where a and b are the two numbers. Constructing the graph takes O(N^2 * log(max(nums))), where N is the number of elements in nums and log(max(nums)) is the time to compute GCD. After building the graph, we would need to use either Breadth First Search (BFS) or Depth First Search (DFS) to find the connected components which takes O(N + E), where N is the number of nodes (nums.length) and E is the number of edges. This approach is not efficient for larger inputs.

  • Time Complexity: O(N^2 * log(max(nums)) + N + E), where N is the number of elements in nums.
  • Space Complexity: O(N + E), where N is the number of nodes and E is the number of edges.

Optimal Solution: Union Find

A more efficient approach involves using the Union Find (Disjoint Set Union) data structure. We first find all the prime factors for each number in the input array nums. Then, for each number, we connect it (union) with all its prime factors. This process effectively builds connected components based on shared prime factors. Finally, we iterate through the nums array and find the size of the largest connected component using the find operation of the Union Find data structure.

Algorithm:

  1. Prime Factorization: Create a function to find all prime factors of a given number. We can optimize this by only iterating up to the square root of the number.
  2. Union Find Initialization: Initialize a Union Find data structure with a size large enough to accommodate all numbers and their prime factors. Since the maximum value of nums[i] is 10^5, we can initialize the Union Find with this size.
  3. Union Operations: Iterate through the nums array. For each number:
    • Find its prime factors.
    • For each prime factor, perform a union operation between the number and the prime factor.
  4. Find Largest Component: Iterate through the nums array again. For each number:
    • Find the representative (root) of its set using the find operation.
    • Maintain a count of the sizes of each connected component.
    • The maximum count will be the size of the largest connected component.

Code Implementation:

class UnionFind:
    def __init__(self, size):
        self.root = list(range(size))
        self.size = [1] * size

    def find(self, x):
        if x != self.root[x]:
            self.root[x] = self.find(self.root[x])  # Path compression
        return self.root[x]

    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
        if root_x != root_y:
            if self.size[root_x] < self.size[root_y]:
                root_x, root_y = root_y, root_x
            self.root[root_y] = root_x
            self.size[root_x] += self.size[root_y]


def largestComponentSize(nums):
    max_val = max(nums)
    uf = UnionFind(max_val + 1)

    for num in nums:
        i = 2
        temp = num
        while i * i <= temp:
            if temp % i == 0:
                uf.union(num, i)
                while temp % i == 0:
                    temp //= i
            i += 1
        if temp > 1:
            uf.union(num, temp)

    component_sizes = {}
    max_size = 0
    for num in nums:
        root = uf.find(num)
        component_sizes[root] = component_sizes.get(root, 0) + 1
        max_size = max(max_size, component_sizes[root])

    return max_size


# Example usage
nums1 = [4, 6, 15, 35]
print(largestComponentSize(nums1))  # Output: 4

nums2 = [20, 50, 9, 63]
print(largestComponentSize(nums2))  # Output: 2

nums3 = [2, 3, 6, 7, 4, 12, 21, 39]
print(largestComponentSize(nums3))  # Output: 8

Explanation:

  1. UnionFind Class: This class implements the Union Find data structure with path compression for optimization.
  2. Prime Factorization: The code iterates through each number num in the input array and finds its prime factors. Instead of checking every number up to num, it checks up to the square root of num to optimize the prime factorization process.
  3. Union Operations: For each prime factor found, it performs a union operation between the number and the prime factor, connecting them in the Union Find data structure.
  4. Largest Component: After connecting all numbers with their prime factors, the code iterates through the input array again. It finds the root (representative) of each number's connected component using the find operation. It then counts the size of each component and determines the largest one.

Big O Analysis:

  • Time Complexity: O(N * sqrt(M) + N * α(N)), where N is the number of elements in nums, M is the maximum value in nums, and α is the inverse Ackermann function (which grows very slowly and can be considered almost constant). The sqrt(M) comes from the prime factorization, and α(N) comes from the Union Find operations with path compression.
  • Space Complexity: O(M), where M is the maximum value in nums, due to the size of the Union Find data structure.

Edge Cases:

  • Input with small numbers: The algorithm handles small numbers efficiently because the prime factorization will quickly find the factors.
  • Input with prime numbers: If an input contains a prime number, it will be unioned with itself. Other numbers can still connect to it if they share that prime factor.
  • Input with no common factors: If the input array contains numbers that do not share any common factors, the algorithm will correctly identify that the largest connected component has a size of 1.
  • Large input numbers: Since the Union Find is initialized with the maximum value in nums, very large numbers could lead to a larger memory footprint, but the algorithm remains correct.

This optimized solution provides a significant improvement in time complexity compared to the brute-force approach, making it suitable for larger input sizes.