You are given an array points
representing integer coordinates of some points on a 2D-plane, where points[i] = [x_i, y_i]
. The cost of connecting two points [x_i, y_i]
and [x_j, y_j]
is the manhattan distance between them: |x_i - x_j| + |y_i - y_j|
, where |val|
denotes the absolute value of val
.
Return the minimum cost to make all points connected. All points are connected if there is exactly one simple path between any two points.
Example 1:
Input: points = [[0,0],[2,2],[3,10],[5,2],[7,0]]
Output: 20
Explanation: We can connect the points as shown above to get the minimum cost of 20.
Notice that there is a unique path between every pair of points.
Example 2:
Input: points = [[3,12],[-2,5],[-4,1]]
Output: 18
Can you provide an algorithm to efficiently solve this problem? Consider edge cases and analyze time and space complexity.
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The goal is to connect all points with the lowest possible total cost. The brute force approach considers every possible way to connect the points, calculating the cost of each way, and then finds the minimum cost among all of them.
Here's how the algorithm would work step-by-step:
def min_cost_connect_all_points_brute_force(points):
number_of_points = len(points)
# Calculate the manhattan distance between all pairs of points
all_edges = []
for i in range(number_of_points):
for j in range(i + 1, number_of_points):
distance = abs(points[i][0] - points[j][0]) + abs(points[i][1] - points[j][1])
all_edges.append((i, j, distance))
minimum_cost = float('inf')
# Iterate through all possible subsets of edges
for i in range(2 ** len(all_edges)):
current_cost = 0
selected_edges = []
for j in range(len(all_edges)):
if (i >> j) & 1:
selected_edges.append(all_edges[j])
current_cost += all_edges[j][2]
# Check if the selected edges connect all points
if is_connected(number_of_points, selected_edges):
# Update minimum_cost if current_cost is smaller
minimum_cost = min(minimum_cost, current_cost)
return minimum_cost
def is_connected(number_of_points, edges):
# Use Depth First Search to check connectivity
if not edges:
return number_of_points <= 1
graph = {i: [] for i in range(number_of_points)}
for start, end, _ in edges:
graph[start].append(end)
graph[end].append(start)
visited = [False] * number_of_points
#Start at node 0 and traverse
dfs(graph, 0, visited)
# Check if all nodes were visited
return all(visited)
def dfs(graph, node, visited):
visited[node] = True
for neighbor in graph[node]:
if not visited[neighbor]:
dfs(graph, neighbor, visited)
The goal is to connect all points with the lowest possible total cost. We use a method that gradually builds the network of connections, always choosing the cheapest connection that doesn't create a closed loop.
Here's how the algorithm would work step-by-step:
def min_cost_connect_points(points):
number_of_points = len(points)
edges = []
for i in range(number_of_points):
for j in range(i + 1, number_of_points):
distance = abs(points[i][0] - points[j][0]) + abs(points[i][1] - points[j][1])
edges.append((distance, i, j))
edges.sort()
parent = list(range(number_of_points))
def find(point_index):
if parent[point_index] != point_index:
parent[point_index] = find(parent[point_index])
return parent[point_index]
def union(point_index_one, point_index_two):
root_one = find(point_index_one)
root_two = find(point_index_two)
parent[root_one] = root_two
minimum_cost = 0
edges_used = 0
for distance, point_index_one, point_index_two in edges:
# Avoid cycles by checking if the points are already in the same group.
if find(point_index_one) != find(point_index_two):
minimum_cost += distance
# Merge the two groups that the points belong to.
union(point_index_one, point_index_two)
edges_used += 1
if edges_used == number_of_points - 1:
break
# Ensures all nodes are connected
return minimum_cost
Case | How to Handle |
---|---|
Empty input array | Return 0 since no points need to be connected, implying zero cost. |
Input array with only one point | Return 0 since a single point is already considered connected to itself. |
Large input array (n > 1000) approaching maximum constraint | Kruskal's or Prim's algorithm with efficient union-find or min-heap implementations, respectively, should be used to ensure O(n^2 log n) or O(n^2) time complexity, preventing timeout. |
Points with identical coordinates | The Manhattan distance will be zero, which should be correctly handled by the minimum spanning tree algorithm, and will be prioritized to include this edge early on. |
Integer overflow when calculating Manhattan distance for extremely large coordinates | Use 64-bit integers (long long in C++, long in Java) to store the Manhattan distance to prevent overflow. |
Negative coordinates in the input points | The Manhattan distance calculation |x1 - x2| + |y1 - y2| handles negative coordinates correctly due to the absolute value. |
All points are located on the same line (collinear) | Minimum spanning tree algorithms like Kruskal's or Prim's will still find the minimum cost to connect all points, forming a single path along the line. |
Duplicate edges in the edge list for Kruskal's due to different points having the same Manhattan distance | The sorting of edges by cost in Kruskal's algorithm will handle these duplicates correctly, and the union-find data structure will prevent cycles. |