Given an array of points
where points[i] = [xi, yi]
represents a point on the X-Y plane and an integer k
, return the k
closest points to the origin (0, 0)
.
The distance between two points on the X-Y plane is the Euclidean distance (i.e., √(x1 - x2)2 + (y1 - y2)2
).
You may return the answer in any order. The answer is guaranteed to be unique (except for the order that it is in).
Example 1:
Input: points = [[1,3],[-2,2]], k = 1 Output: [[-2,2]] Explanation: The distance between (1, 3) and the origin is sqrt(10). The distance between (-2, 2) and the origin is sqrt(8). Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin. We only want the closest k = 1 points from the origin, so the answer is just [[-2,2]].
Example 2:
Input: points = [[3,3],[5,-1],[-2,4]], k = 2 Output: [[3,3],[-2,4]] Explanation: The answer [[-2,4],[3,3]] would also be accepted.
Constraints:
1 <= k <= points.length <= 104
-104 <= xi, yi <= 104
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force approach to finding the K closest points is like checking the distance of every single point to the origin. Then, we pick the K points with the smallest distances. It's straightforward but not the most efficient way.
Here's how the algorithm would work step-by-step:
def k_closest_points_brute_force(points, k):
distances_with_points = []
# Calculate distance for each point and store it
for point in points:
distance_to_origin = (point[0]**2) + (point[1]**2)
distances_with_points.append((distance_to_origin, point))
# Sort points by distance, essential for picking closest
distances_with_points.sort()
closest_points = []
# Select the k closest points after sorting
for i in range(k):
closest_points.append(distances_with_points[i][1])
return closest_points
The goal is to find the closest points without sorting all of them. We achieve this by focusing on efficiently finding the 'k-th' closest point and then gathering all points closer than or equal to it. This is much faster than sorting everything.
Here's how the algorithm would work step-by-step:
import random
def k_closest_points_to_origin(points, k):
def calculate_distance_to_origin(point):
return point[0]**2 + point[1]**2
def partition(points, left_index, right_index, pivot_index):
pivot_distance = calculate_distance_to_origin(points[pivot_index])
points[pivot_index], points[right_index] = points[right_index], points[pivot_index]
store_index = left_index
for i in range(left_index, right_index):
if calculate_distance_to_origin(points[i]) <= pivot_distance:
points[store_index], points[i] = points[i], points[store_index]
store_index += 1
points[right_index], points[store_index] = points[store_index], points[right_index]
return store_index
def quickselect(points, left_index, right_index, k_smallest_index):
# Only one element, return it.
if left_index == right_index:
return points[left_index]
while True:
pivot_index = random.randint(left_index, right_index)
pivot_index = partition(points, left_index, right_index, pivot_index)
# Found the k-th smallest
if k_smallest_index == pivot_index:
return points[k_smallest_index]
elif k_smallest_index < pivot_index:
right_index = pivot_index - 1
else:
left_index = pivot_index + 1
kth_closest_point = quickselect(points, 0, len(points) - 1, k - 1)
kth_closest_distance = calculate_distance_to_origin(kth_closest_point)
# Gather all points within the k-th distance.
result = []
for point in points:
if calculate_distance_to_origin(point) <= kth_closest_distance:
result.append(point)
# Return the result
return result
Case | How to Handle |
---|---|
Empty input list of points | Return an empty list if the input list is null or empty. |
k is zero | If k is zero, return an empty list as no points are requested. |
k is larger than the number of points | Return all points if k exceeds the number of points. |
Points with identical distances to the origin | The algorithm must handle ties appropriately; it should return any k points with the smallest distances. |
Points with very large x and y coordinates (potential integer overflow) | Use long or double for distance calculation to prevent integer overflow if coordinates are very large. |
Points with negative x and y coordinates | The distance calculation should correctly handle negative coordinates (squaring eliminates the negative sign). |
k equals the total number of points | Return the entire input list without any filtering. |
Very large input list (performance considerations) | A heap-based solution (priority queue) is preferred for optimal performance in selecting the k closest points, avoiding sorting the entire list. |