Taro Logo

K Closest Points to Origin

Medium
Apple logo
Apple
Topics:
Arrays

Given an array of points where points[i] = [x_i, y_i] represents a point on the X-Y plane and an integer k, return the k closest points to the origin (0, 0). The distance between two points on the X-Y plane is the Euclidean distance (i.e., sqrt((x1 - x2)^2 + (y1 - y2)^2). You may return the answer in any order. The answer is guaranteed to be unique (except for the order that it is in).

For example:

points = [[1,3],[-2,2]], k = 1

Output: [[-2,2]]

Explanation: The distance between (1, 3) and the origin is sqrt(10). The distance between (-2, 2) and the origin is sqrt(8). Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin. We only want the closest k = 1 points from the origin, so the answer is just [[-2,2]].

As another example:

points = [[3,3],[5,-1],[-2,4]], k = 2

Output: [[3,3],[-2,4]]

Can you provide an algorithm with optimal time complexity and space complexity? Please consider edge cases, like empty input, k = 0, and k = len(points). Also, can you briefly explain the time and space complexity of your solution?

Solution


K Closest Points to Origin

Problem

Given an array of points points where points[i] = [x_i, y_i] represents a point on the X-Y plane and an integer k, return the k closest points to the origin (0, 0). The distance between two points on the X-Y plane is the Euclidean distance (i.e., sqrt((x1 - x2)^2 + (y1 - y2)^2). You may return the answer in any order.

Naive Solution

The naive solution involves calculating the distance of each point from the origin, storing these distances along with the corresponding points, sorting the distances, and then picking the first k points.

  1. Calculate the Euclidean distance for each point from the origin (0, 0).
  2. Store the distances along with their corresponding points in a list or dictionary.
  3. Sort the list of distances.
  4. Pick the first k points associated with the smallest distances.

Code (Python):

import math

def k_closest_naive(points, k):
    distances = []
    for x, y in points:
        dist = math.sqrt(x**2 + y**2)
        distances.append((dist, [x, y]))
    
    distances.sort(key=lambda item: item[0])
    
    result = []
    for i in range(k):
        result.append(distances[i][1])
    
    return result

Time Complexity: O(N log N), due to the sorting step.

Space Complexity: O(N), to store the distances and points.

Optimal Solution

An optimal solution can be achieved using a max-heap (priority queue). This allows us to maintain the k closest points seen so far. The idea is to iterate through the points, and for each point, calculate its distance from the origin. If the heap has fewer than k elements, we add the point to the heap. If the heap is full, we compare the distance of the current point with the distance of the farthest point in the heap (the root of the max-heap). If the current point is closer, we remove the farthest point and add the current point.

  1. Initialize a max-heap of size k.
  2. Iterate through the points.
    • Calculate the distance of the point from the origin.
    • If the heap size is less than k, add the point (and its negative distance as the key for max-heap).
    • Otherwise, if the distance is less than the largest distance in the heap, remove the largest distance and add the current point.
  3. The heap now contains the k closest points.

Code (Python):

import heapq
import math

def k_closest_optimal(points, k):
    heap = []
    for x, y in points:
        dist = -(x**2 + y**2)  # Negative for max-heap
        if len(heap) == k:
            heapq.heappushpop(heap, (dist, [x, y]))
        else:
            heapq.heappush(heap, (dist, [x, y]))
    
    result = [point for (dist, point) in heap]
    return result

Time Complexity: O(N log K), where N is the number of points. For each point, we might perform a heap operation which takes O(log K) time.

Space Complexity: O(K), to store the k closest points in the heap.

Edge Cases

  • Empty input: If the input points array is empty, return an empty array.
  • k = 0: If k is 0, return an empty array.
  • k = len(points): If k is equal to the number of points, return the entire input array (though sorting by distance is not necessary).

Summary

The optimal solution using a max-heap provides a better time complexity of O(N log K) compared to the naive approach of O(N log N). The space complexity is O(K) for the optimal solution, while it is O(N) for the naive solution. The max-heap approach efficiently maintains the k closest points seen so far, making it a suitable choice for this problem.