Taro Logo

K Closest Points to Origin

Medium
Google logo
Google
1 view
Topics:
Arrays

Given an array of points where points[i] = [x_i, y_i] represents a point on the X-Y plane, and an integer k, return the k closest points to the origin (0, 0). The distance between two points on the X-Y plane is the Euclidean distance (i.e., √((x_1 - x_2)^2 + (y_1 - y_2)^2)). The answer can be returned in any order. The answer is guaranteed to be unique (except for the order). Let's illustrate this with a couple of examples.

Example 1: points = [[1,3],[-2,2]], k = 1 In this case, the distance between (1, 3) and the origin is √(1^2 + 3^2) = √10, and the distance between (-2, 2) and the origin is √((-2)^2 + 2^2) = √8. Since √8 < √10, the closest point to the origin is (-2, 2). Therefore, the output should be [[-2,2]].

Example 2: points = [[3,3],[5,-1],[-2,4]], k = 2 Here, we need to find the two closest points to the origin. The distances are:

  • (3, 3): √(3^2 + 3^2) = √18
  • (5, -1): √(5^2 + (-1)^2) = √26
  • (-2, 4): √((-2)^2 + 4^2) = √20

The two smallest distances are √18 and √20, corresponding to points (3, 3) and (-2, 4). Thus, the output should be [[3,3],[-2,4]]. Note that [[-2,4],[3,3]] would also be a correct answer.

How would you efficiently implement a function to solve this problem, considering time and space complexity? Consider the constraints:

  • 1 <= k <= points.length <= 10^4
  • -10^4 <= x_i, y_i <= 10^4

Solution


K Closest Points to Origin

Problem Description

Given an array of points points where points[i] = [x_i, y_i] represents a point on the X-Y plane and an integer k, the goal is to return the k closest points to the origin (0, 0). The distance between two points on the X-Y plane is the Euclidean distance. The answer can be returned in any order.

Brute Force Solution

A naive solution involves calculating the distance of each point from the origin, storing these distances along with the points, sorting the distances, and then selecting the k points with the smallest distances.

  1. Calculate the Euclidean distance for each point from the origin (0, 0).
  2. Store the points along with their distances in a list.
  3. Sort the list based on the distances.
  4. Return the first k points from the sorted list.

Code (Python)

import math

def k_closest_brute_force(points, k):
    distances = []
    for point in points:
        dist = math.sqrt(point[0]**2 + point[1]**2)
        distances.append((dist, point))
    
    distances.sort(key=lambda x: x[0])
    
    result = [point for dist, point in distances[:k]]
    return result

Time Complexity: O(N log N)

  • Calculating distances: O(N)
  • Sorting: O(N log N)
  • Selecting k points: O(k) which is O(N) in the worst case

Space Complexity: O(N)

  • Storing distances and points: O(N)

Optimal Solution (Using a Max Heap)

An optimal solution involves using a max heap (priority queue) of size k to store the k closest points seen so far. For each new point, we compare its distance to the distance of the farthest point in the heap (root of the max heap). If the new point is closer, we remove the farthest point and add the new point to the heap.

  1. Initialize a max heap of size k.
  2. Iterate through the points array.
  3. For each point, calculate its distance from the origin.
  4. If the heap size is less than k, add the point to the heap.
  5. Otherwise, compare the distance of the current point with the distance of the farthest point in the heap (root).
  6. If the current point is closer, remove the root and add the current point to the heap.
  7. After iterating through all the points, the heap will contain the k closest points.

Code (Python)

import heapq
import math

def k_closest_optimal(points, k):
    heap = []
    for point in points:
        dist = -(point[0]**2 + point[1]**2)  # Use negative distance for max heap
        if len(heap) == k:
            heapq.heappushpop(heap, (dist, point))
        else:
            heapq.heappush(heap, (dist, point))
    
    result = [point for dist, point in heap]
    return result

Time Complexity: O(N log k)

  • Iterating through all points: O(N)
  • Heap push/pop operations: O(log k)

Space Complexity: O(k)

  • Storing k closest points in the heap: O(k)

Edge Cases

  • If k is equal to the number of points, simply return all points.
  • If there are duplicate points, the algorithm will still work correctly.
  • If any point is the origin (0,0) it will be selected as one of the closest if k allows it.

Summary

The optimal solution using a max heap provides a more efficient way to find the k closest points to the origin compared to the brute force approach. It reduces the time complexity from O(N log N) to O(N log k), which is significantly better when k is much smaller than N.