Taro Logo

K Closest Points to Origin

Medium
Meta logo
Meta
0 views
Topics:
ArraysGreedy Algorithms

Given an array of points where points[i] = [x<sub>i</sub>, y<sub>i</sub>] represents a point on the X-Y plane and an integer k, return the k closest points to the origin (0, 0). The distance between two points on the X-Y plane is the Euclidean distance (i.e., √(x<sub>1</sub> - x<sub>2</sub>)<sup>2</sup> + (y<sub>1</sub> - y<sub>2</sub>)<sup>2</sup>).

  1. Describe a brute force solution to this problem. What is the Big O time and space complexity?
  2. Describe an optimal solution using a heap. What is the Big O time and space complexity?
  3. How would you handle edge cases such as:
    • Empty input?
    • k = 0?
    • k > number of points?

For example:

Input: points = [[1,3],[-2,2]], k = 1 Output: [[-2,2]]

Input: points = [[3,3],[5,-1],[-2,4]], k = 2 Output: [[3,3],[-2,4]]

Solution


K Closest Points to Origin

Problem Description

Given an array of points where points[i] = [x<sub>i</sub>, y<sub>i</sub>] represents a point on the X-Y plane and an integer k, return the k closest points to the origin (0, 0). The distance between two points on the X-Y plane is the Euclidean distance.

You may return the answer in any order. The answer is guaranteed to be unique (except for the order that it is in).

Naive Solution

A naive solution would be to calculate the distance of each point from the origin, store the distances along with the points, sort the distances, and then return the k points with the smallest distances.

Code (Python)

import math

def k_closest_naive(points, k):
    distances = []
    for x, y in points:
        dist = math.sqrt(x**2 + y**2)
        distances.append((dist, [x, y]))
    
    distances.sort()
    
    result = []
    for i in range(k):
        result.append(distances[i][1])
    
    return result

Time Complexity

O(N log N), where N is the number of points. This is dominated by the sorting step.

Space Complexity

O(N) to store the distances and points.

Optimal Solution

A more efficient solution would be to use a max-heap (priority queue). We maintain a max-heap of size k that stores the k closest points seen so far. For each new point, we calculate its distance from the origin. If the distance is smaller than the distance of the farthest point in the heap (i.e., the root of the max-heap), we remove the farthest point and add the new point to the heap. This way, we always keep the k closest points in the heap.

Code (Python)

import heapq
import math

def k_closest_optimal(points, k):
    heap = []
    for x, y in points:
        dist = -(x**2 + y**2)  # Negate for max-heap
        if len(heap) == k:
            heapq.heappushpop(heap, (dist, [x, y]))
        else:
            heapq.heappush(heap, (dist, [x, y]))
            
    result = []
    for dist, point in heap:
        result.append(point)
        
    return result

Time Complexity

O(N log K), where N is the number of points and K is the number of closest points to return. For each of the N points, we potentially perform a heap push/pop operation, which takes O(log K) time.

Space Complexity

O(K) to store the heap.

Edge Cases

  • Empty input: If the input array points is empty, return an empty array.
  • k = 0: If k is 0, return an empty array.
  • k > number of points: If k is greater than the number of points, return all the points.

These cases are already implicitly handled by the code provided above due to problem constraints.