Given an integer array nums
and an integer k
, return the k<sup>th</sup>
largest element in the array. Note that it is the k<sup>th</sup>
largest element in the sorted order, not the k<sup>th</sup>
distinct element. Can you solve it without sorting?
For example:
nums = [3,2,1,5,6,4], k = 2
Output: 5
nums = [3,2,3,1,2,4,5,5,6], k = 4
Output: 4
Write an efficient algorithm to solve this problem. Consider various approaches and their time and space complexities. What are the edge cases to consider?
Here's how to find the kth largest element in an array. We'll start with a naive approach and then move to a more efficient solution.
The simplest approach is to sort the array in descending order and then return the element at index k-1
. This is because the kth largest element will be at that position after sorting.
Example:
nums = [3, 2, 1, 5, 6, 4], k = 2
nums
in descending order: [6, 5, 4, 3, 2, 1]
k-1 = 2-1 = 1
, which is 5
.Code (Python):
def find_kth_largest_naive(nums, k):
nums.sort(reverse=True)
return nums[k-1]
Time Complexity: O(n log n) due to sorting.
Space Complexity: O(1) or O(n), depending on the sorting algorithm implementation. In-place sorts like heapsort use O(1), while others like merge sort use O(n).
We can use a min-heap data structure to efficiently find the kth largest element. The idea is to maintain a min-heap of size k
. Iterate through the array:
k
elements, add them to the min-heap.After processing all elements, the root of the min-heap will be the kth largest element.
Example:
nums = [3, 2, 1, 5, 6, 4], k = 2
k=2
elements: [3, 2]
. The min-heap will be [2, 3]
.5
: 5 > 2
, so remove 2
and add 5
. Heap: [3, 5]
.6
: 6 > 3
, so remove 3
and add 6
. Heap: [5, 6]
.4
: 4 < 5
, so do nothing.5
, which is the 2nd largest element.Code (Python):
import heapq
def find_kth_largest(nums, k):
min_heap = []
for num in nums:
if len(min_heap) < k:
heapq.heappush(min_heap, num)
elif num > min_heap[0]:
heapq.heapreplace(min_heap, num)
return min_heap[0]
Time Complexity: O(n log k). Building the initial heap takes O(k log k). Iterating through the rest of the array takes O((n-k) log k) in the worst case, for a total of O(k log k + (n-k) log k) = O(n log k).
Space Complexity: O(k) to store the min-heap.
None
, depending on the problem's requirements.k
is less than 1 or greater than the length of the array, we should throw an exception or return None
.For the average case, the Quickselect algorithm provides an O(n) time solution. It's based on the partitioning logic of Quicksort, but instead of recursively sorting both sides of the partition, it only recurses into the side that contains the kth largest element. This avoids unnecessary sorting, leading to better performance on average. However, the worst-case time complexity is O(n^2), which occurs when the pivot is consistently the smallest or largest element.
import random
def find_kth_largest_quickselect(nums, k):
def partition(l, r):
pivot_index = random.randint(l, r)
pivot = nums[pivot_index]
nums[r], nums[pivot_index] = nums[pivot_index], nums[r]
i = l
for j in range(l, r):
if nums[j] < pivot:
nums[i], nums[j] = nums[j], nums[i]
i += 1
nums[i], nums[r] = nums[r], nums[i]
return i
def quickselect(l, r, k_smallest):
if l == r:
return nums[l]
partition_index = partition(l, r)
if k_smallest == partition_index:
return nums[k_smallest]
elif k_smallest < partition_index:
return quickselect(l, partition_index - 1, k_smallest)
else:
return quickselect(partition_index + 1, r, k_smallest)
return quickselect(0, len(nums) - 1, len(nums) - k)
Time Complexity: Average: O(n), Worst: O(n^2) Space Complexity: O(1) - in-place