Given an integer array nums
and an integer k
, return the k
th largest element in the array. Note that it is the k
th largest element in the sorted order, not the k
th distinct element.
For example:
nums = [3,2,1,5,6,4], k = 2
should return 5
nums = [3,2,3,1,2,4,5,5,6], k = 4
should return 4
Can you solve it without sorting?
Constraints:
1 <= k <= nums.length <= 10^5
-10^4 <= nums[i] <= 10^4
Explain the time and space complexity of your solution. Consider edge cases like k=1, k=nums.length, and duplicate elements.
The most straightforward approach is to sort the array in descending order and then return the element at index k-1
. This gives us the kth largest element.
def findKthLargest_sorting(nums, k):
nums.sort(reverse=True)
return nums[k-1]
Time Complexity: O(n log n) due to the sorting operation.
Space Complexity: O(1) or O(n) depending on the sorting algorithm implementation. In-place sorting algorithms like heapsort have O(1) space complexity, while others might use O(n).
We can use a min-heap data structure to keep track of the k
largest elements encountered so far. As we iterate through the array, we add each element to the heap. If the size of the heap exceeds k
, we remove the smallest element from the heap (the root). After processing all the elements, the root of the min-heap will be the k
th largest element.
import heapq
def findKthLargest_heap(nums, k):
min_heap = []
for num in nums:
heapq.heappush(min_heap, num)
if len(min_heap) > k:
heapq.heappop(min_heap)
return min_heap[0]
Time Complexity: O(n log k), where n
is the number of elements in the array. We iterate through each element and potentially add it to the heap (log k operation).
Space Complexity: O(k) to store the k
largest elements in the min-heap.
The heap-based solution is generally more efficient than the sorting solution when k
is much smaller than n
. This is because it avoids sorting the entire array. The heap maintains only the k
largest elements seen so far, leading to better performance. The sorting approach, while simple, always incurs an O(n log n) cost regardless of the value of k
.