Given an integer array nums
and an integer k
, return the k
most frequent elements. You may return the answer in any order. For example:
Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]
Input: nums = [1], k = 1
Output: [1]
Constraints:
1 <= nums.length <= 10^5
-10^4 <= nums[i] <= 10^4
k
is in the range [1, the number of unique elements in the array]
. It is guaranteed that the answer is unique.Can you design an algorithm that solves this problem with a time complexity better than O(n log n), where n is the array's size? Describe the algorithm, explain its time and space complexity, and handle potential edge cases. Provide example code with comments.
Given an integer array nums
and an integer k
, return the k
most frequent elements. You may return the answer in any order.
k
elements from the sorted list.Code (Python)
def topKFrequent_naive(nums, k):
counts = {}
for n in nums:
counts[n] = counts.get(n, 0) + 1
sorted_counts = sorted(counts.items(), key=lambda item: item[1], reverse=True)
result = [item[0] for item in sorted_counts[:k]]
return result
Time Complexity: O(n log n) due to sorting the frequencies.
Space Complexity: O(n) to store the frequencies in the hash map.
k
. Iterate through the hash map entries. If the current element's frequency is greater than the smallest frequency in the heap, remove the smallest element and add the current element to the heap.k
most frequent elements. Extract these elements from the heap.Code (Python)
import heapq
def topKFrequent(nums, k):
counts = {}
for n in nums:
counts[n] = counts.get(n, 0) + 1
heap = []
for num, freq in counts.items():
if len(heap) == k:
if freq > heap[0][0]:
heapq.heapreplace(heap, (freq, num))
else:
heapq.heappush(heap, (freq, num))
result = [num for freq, num in heap]
return result
Time Complexity: O(n log k), where n is the number of elements in nums
.
Space Complexity: O(n) for the hash map to store the frequencies and O(k) for the heap.
k
is 0, return an empty list.k
is equal to the number of unique elements, return all unique elements.The heap-based approach is more efficient than the sorting-based approach when k
is significantly smaller than n
. The heap allows us to maintain the k
most frequent elements seen so far without needing to sort all the frequencies. The heapq
module in Python provides an efficient implementation of a min-heap.