Given an integer array nums
and an integer k
, return the k
most frequent elements. You may return the answer in any order. The time complexity of your solution should be better than O(n log n).
For example:
nums = [1,1,1,2,2,3], k = 2
should return [1,2]
nums = [1], k = 1
should return [1]
Constraints:
1 <= nums.length <= 10^5
-10^4 <= nums[i] <= 10^4
k
is in the range [1, the number of unique elements in the array]
Explain your approach, including the time and space complexity. Also, discuss any edge cases that might arise and how your solution handles them.
The naive solution to find the k
most frequent elements involves counting the frequency of each element in the input array and then sorting the elements based on their frequencies. After sorting, we can pick the top k
elements.
nums
and use a hash map (dictionary) to store the frequency of each element.k
elements from the sorted list.Example:
nums = [1, 1, 1, 2, 2, 3], k = 2
{1: 3, 2: 2, 3: 1}
[(1, 3), (2, 2), (3, 1)]
[1, 2]
from collections import Counter
def topKFrequent_naive(nums, k):
# Count frequencies
counts = Counter(nums)
# Sort by frequency
sorted_counts = sorted(counts.items(), key=lambda item: item[1], reverse=True)
# Select top k
result = [item[0] for item in sorted_counts[:k]]
return result
Overall: O(n log n)
O(n) to store the frequencies in the hash map.
To achieve a time complexity better than O(n log n), we can use a min-heap (priority queue) to keep track of the k
most frequent elements. This approach avoids sorting all elements.
k
. If the heap size exceeds k
, remove the element with the smallest frequency from the heap.k
most frequent elements. Extract these elements into a result list.import heapq
from collections import Counter
def topKFrequent(nums, k):
# Count frequencies
counts = Counter(nums)
# Build min-heap
heap = []
for num, freq in counts.items():
heapq.heappush(heap, (freq, num))
if len(heap) > k:
heapq.heappop(heap)
# Extract result
result = [num for freq, num in heap]
return result
Overall: O(n log k)
Since k <= n, O(n log k) is better than O(n log n) when k is significantly smaller than n. In the worst case when k == n, it becomes O(n log n).
O(n) to store the frequencies in the hash map and O(k) to store the heap.
nums
is empty, return an empty list.k
is 0, return an empty list.k
is equal to the number of unique elements, return all unique elements.