Given an array of strings words
and an integer k
, return the k
most frequent strings. Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.
For example:
words = ["i","love","leetcode","i","love","coding"]
, k = 2
should return ["i","love"]
words = ["the","day","is","sunny","the","the","the","sunny","is","is"]
, k = 4
should return ["the","is","sunny","day"]
Could you solve it in O(n log(k))
time and O(n)
extra space?
k
words from the sorted list.Code (Python):
from collections import Counter
def topKFrequent(words, k):
count = Counter(words)
sorted_words = sorted(count.items(), key=lambda x: (-x[1], x[0]))
return [word for word, freq in sorted_words[:k]]
Time Complexity: O(n log n) due to sorting, where n is the number of words.
Space Complexity: O(n) to store the counts in the hash map.
Using a min-heap (priority queue) to store the k
most frequent words.
k
. The heap will store tuples of (frequency, word). We use a min-heap because we want to easily discard elements with lower frequency.k
, add the word to the heap.k
most frequent words. Convert the heap to a list and sort it, then reverse it to get the words in the desired order.Code (Python):
import heapq
from collections import Counter
def topKFrequent(words, k):
count = Counter(words)
heap = []
for word, freq in count.items():
heapq.heappush(heap, (freq, word))
if len(heap) > k:
heapq.heappop(heap)
result = [word for freq, word in heap]
result.sort(key=lambda w: (-count[w], w))
return result[:k]
Code (Python - Improved):
import heapq
from collections import Counter
def topKFrequent(words, k):
count = Counter(words)
heap = []
for word, freq in count.items():
heapq.heappush(heap, (-freq, word)) # Negate frequency for max-heap behavior
result = [heapq.heappop(heap)[1] for _ in range(k)]
return result
Time Complexity: O(n log k), where n is the number of words. Building the counter takes O(n), and pushing/popping from the heap takes O(log k) and we do this at most n times.
Space Complexity: O(n) to store the counts in the hash map and O(k) for the heap.
k
is greater than the number of unique words: Return all unique words.k
words in lexicographical order.