Taro Logo

Top K Frequent Elements

Medium
Apple logo
Apple
Topics:
ArraysStacks

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

For example:

  • nums = [1,1,1,2,2,3], k = 2 should return [1, 2] (or [2, 1]).
  • nums = [1], k = 1 should return [1].

Here are some constraints to consider:

  • 1 <= nums.length <= 10^5
  • -10^4 <= nums[i] <= 10^4
  • k is in the range [1, the number of unique elements in the array].
  • It is guaranteed that the answer is unique.

Can you write a function to solve this problem efficiently? What is the time and space complexity of your solution? Consider edge cases such as an empty input array or when k is equal to the number of unique elements. Finally, consider the follow-up question: Can you design an algorithm with a time complexity better than O(n log n), where n is the array's size?

Solution


Brute Force Solution

A brute force approach involves counting the frequency of each element in the array and then sorting the elements based on their frequencies. Finally, we pick the top k elements.

  1. Count Frequencies: Iterate through the input array nums and use a hash map (dictionary) to store the frequency of each element.
  2. Sort by Frequency: Create a list of tuples (element, frequency) from the hash map and sort this list in descending order based on frequency.
  3. Select Top k: Pick the first k elements from the sorted list.

Big O Analysis:

  • Time Complexity: O(n log n), where n is the number of elements in nums. This is primarily due to the sorting step.
  • Space Complexity: O(n), as we use a hash map to store the frequencies of all elements.
from collections import Counter

def top_k_frequent_brute_force(nums, k):
    # Count frequencies using Counter
    counts = Counter(nums)
    
    # Sort by frequency
    sorted_counts = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    
    # Get the top k elements
    top_k = [item[0] for item in sorted_counts[:k]]
    
    return top_k

Optimal Solution using a Heap

A more efficient solution uses a min-heap to keep track of the k most frequent elements. This avoids sorting all elements and provides a better time complexity.

  1. Count Frequencies: Same as the brute force approach, use a hash map to store the frequency of each element.
  2. Heapify: Create a min-heap of size k. Iterate through the hash map. If the current element's frequency is greater than the smallest frequency in the heap, remove the smallest element and add the current element.
  3. Extract Top k: Extract elements from the heap to get the top k frequent elements.

Big O Analysis:

  • Time Complexity: O(n log k), where n is the number of elements in nums. Building the frequency map takes O(n) time, and each heap operation takes O(log k) time. In the worst case, we might iterate through all n elements.
  • Space Complexity: O(n), for the hash map and the heap (which stores at most k elements, so it can also be seen as O(k)).
import heapq
from collections import Counter

def top_k_frequent_heap(nums, k):
    # Count frequencies using Counter
    counts = Counter(nums)
    
    # Create a min-heap
    heap = []
    for num, freq in counts.items():
        heapq.heappush(heap, (freq, num))
        if len(heap) > k:
            heapq.heappop(heap)
    
    # Extract the top k elements from the heap
    top_k = [num for freq, num in heap]
    top_k.reverse()
    return top_k

Edge Cases:

  1. Empty Array: If the input array is empty, return an empty list.
  2. k = 0: If k is 0, return an empty list.
  3. k = number of unique elements: Return all unique elements.
  4. Single element repeated: Example nums = [1,1,1,1], k = 1. The result should be [1]
def top_k_frequent(nums, k):
    if not nums or k <= 0:
        return []
        
    return top_k_frequent_heap(nums, k)