Taro Logo

Top K Frequent Elements

Medium
Meta logo
Meta
4 views
Topics:
ArraysStacks

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order. The time complexity of your solution should be better than O(n log n).

For example:

  • nums = [1,1,1,2,2,3], k = 2 should return [1,2]
  • nums = [1], k = 1 should return [1]

Constraints:

  • 1 <= nums.length <= 10^5
  • -10^4 <= nums[i] <= 10^4
  • k is in the range [1, the number of unique elements in the array]
  • It is guaranteed that the answer is unique.

Explain your approach, including the time and space complexity. Also, discuss any edge cases that might arise and how your solution handles them.

Solution


Naive Solution

The naive solution to find the k most frequent elements involves counting the frequency of each element in the input array and then sorting the elements based on their frequencies. After sorting, we can pick the top k elements.

  1. Count Frequencies: Iterate through the input array nums and use a hash map (dictionary) to store the frequency of each element.
  2. Sort by Frequency: Sort the hash map's elements based on their frequencies in descending order.
  3. Select Top K: Pick the first k elements from the sorted list.

Example:

nums = [1, 1, 1, 2, 2, 3], k = 2

  1. Frequencies: {1: 3, 2: 2, 3: 1}
  2. Sorted: [(1, 3), (2, 2), (3, 1)]
  3. Top 2: [1, 2]

Code (Python)

from collections import Counter

def topKFrequent_naive(nums, k):
    # Count frequencies
    counts = Counter(nums)
    
    # Sort by frequency
    sorted_counts = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    
    # Select top k
    result = [item[0] for item in sorted_counts[:k]]
    
    return result

Time Complexity

  • Counting frequencies: O(n)
  • Sorting: O(n log n)
  • Selecting top k: O(k)

Overall: O(n log n)

Space Complexity

O(n) to store the frequencies in the hash map.

Optimal Solution (Using Heap)

To achieve a time complexity better than O(n log n), we can use a min-heap (priority queue) to keep track of the k most frequent elements. This approach avoids sorting all elements.

  1. Count Frequencies: Same as the naive approach, use a hash map to count the frequency of each element.
  2. Build Min-Heap: Iterate through the hash map's elements and add them to a min-heap of size k. If the heap size exceeds k, remove the element with the smallest frequency from the heap.
  3. Extract Result: The heap will contain the k most frequent elements. Extract these elements into a result list.

Code (Python)

import heapq
from collections import Counter

def topKFrequent(nums, k):
    # Count frequencies
    counts = Counter(nums)
    
    # Build min-heap
    heap = []
    for num, freq in counts.items():
        heapq.heappush(heap, (freq, num))
        if len(heap) > k:
            heapq.heappop(heap)
    
    # Extract result
    result = [num for freq, num in heap]
    return result

Time Complexity

  • Counting frequencies: O(n)
  • Building min-heap: O(n log k)
  • Extracting result: O(k)

Overall: O(n log k)

Since k <= n, O(n log k) is better than O(n log n) when k is significantly smaller than n. In the worst case when k == n, it becomes O(n log n).

Space Complexity

O(n) to store the frequencies in the hash map and O(k) to store the heap.

Edge Cases

  • Empty Array: If the input array nums is empty, return an empty list.
  • k = 0: If k is 0, return an empty list.
  • k = Number of Unique Elements: If k is equal to the number of unique elements, return all unique elements.
  • Single Element Array: If the input array contains only one element, return that element in a list.