You are part of a university admissions office and need to keep track of the kth
highest test score from applicants in real-time. This helps to determine cut-off marks for interviews and admissions dynamically as new applicants submit their scores.
You are tasked to implement a class which, for a given integer k
, maintains a stream of test scores and continuously returns the k
th highest test score after a new score has been submitted. More specifically, we are looking for the k
th highest score in the sorted list of all scores.
Implement the KthLargest
class:
KthLargest(int k, int[] nums)
Initializes the object with the integer k
and the stream of test scores nums
.int add(int val)
Adds a new test score val
to the stream and returns the element representing the kth
largest element in the pool of test scores so far.Example 1:
Input:
["KthLargest", "add", "add", "add", "add", "add"]
[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]
Output: [null, 4, 5, 5, 8, 8]
Explanation:
KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
kthLargest.add(3); // return 4
kthLargest.add(5); // return 5
kthLargest.add(10); // return 5
kthLargest.add(9); // return 8
kthLargest.add(4); // return 8
Example 2:
Input:
["KthLargest", "add", "add", "add", "add"]
[[4, [7, 7, 7, 7, 8, 3]], [2], [10], [9], [9]]
Output: [null, 7, 7, 7, 8]
Explanation:
KthLargest kthLargest = new KthLargest(4, [7, 7, 7, 7, 8, 3]);Constraints:
0 <= nums.length <= 104
1 <= k <= nums.length + 1
-104 <= nums[i] <= 104
-104 <= val <= 104
104
calls will be made to add
.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
To find the kth largest element in a stream using a brute force approach, we essentially keep track of all the elements we've seen so far. Whenever a new element comes in, we consider all the elements seen so far and find the kth largest among them.
Here's how the algorithm would work step-by-step:
class KthLargest:
def __init__(self, k_value, initial_numbers):
self.k_value = k_value
self.all_numbers = initial_numbers
def add(self, new_number):
# Add the new number to the list of all numbers seen.
self.all_numbers.append(new_number)
# Sort the list to easily find the kth largest element
self.all_numbers.sort(reverse=True)
# Return the kth largest element.
return self.all_numbers[self.k_value - 1]
The challenge is to efficiently track the kth largest number as new numbers arrive. Instead of sorting all numbers every time, we maintain only the k largest values seen so far in a special ordered structure.
Here's how the algorithm would work step-by-step:
import heapq
class KthLargest:
def __init__(self, k_value: int, initial_numbers: list[int]):
self.k_value = k_value
self.min_heap = initial_numbers
heapq.heapify(self.min_heap)
# Ensure heap contains only the k largest
while len(self.min_heap) > k_value:
heapq.heappop(self.min_heap)
def add(self, new_number: int) -> int:
# Only add if larger than smallest element
if len(self.min_heap) < self.k_value or new_number > self.min_heap[0]:
heapq.heappush(self.min_heap, new_number)
# Maintain size of k
if len(self.min_heap) > self.k_value:
heapq.heappop(self.min_heap)
# Heap root will be the kth largest
return self.min_heap[0]
return self.min_heap[0]
Case | How to Handle |
---|---|
k is zero or negative | Throw an IllegalArgumentException or return null since it is impossible to find a kth largest element when k is non-positive. |
nums is null or empty and add is called multiple times before reaching k elements | Maintain a priority queue and return null or the smallest element in the queue until the queue size equals k. |
k is greater than the initial size of nums | Initialize the priority queue with the available elements and proceed with the add operations, returning null or the smallest element in the queue if the number of added elements + initial elements is less than k. |
Input stream contains large numbers, potentially leading to integer overflow if not handled carefully. | Use long data type for storing numbers if needed and be mindful of potential overflows during calculations. |
All elements in nums and the stream are identical | The min-heap priority queue will correctly maintain the k identical elements and return the kth largest one. |
k is a very large number approaching the maximum possible stream size. | The solution should scale efficiently using a min-heap, avoiding performance degradation as k increases, and potentially hitting memory constraints. |
nums contains duplicate numbers with extreme boundary values (min and max integer values). | The min-heap priority queue approach can handle duplicates, including extreme values, correctly since it relies on comparisons rather than special handling. |
Memory usage becomes excessive with a very large stream and large k. | Consider using an external sorting algorithm or techniques for handling very large data streams if memory becomes a bottleneck. |