Kth Largest Element in an Array

Medium
Verily
Topics: Arrays, Two Pointers

Given an integer array nums and an integer k, return the kth largest element in the array.

Note that it is the kth largest element in the sorted order, not the kth distinct element.

Can you solve it without sorting?

Example 1:

Input: nums = [3,2,1,5,6,4], k = 2
Output: 5

Example 2:

Input: nums = [3,2,3,1,2,4,5,5,6], k = 4
Output: 4
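
As a reference point before avoiding the sort, here is a minimal sorting-based sketch that reproduces both examples and shows why duplicates matter. The function names `kth_largest_by_sorting` and `kth_distinct_largest` are illustrative and not part of the problem statement, and the sketch assumes 1 <= k <= len(nums).

```python
def kth_largest_by_sorting(nums, k):
    # Sort in descending order; duplicates keep their own rank.
    return sorted(nums, reverse=True)[k - 1]


def kth_distinct_largest(nums, k):
    # For contrast only: ranks distinct values, which is NOT what the problem asks.
    return sorted(set(nums), reverse=True)[k - 1]


example = [3, 2, 3, 1, 2, 4, 5, 5, 6]
print(kth_largest_by_sorting([3, 2, 1, 5, 6, 4], 2))  # 5
print(kth_largest_by_sorting(example, 4))             # 4 (the two 5s take ranks 2 and 3)
print(kth_distinct_largest(example, 4))               # 3 (a different question)
```

Sorting costs O(n log n), which is the baseline the approaches below try to beat.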

Constraints:

  • 1 <= k <= nums.length <= 10^5
  • -10^4 <= nums[i] <= 10^4

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the size of the input array `nums`?
  2. What is the range of possible values within the `nums` array? Can I expect negative numbers, zeros, or very large numbers?
  3. Is `k` guaranteed to be a valid value, meaning it's always greater than 0 and less than or equal to the length of `nums`?
  4. Does the order of elements in `nums` matter for the output, or do I only care about the value of the kth largest element?
  5. Are there any specific memory constraints I should be aware of?

Brute Force Solution

Approach

The brute force approach finds the largest element, then the second largest, and so on until it reaches the kth largest. In effect, it ranks the top k elements from scratch, one at a time. This always finds the right element, but it is slow when k is large.

Here's how the algorithm would work step-by-step:

  1. First, find the very largest element in the entire collection.
  2. Then, find the largest element from what is left after you take out the very largest element you already found. This gives you the second largest.
  3. Keep repeating the previous step, each time finding the largest remaining element. This gives you the third largest, fourth largest, and so on.
  4. Do this until you've found the largest K elements. The very last element you found in this process is the Kth largest element in the collection.

Code Implementation

def find_kth_largest_brute_force(numbers, k_value):
    working_numbers = numbers[:]  # Make a copy to avoid modifying the original input

    for _ in range(k_value):
        # Find the largest element in the current working list
        largest_index = 0
        for current_index in range(1, len(working_numbers)):
            if working_numbers[current_index] > working_numbers[largest_index]:
                largest_index = current_index

        # Store the largest number to return at the end
        largest_number = working_numbers[largest_index]

        # Remove the largest element from the list
        del working_numbers[largest_index]

    # After k rounds, the last value removed is the kth largest
    return largest_number

Big(O) Analysis

Time Complexity
O(n*k)
The outer loop runs k times, and each pass scans the remaining elements for the maximum, which takes O(n) time, where n is the length of the original array. Deleting the maximum from a Python list is also O(n), so each pass stays O(n) and the total is O(n*k). When k grows in proportion to n (for example, k = n/2 when finding the median), this becomes O(n^2); when k is a small constant, it is effectively O(n).
Space Complexity
O(n)
The implementation above copies the input so that the caller's list is left unmodified, and that copy takes O(n) extra space. Apart from the copy, it tracks only a few variables, such as the index and value of the current largest element. If modifying the input is acceptable, the copy can be dropped: swapping each maximum to the end of the array, or overwriting it with a placeholder such as negative infinity, brings the auxiliary space down to O(1).
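
One way to keep the extra space constant is to skip the copy and rank the elements in place. The sketch below swaps each maximum to the end of a shrinking "unranked" region. `find_kth_largest_in_place` is a hypothetical name, and the sketch assumes 1 <= k_value <= len(numbers) and that the caller accepts the list being reordered.

```python
def find_kth_largest_in_place(numbers, k_value):
    # numbers[:unranked_end] holds the elements that have not been ranked yet.
    unranked_end = len(numbers)

    for _ in range(k_value):
        # Linear scan for the maximum of the unranked region.
        largest_index = 0
        for current_index in range(1, unranked_end):
            if numbers[current_index] > numbers[largest_index]:
                largest_index = current_index

        # Move that maximum to the back of the region and shrink the region.
        unranked_end -= 1
        numbers[largest_index], numbers[unranked_end] = (
            numbers[unranked_end],
            numbers[largest_index],
        )

    # The most recently placed element is the kth largest.
    return numbers[unranked_end]


print(find_kth_largest_in_place([3, 2, 1, 5, 6, 4], 2))  # 5
```

The running time is still O(n*k); only the O(n) copy is avoided.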

Optimal Solution

Approach

To find the kth largest element quickly, we use quickselect, a technique that borrows the partitioning step from quicksort but does not sort the entire list. It partitions the list around a chosen value and then continues only in the part that can contain the answer, which narrows the search area at every step.

Here's how the algorithm would work step-by-step:

  1. Pick an element from the list. This will be our 'pivot'.
  2. Rearrange the list so that elements smaller than or equal to the pivot are on the left, and elements larger than the pivot are on the right. The pivot ends up in its final sorted position.
  3. Determine the position of the pivot. In ascending order, the kth largest element belongs at index n - k, where n is the length of the list. If the pivot lands there, we've found our answer.
  4. If the pivot's position is before the target index, search only the right part, which holds the larger elements. If it is after the target index, search only the left part, which holds the smaller elements.
  5. Repeat the process of picking a pivot and partitioning, focusing on the relevant portion of the list until you find the kth largest element.

Code Implementation

def find_kth_largest(numbers, k):
    kth_index = len(numbers) - k

    def partition(start_index, end_index):
        pivot = numbers[end_index]
        partition_index = start_index

        for i in range(start_index, end_index):
            if numbers[i] <= pivot:
                numbers[i], numbers[partition_index] = numbers[partition_index], numbers[i]
                partition_index += 1

        numbers[partition_index], numbers[end_index] = numbers[end_index], numbers[partition_index]
        return partition_index

    start_index = 0
    end_index = len(numbers) - 1

    while True:
        # Partition the array around a chosen pivot
        pivot_index = partition(start_index, end_index)

        if pivot_index == kth_index:
            # We have found the kth largest element
            return numbers[pivot_index]
        elif pivot_index < kth_index:
            # Search in the right partition
            start_index = pivot_index + 1
        else:
            # Search in the left partition
            end_index = pivot_index - 1

Big(O) Analysis

Time Complexity
O(n) average, O(n^2) worst case
The best and average cases occur when the pivot divides the current range into roughly equal halves at each partitioning step. Each pass of the loop then discards about half of the remaining elements, so the total work is n + n/2 + n/4 + ..., a geometric series bounded by 2n, which is O(n). The worst case arises when the pivot is repeatedly the smallest or largest element of the range, so the range shrinks by only one element per pass and the total work is O(n^2). Because the implementation above always uses the last element as the pivot, already sorted input can trigger this worst case. Choosing the pivot at random makes the expected running time O(n) on inputs with distinct values, regardless of their order.
Space Complexity
O(1)
The implementation above is iterative: it partitions the array in place inside a while loop and tracks only a handful of index variables, so it needs O(1) extra space. A recursive version of the same algorithm would use the call stack instead, with depth O(log n) when the pivots split the range roughly in half and O(n) in the worst case. Note that the in-place partitioning reorders the caller's array.
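
The implementation above always takes the last element as the pivot. A randomized pivot can be added with one extra swap before each partition, as in the minimal sketch below. `find_kth_largest_randomized` is a hypothetical name, the sketch assumes 1 <= k <= len(numbers), and it works on a copy (an assumption made here for safety, at the cost of O(n) space).

```python
import random


def find_kth_largest_randomized(numbers, k):
    values = list(numbers)  # copy so the caller's list is not reordered
    kth_index = len(values) - k  # target index in ascending order
    start_index, end_index = 0, len(values) - 1

    while True:
        # Swap a randomly chosen element into the pivot slot (the last position).
        random_index = random.randint(start_index, end_index)
        values[random_index], values[end_index] = values[end_index], values[random_index]

        # Lomuto partition: elements <= pivot go to the left of partition_index.
        pivot = values[end_index]
        partition_index = start_index
        for i in range(start_index, end_index):
            if values[i] <= pivot:
                values[i], values[partition_index] = values[partition_index], values[i]
                partition_index += 1
        values[partition_index], values[end_index] = values[end_index], values[partition_index]

        if partition_index == kth_index:
            return values[partition_index]
        elif partition_index < kth_index:
            start_index = partition_index + 1  # answer is among the larger elements
        else:
            end_index = partition_index - 1    # answer is among the smaller elements


print(find_kth_largest_randomized([3, 2, 1, 5, 6, 4], 2))  # 5
```

Dropping the copy restores O(1) extra space. Random pivots do not help when most values are equal, because this partition scheme sends every equal element to the same side; a three-way partition is the usual remedy for that case.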

Edge Cases

| Case | How to Handle |
|---|---|
| nums is null or empty | Return null or raise an error (for example, IllegalArgumentException in Java or ValueError in Python), since the kth largest element cannot be found. |
| k is less than or equal to 0, or k is greater than the length of nums | Raise an error, since k is outside the valid range of 1 to the length of nums. |
| nums contains only one element | Return the only element if k is 1; otherwise, handle as an invalid k value. |
| nums contains all identical values | The solution should correctly return the repeated value if k is within the valid range. |
| nums contains duplicate values | The solution should correctly identify the kth largest element, counting each duplicate as its own position in the ranking. |
| nums contains very large or very small integers | The algorithms here only compare and swap values, so no arithmetic on the elements can overflow. In fixed-width languages, choose an element type wide enough for the input range. |
| Large array size | Prefer quickselect (average O(n)) or a min-heap of size k (O(n log k)) over the O(n*k) brute force so that the solution stays within time limits. |
| nums contains negative numbers and zeros | The solution should correctly handle both negative numbers and zeros when determining the kth largest element. |
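
The validation and min-heap cases can be sketched together as follows. This is an illustration under assumptions rather than a reference solution: `find_kth_largest_heap` is a hypothetical name, and raising `ValueError` is one reasonable Python choice for invalid input. It uses the standard-library `heapq` module to keep the k largest values seen so far, so the heap's root is the kth largest.

```python
import heapq


def find_kth_largest_heap(nums, k):
    # Reject None, empty input, and k outside 1..len(nums).
    if not nums or not 1 <= k <= len(nums):
        raise ValueError("k must satisfy 1 <= k <= len(nums)")

    heap = []  # min-heap holding at most k values
    for value in nums:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # Replace the smallest of the current top k with the larger value.
            heapq.heapreplace(heap, value)

    return heap[0]  # smallest of the k largest values


print(find_kth_largest_heap([3, 2, 3, 1, 2, 4, 5, 5, 6], 4))  # 4
```

This runs in O(n log k) time with O(k) extra space and does not modify the input, which makes it a predictable alternative when quickselect's worst case is a concern.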