Kth Largest Element in an Array

Medium
Meta
Topics:
Arrays, Greedy Algorithms

Given an integer array nums and an integer k, return the kth largest element in the array. Note that it is the kth largest element in the sorted order, not the kth distinct element.

For example:

  • nums = [3,2,1,5,6,4], k = 2 should return 5
  • nums = [3,2,3,1,2,4,5,5,6], k = 4 should return 4
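
The difference between "kth largest in sorted order" and "kth distinct" shows up in the second example. The following is a minimal sketch for illustration only; both helper names are hypothetical, and sorting is used just to keep the comparison short.

```python
def kth_largest_with_duplicates(nums, k):
    # Duplicates keep their own positions in the sorted order.
    return sorted(nums, reverse=True)[k - 1]


def kth_largest_distinct(nums, k):
    # Shown only for contrast: this is NOT what the problem asks for.
    return sorted(set(nums), reverse=True)[k - 1]


nums = [3, 2, 3, 1, 2, 4, 5, 5, 6]
print(kth_largest_with_duplicates(nums, 4))  # 4
print(kth_largest_distinct(nums, 4))         # 3
```

Sorted in descending order the array is [6, 5, 5, 4, 3, 3, 2, 2, 1], so the 4th entry is 4, while the 4th distinct value is 3.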

Can you solve it without sorting?

Constraints:

  • 1 <= k <= nums.length <= 10^5
  • -10^4 <= nums[i] <= 10^4

Explain the time and space complexity of your solution. Consider edge cases like k=1, k=nums.length, and duplicate elements.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the size of the input array `nums` and the value of `k`?
  2. Can the input array `nums` contain negative numbers, zero, or floating-point numbers?
  3. Are there any duplicate numbers in the input array `nums`? If so, how should they be handled?
  4. Is `k` guaranteed to be a valid value (i.e., 1 <= k <= nums.length)?
  5. What should I return if the input array `nums` is empty or null?

Brute Force Solution

Approach

To find the Kth largest element, we can sort all the elements from largest to smallest. Then we just pick the element that's in the Kth position. This method doesn't require any clever tricks, just sorting.

Here's how the algorithm would work step-by-step:

  1. First, sort all the numbers in the given array from largest to smallest.
  2. Then, count down to the Kth number in that ordered list.
  3. The number you land on is the Kth largest.

Code Implementation

def find_kth_largest(numbers, k_value): 
    # Sort the array in descending order.
    sorted_numbers = sorted(numbers, reverse=True)

    # k_value is 1-indexed; convert it to a 0-based index.
    kth_largest_index = k_value - 1

    # Access the element at the adjusted index.
    return sorted_numbers[kth_largest_index]

Big(O) Analysis

Time Complexity
O(n log n)
The described approach relies on sorting the input array of size n from largest to smallest. Common efficient sorting algorithms, such as merge sort or quicksort, have a time complexity of O(n log n). After the array is sorted, accessing the Kth element takes constant time, O(1). Therefore, the dominant operation is the sorting step, making the overall time complexity O(n log n).
Space Complexity
O(n)
The implementation calls sorted(), which builds a new list of n elements instead of sorting the input in place, and the sort itself may use additional working space. The space complexity is therefore proportional to the input size, O(n).

Optimal Solution

Approach

To efficiently find the kth largest element, we use a process similar to sorting, but we only focus on the part of the data that helps us find the answer. We repeatedly divide the data into smaller portions, eliminating the portions that cannot possibly contain the kth largest element.

Here's how the algorithm would work step-by-step:

  1. Pick an element from the data. This element will act as a dividing point.
  2. Arrange the data so that all elements larger than or equal to the dividing point come before it, and all smaller elements come after it.
  3. Figure out the position of the dividing point element. If the position is exactly what we're looking for (the kth largest position), we're done!
  4. If the dividing point lands after the kth position, repeat the process only on the portion before it, which holds the larger elements.
  5. If the dividing point lands before the kth position, repeat the process only on the portion after it, which holds the smaller elements.
  6. Keep repeating this division and elimination process until the dividing point element is at the kth largest position.

Code Implementation

def find_kth_largest(nums, k):
    def partition(numbers, left_index, right_index):
        pivot_element = numbers[right_index]
        i = left_index

        for j in range(left_index, right_index):
            if numbers[j] >= pivot_element:
                numbers[i], numbers[j] = numbers[j], numbers[i]
                i += 1

        numbers[i], numbers[right_index] = numbers[right_index], numbers[i]
        return i

    def quickselect(numbers, left_index, right_index, k_index):
        # Only one element, return it.
        if left_index == right_index:
            return numbers[left_index]

        pivot_index = partition(numbers, left_index, right_index)

        # Found the kth largest element.
        if k_index == pivot_index:
            return numbers[k_index]

        # Search left subarray.
        elif k_index < pivot_index:
            return quickselect(numbers, left_index, pivot_index - 1, k_index)

        # Search right subarray.
        else:
            return quickselect(numbers, pivot_index + 1, right_index, k_index)

    # partition() arranges elements in descending order, so the kth largest
    # element belongs at 0-based index k - 1. Note that nums is reordered in place.
    return quickselect(nums, 0, len(nums) - 1, k - 1)

Big(O) Analysis

Time Complexity
O(n) on average, O(n^2) in the worst case
Each partition pass scans the current subarray once, and only one side is searched afterwards. When the pivot splits the subarray roughly in half, the total work is about n + n/2 + n/4 + ..., which is less than 2n, so the average time is O(n). If the pivot is consistently the smallest or largest remaining element, each pass eliminates only one element and the time degrades to O(n^2). The implementation above always uses the last element as the pivot, so already-sorted input triggers this worst case; choosing the pivot at random makes it unlikely when the values are mostly distinct.
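
The randomized pivot selection mentioned above is not part of the implementation shown earlier. One way it might look is sketched below; the function name find_kth_largest_random_pivot is hypothetical, and working on a copy of the input is a choice made for this sketch (it costs O(n) extra space).

```python
import random


def find_kth_largest_random_pivot(nums, k):
    # Work on a copy so the caller's list is not reordered (sketch-only choice).
    numbers = list(nums)

    def partition(left, right):
        # Pick a random pivot, move it to the end, then partition the range
        # in descending order around it.
        pivot_index = random.randint(left, right)
        numbers[pivot_index], numbers[right] = numbers[right], numbers[pivot_index]
        pivot = numbers[right]
        store = left
        for j in range(left, right):
            if numbers[j] >= pivot:
                numbers[store], numbers[j] = numbers[j], numbers[store]
                store += 1
        numbers[store], numbers[right] = numbers[right], numbers[store]
        return store

    def quickselect(left, right, target):
        if left == right:
            return numbers[left]
        p = partition(left, right)
        if target == p:
            return numbers[p]
        if target < p:
            return quickselect(left, p - 1, target)
        return quickselect(p + 1, right, target)

    # In descending order the kth largest sits at 0-based index k - 1.
    return quickselect(0, len(numbers) - 1, k - 1)
```

A random pivot does not help when most values are equal, because this two-way partition still sends every equal element to the same side.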
Space Complexity
O(log n) on average, O(n) in the worst case
The partitioning is done in place, so the only extra space is the recursion stack. The recursion depth is O(log n) on average because the search range shrinks by roughly half at each step, but with unbalanced partitions it can reach O(n). An iterative implementation needs only a few variables and uses O(1) extra space.
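
The iterative variant mentioned above can be sketched by replacing the recursion with a loop that narrows the search range. This is an illustrative sketch rather than the reference solution; the name find_kth_largest_iterative is hypothetical, and like the recursive version it reorders nums in place.

```python
def find_kth_largest_iterative(nums, k):
    left, right = 0, len(nums) - 1
    target = k - 1  # 0-based position of the kth largest in descending order

    while left < right:
        # Partition nums[left..right] in descending order around nums[right].
        pivot = nums[right]
        store = left
        for j in range(left, right):
            if nums[j] >= pivot:
                nums[store], nums[j] = nums[j], nums[store]
                store += 1
        nums[store], nums[right] = nums[right], nums[store]

        if store == target:
            return nums[store]
        if target < store:
            right = store - 1  # keep the side holding the larger elements
        else:
            left = store + 1   # keep the side holding the smaller elements

    # The range has narrowed to a single element, which is the answer.
    return nums[left]
```

Because there is no recursion, the extra space is limited to a handful of index variables.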

Edge Cases

Case | How to Handle
Empty array | Return null or throw an exception as there is no kth largest element in an empty array.
k is less than or equal to 0 | Throw an exception or return null as k must be a positive integer to represent the kth largest.
k is greater than the array length | Throw an exception or return null as the kth largest element cannot exist if k exceeds the array size.
Array contains duplicate numbers | The sorting and quickselect solutions handle duplicates correctly as they consider the order, not distinctness.
Array contains negative numbers | The comparison operations in the sorting and quickselect approaches work correctly with negative numbers.
Array contains all identical numbers | The sorting and quickselect solutions will still identify the correct kth largest element (which will be the repeated value).
Large array size approaching memory limits | Consider using an in-place partitioning algorithm like quickselect to minimize memory usage.
Integer overflow during calculations (if applicable) | Use appropriate data types (e.g., long) to prevent integer overflow during calculations, if any are performed.
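
The first three rows can be handled with a small guard before any selection logic runs. The sketch below assumes that raising ValueError is the desired behavior (the table also allows returning null); the function name find_kth_largest_checked is hypothetical, and sorting is used only to keep the example short.

```python
def find_kth_largest_checked(nums, k):
    # Reject inputs for which no kth largest element exists.
    if not nums:
        raise ValueError("nums must be a non-empty list")
    if not 1 <= k <= len(nums):
        raise ValueError("k must satisfy 1 <= k <= len(nums)")
    # Any of the selection strategies could be used here.
    return sorted(nums, reverse=True)[k - 1]
```

The same guard can be placed in front of the quickselect version without changing its logic.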