Find Subsequence of Length K With the Largest Sum

Easy
Amazon
Topics: Arrays, Greedy Algorithms

You are given an integer array nums and an integer k. You want to find a subsequence of nums of length k that has the largest sum.

Return any such subsequence as an integer array of length k.

A subsequence is an array that can be derived from another array by deleting some or no elements without changing the order of the remaining elements.

Example 1:

Input: nums = [2,1,3,3], k = 2
Output: [3,3]
Explanation:
The subsequence has the largest sum of 3 + 3 = 6.

Example 2:

Input: nums = [-1,-2,3,4], k = 3
Output: [-1,3,4]
Explanation: 
The subsequence has the largest sum of -1 + 3 + 4 = 6.

Example 3:

Input: nums = [3,4,3,3], k = 2
Output: [3,4]
Explanation:
The subsequence has the largest sum of 3 + 4 = 7. 
Another possible subsequence is [4, 3].
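
Because several answers can be correct, as in Example 3, a fixed expected output is not enough to test a solution. The following is a minimal sketch of a checker; the names is_subsequence and is_valid_answer are hypothetical helpers, not part of the original problem.

```python
def is_subsequence(candidate, nums):
    """Return True if candidate can be derived from nums by deleting elements."""
    remaining = iter(nums)
    # `x in remaining` consumes the iterator up to the match,
    # so later elements of candidate must appear later in nums.
    return all(x in remaining for x in candidate)


def is_valid_answer(nums, k, candidate):
    """Check length, order preservation, and that the sum is the best possible."""
    best_sum = sum(sorted(nums, reverse=True)[:k])
    return (
        len(candidate) == k
        and is_subsequence(candidate, nums)
        and sum(candidate) == best_sum
    )
```

With this checker, both [3, 4] and [4, 3] are accepted for Example 3, while [3, 3] is rejected because its sum is only 6.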

Constraints:

  • 1 <= nums.length <= 1000
  • -10^5 <= nums[i] <= 10^5
  • 1 <= k <= nums.length

Can you write a function that efficiently finds a subsequence of length k with the largest sum?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the size of the input array, and the value of K?
  2. Can the input array contain negative numbers, zeros, or floating-point numbers?
  3. If there are multiple subsequences with the same largest sum, can I return any one of them?
  4. Should the subsequence returned be sorted, or can it be in any order?
  5. Is K guaranteed to be a valid length, i.e., is it always greater than 0 and less than or equal to the length of the input array?

Brute Force Solution

Approach

The brute force approach to finding the subsequence with the largest sum involves checking every possible combination of numbers from the original sequence. We create every possible group of the required size. Then, we compare the sums of all these groups to find the largest.

Here's how the algorithm would work step-by-step:

  1. First, list all possible groups of numbers you can pick from the original list, making sure each group has the exact number of elements we need.
  2. Calculate the sum of the numbers in each of these groups.
  3. Compare the sums of all the groups.
  4. The group with the highest sum is the subsequence we're looking for.

Code Implementation

def find_largest_sum_subsequence_brute_force(numbers, subsequence_length):
    all_subsequences = []
    largest_sum = float('-inf')
    largest_sum_subsequence = []

    # Generate all possible subsequences of the specified length
    def generate_subsequences(current_subsequence, start_index):
        if len(current_subsequence) == subsequence_length:
            all_subsequences.append(current_subsequence[:])
            return
        
        for i in range(start_index, len(numbers)):

            # Choose element for subsequence
            current_subsequence.append(numbers[i])
            generate_subsequences(current_subsequence, i + 1)

            # Backtrack to explore other possibilities
            current_subsequence.pop()
    
    generate_subsequences([], 0)
    
    # Iterate through each subsequence to find the one with the largest sum
    for subsequence in all_subsequences:
        current_sum = sum(subsequence)

        # Update the largest sum if necessary
        if current_sum > largest_sum:

            # Found a new largest sum subsequence
            largest_sum = current_sum
            largest_sum_subsequence = subsequence

    return largest_sum_subsequence
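
The same enumeration can be written more compactly with the standard library. This is a sketch, assuming 1 <= k <= len(nums) as the constraints state; the function name is hypothetical. It keeps only the best candidate seen so far instead of storing every subsequence.

```python
from itertools import combinations


def largest_sum_subsequence_combinations(nums, k):
    # combinations() picks elements in index order, so every tuple it
    # yields is a valid subsequence of nums.
    # max() with key=sum keeps only the best tuple in memory at a time.
    return list(max(combinations(nums, k), key=sum))
```

This is still exponential in the worst case, so it is useful only as a reference implementation for small inputs.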

Big(O) Analysis

Time Complexity
O(C(n, k) * k)
The solution generates every subsequence of length k from an array of size n. There are C(n, k) of them, where C(n, k) is the binomial coefficient "n choose k". Copying each subsequence and computing its sum takes O(k) time, so the total is O(C(n, k) * k). The factor of k cannot be dropped in general: it is bounded by n, but it is not a constant.
Space Complexity
O(C(n, k) * k)
The brute force approach stores all C(n, k) subsequences before comparing them, and each stored subsequence holds k elements, so the auxiliary space is O(C(n, k) * k). The recursion stack adds only O(k). Tracking just the best subsequence seen so far, instead of storing them all, would reduce the auxiliary space to O(k).
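
To get a feel for how quickly the number of candidates grows, the short sketch below evaluates the binomial coefficient for a few sizes. The helper name candidate_count is hypothetical; math.comb is available from Python 3.8.

```python
from math import comb


def candidate_count(n, k):
    """Number of length-k subsequences (by index) of an array of size n."""
    return comb(n, k)


for n, k in [(10, 5), (20, 10), (30, 15), (1000, 2)]:
    print(f"C({n}, {k}) = {candidate_count(n, k)}")
# C(10, 5) = 252
# C(20, 10) = 184756
# C(30, 15) = 155117520
# C(1000, 2) = 499500
```

The count stays manageable when k is very small, but it explodes when k is near n / 2, which the constraints (n up to 1000) allow.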

Optimal Solution

Approach

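The key observation is that the largest possible sum always comes from the `k` largest numbers in the list; the only remaining requirement is to return them in their original order. We accomplish this by focusing on the *positions* of those top numbers and then reconstructing the subsequence from those positions. The positions can be found by sorting all indices by value, as in the code below, or with a heap of size `k` to avoid a full sort.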

Here's how the algorithm would work step-by-step:

  1. First, find the positions of the `k` largest numbers in the given list.
  2. Next, sort those positions so they're in the original order they appeared in the list.
  3. Finally, create a new list containing only the numbers at those sorted positions. This new list is the subsequence with the largest possible sum.

Code Implementation

def find_subsequence(number_list, subsequence_length):
    number_list_length = len(number_list)
    if subsequence_length > number_list_length:
        return []

    # Find indices of k largest elements.
    largest_indices = sorted(range(number_list_length),
                             key=lambda i: number_list[i], reverse=True)[:subsequence_length]

    # Sort indices to maintain original order.
    largest_indices.sort()

    # Construct the subsequence using the sorted indices.
    result_subsequence = [number_list[i] for i in largest_indices]

    return result_subsequence

Big(O) Analysis

Time Complexity
O(n log n) as implemented, O(n log k) with a min-heap
The implementation above sorts all n indices by value, which costs O(n log n). Taking the first k of them, sorting those k indices in O(k log k), and building the result in O(k) are all dominated by that sort. The selection step can be lowered to O(n log k) by using a min-heap of size k: each of the n elements is compared with the smallest element in the heap and replaces it if larger, which takes O(log k) per operation. Sorting the k selected indices still takes O(k log k), and since k <= n, the heap variant runs in O(n log k) overall.
Space Complexity
O(n) as implemented, O(k) with a min-heap
The implementation above builds a sorted list of all n indices, which requires O(n) auxiliary space, before slicing it down to k entries. A heap-based variant stores only k (value, index) pairs, so its auxiliary space is O(k). In both cases, sorting the k selected indices needs at most O(k) extra space, and the returned list holds k elements, so neither affects the overall bound.
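
The min-heap selection described in the time-complexity analysis can be sketched as follows. This is one possible version, not necessarily the intended one; the function name find_subsequence_heap is hypothetical, and the guard for k <= 0 is an added assumption.

```python
import heapq


def find_subsequence_heap(nums, k):
    if k <= 0:
        return []

    # Min-heap of (value, index) pairs holding the k largest values seen so far.
    heap = []
    for index, value in enumerate(nums):
        if len(heap) < k:
            heapq.heappush(heap, (value, index))
        elif value > heap[0][0]:
            # The new value beats the smallest kept value, so swap it in.
            heapq.heapreplace(heap, (value, index))

    # Sort the kept indices to restore the original order.
    kept_indices = sorted(index for _, index in heap)
    return [nums[i] for i in kept_indices]
```

Using a strict comparison (value > heap[0][0]) means equal values never evict each other, so ties resolve in favor of earlier elements. Any tie-breaking rule is acceptable here because the problem allows returning any valid subsequence.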

Edge Cases

Case: How to Handle

  • Null or empty input array: Return an empty list or throw an IllegalArgumentException, as there is no input to process.
  • k is zero or negative: Return an empty list (a subsequence of length zero has a sum of zero) or throw an IllegalArgumentException.
  • k is greater than the length of the input array: No subsequence of length k exists. Throw an IllegalArgumentException, return an empty list, or fall back to returning the entire input array if a shorter result is acceptable to the caller.
  • Input array contains all negative numbers and k is equal to the array length: Return the entire array, as it is the only possible subsequence of length k.
  • Input array contains all same numbers: Any k elements form a valid answer; returning the first k elements is the simplest choice.
  • Potential integer overflow with large values and large k: Under the stated constraints the sum has magnitude at most 1000 * 10^5 = 10^8, which fits in a 32-bit integer. For larger inputs, use a wider type such as long for intermediate sums.
  • Input array contains a mix of very large positive and very large negative numbers: Standard sorting and summing work correctly, as long as overflow is handled as described above.
  • k = 1 (finding the single largest element): Return a list containing only the maximum element of the array.
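
Several of the cases above come down to validating the inputs before running the main algorithm. The sketch below shows one way to do that in Python; it assumes the "throw" option is chosen, uses ValueError as the Python counterpart of IllegalArgumentException, and the function name validate_inputs is hypothetical.

```python
def validate_inputs(nums, k):
    """Raise ValueError if nums and k do not satisfy the problem constraints."""
    if nums is None or len(nums) == 0:
        raise ValueError("nums must be a non-empty list")
    if not 1 <= k <= len(nums):
        raise ValueError("k must satisfy 1 <= k <= len(nums)")
```

Calling this at the start of a solution keeps the selection logic free of special cases. Returning an empty list instead of raising is an equally reasonable choice if the interviewer prefers it.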