Find the Substring With Maximum Cost

Medium

Asked by:

4 views

Topics:

ArraysStringsDynamic Programming

You are given a string s, a string chars of distinct characters and an integer array vals of the same length as chars.

The cost of the substring is the sum of the values of each character in the substring. The cost of an empty string is considered 0.

The value of the character is defined in the following way:

If the character is not in the string chars, then its value is its corresponding position (1-indexed) in the alphabet.
- For example, the value of 'a' is 1, the value of 'b' is 2, and so on. The value of 'z' is 26.
Otherwise, assuming i is the index where the character occurs in the string chars, then its value is vals[i].

Return the maximum cost among all substrings of the string s.

Example 1:

Input: s = "adaa", chars = "d", vals = [-1000]
Output: 2
Explanation: The value of the characters "a" and "d" is 1 and -1000 respectively.
The substring with the maximum cost is "aa" and its cost is 1 + 1 = 2.
It can be proven that 2 is the maximum cost.

Example 2:

Input: s = "abc", chars = "abc", vals = [-1,-1,-1]
Output: 0
Explanation: The value of the characters "a", "b" and "c" is -1, -1, and -1 respectively.
The substring with the maximum cost is the empty substring "" and its cost is 0.
It can be proven that 0 is the maximum cost.

Constraints:

1 <= s.length <= 10⁵
s consist of lowercase English letters.
1 <= chars.length <= 26
chars consist of distinct lowercase English letters.
vals.length == chars.length
-1000 <= vals[i] <= 1000

Try coding it in LeetCode

Solution

Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

What are the possible characters in the string `s` and the string `chars`?
What is the range for the values in the `values` array? Can they be negative, zero, or positive?
Is an empty string `s` a valid input, and if so, what should the return value be?
If there are multiple substrings with the same maximum cost, should I return any one of them, or is there a specific criterion for choosing between them (e.g., the shortest, the first encountered)?
What should the return value be if the cost of all possible substrings is negative?

Brute Force Solution

Approach

The goal is to find a small piece within a larger piece that gives us the biggest value, where each letter has an assigned value. The brute force approach just tries every possible small piece within the bigger piece and calculates its value to find the maximum.

Here's how the algorithm would work step-by-step:

Start by considering the very first letter of the large piece as the beginning of our small piece.
Now, look at small pieces starting with this first letter, increasing in size one letter at a time.
Calculate the value of each of these small pieces by adding up the value of each letter.
Remember the highest value you have found so far.
Move on to the second letter of the large piece and repeat the process, considering all small pieces that start with this second letter.
Continue doing this for every letter in the large piece, each time calculating the value of all possible small pieces that start from that letter.
After checking all possible small pieces, the highest value you remembered is the answer.

Code Implementation

def find_max_cost_substring_brute_force(input_string, character_costs):
    max_cost = float('-inf')

    # Iterate through all possible starting positions of substrings
    for start_index in range(len(input_string)): 

        current_substring_cost = 0
        # Iterate through all possible ending positions for substrings 
        for end_index in range(start_index, len(input_string)): 
            current_character = input_string[end_index]
            current_substring_cost += character_costs[current_character]

            # Update maximum cost if current substring has higher cost.
            if current_substring_cost > max_cost:
                max_cost = current_substring_cost

    # Handle the case where all substring costs are negative.
    if max_cost == float('-inf'):
        return 0

    return max_cost

Big(O) Analysis

Time Complexity

O(n²) – The algorithm iterates through each character of the input string of length 'n' as a potential starting point for a substring. For each starting position, it iterates through all possible ending positions to form substrings. In the worst case, this involves nested loops where the outer loop runs 'n' times, and the inner loop runs up to 'n' times for each outer loop iteration, calculating the substring cost. The total number of operations approximates n * (n+1)/2, which simplifies to O(n²). The time complexity is thus quadratic.

Space Complexity

O(1) – The provided algorithm iterates through all possible substrings, calculating their cost and updating a maximum cost variable. The dominant space usage is for storing the maximum cost found so far, the current substring cost being calculated, and loop indices. These variables occupy constant space, irrespective of the length (N) of the input string. Therefore, the auxiliary space complexity is O(1).

Optimal Solution

Approach

The problem asks to find a substring with the largest possible cost. We can solve this efficiently by keeping track of the current cost as we move through the string, resetting if it drops below zero, as a negative cost substring won't contribute to a larger overall cost.

Here's how the algorithm would work step-by-step:

Start at the beginning of the main string.
Calculate the cost for each character of the string based on the given letter values.
As you go from character to character, keep a running total of the substring cost.
If at any point the running cost becomes negative, reset it to zero and start a new substring.
Also, keep track of the largest running cost you've seen so far. This is the maximum cost you've found.
At the end, the largest running cost will be the cost of the substring with the maximum cost.

Code Implementation

def find_max_cost_substring(main_string, letter_values):
    maximum_cost = 0
    current_cost = 0

    for character in main_string:
        character_index = ord(character) - ord('a')
        character_cost = letter_values[character_index]

        current_cost += character_cost

        # If the current cost is negative, reset.
        if current_cost < 0:
            current_cost = 0

        # Keep track of the maximum cost found so far.
        if current_cost > maximum_cost:
            maximum_cost = current_cost

    return maximum_cost

Big(O) Analysis

Time Complexity

O(n) – The algorithm iterates through the main string once. Within the loop, the cost of each character is calculated and added to a running total. The running total is compared against zero and a maximum cost, but these comparisons take constant time. Therefore, the time complexity is directly proportional to the length of the input string (n), making it O(n).

Space Complexity

O(1) – The algorithm only uses a few variables to store the current cost and the maximum cost encountered so far. These variables consume a constant amount of memory regardless of the input string's length, which we can denote as N. No auxiliary data structures like arrays or hash maps are used that scale with the size of the input. Therefore, the space complexity is constant.

Edge Cases

Case	How to Handle
Empty string as input for 's'	Return an empty string immediately as no substring can be formed.
Empty string as input for 'chars' or 'values'	Return an empty string immediately since no costs can be calculated.
chars and values have different lengths	Throw an exception because the mapping of characters to values is inconsistent.
s contains characters not present in 'chars'	Treat characters not in 'chars' as having a cost of 0 to avoid errors.
All characters in s have negative costs, resulting in a maximum cost of less than 0.	Return an empty string in this case to indicate that no positive-cost substring exists or 0 if only positive or 0 substring cost is allowed.
Integer overflow when calculating cumulative costs	Use a larger data type (e.g., long) to store the cumulative cost to prevent overflow.
Maximum length string s with all positive cost characters, which could impact performance.	The solution should have at worst O(n) time complexity to efficiently handle large input strings.
Multiple substrings with the same maximum cost.	Return the first substring encountered that achieves the maximum cost, or define a tie-breaking mechanism (e.g., shortest substring).

Empty string as input for 's'

How to Handle:

Return an empty string immediately as no substring can be formed.

Empty string as input for 'chars' or 'values'

How to Handle:

Return an empty string immediately since no costs can be calculated.

chars and values have different lengths

How to Handle:

Throw an exception because the mapping of characters to values is inconsistent.

s contains characters not present in 'chars'

How to Handle:

Treat characters not in 'chars' as having a cost of 0 to avoid errors.

All characters in s have negative costs, resulting in a maximum cost of less than 0.

How to Handle:

Return an empty string in this case to indicate that no positive-cost substring exists or 0 if only positive or 0 substring cost is allowed.

Integer overflow when calculating cumulative costs

How to Handle:

Use a larger data type (e.g., long) to store the cumulative cost to prevent overflow.

Maximum length string s with all positive cost characters, which could impact performance.

How to Handle:

The solution should have at worst O(n) time complexity to efficiently handle large input strings.

Multiple substrings with the same maximum cost.

How to Handle:

Return the first substring encountered that achieves the maximum cost, or define a tie-breaking mechanism (e.g., shortest substring).