Minimum ASCII Delete Sum for Two Strings

Medium
Google
Topics: Strings, Dynamic Programming

Given two strings s1 and s2, return the lowest ASCII sum of deleted characters to make two strings equal.

Example 1:

Input: s1 = "sea", s2 = "eat"
Output: 231
Explanation: Deleting "s" from "sea" adds the ASCII value of "s" (115) to the sum.
Deleting "t" from "eat" adds 116 to the sum.
At the end, both strings are equal, and 115 + 116 = 231 is the minimum sum possible to achieve this.

Example 2:

Input: s1 = "delete", s2 = "leet"
Output: 403
Explanation: Deleting "dee" from "delete" to turn the string into "let",
adds 100[d] + 101[e] + 101[e] to the sum.
Deleting "e" from "leet" adds 101[e] to the sum.
At the end, both strings are equal to "let", and the answer is 100+101+101+101 = 403.
If instead we turned both strings into "lee" or "eet", we would get answers of 433 or 417, which are higher.
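
The totals in both examples can be double-checked with a small sketch. Deleting characters until both strings equal some common subsequence costs the ASCII sum of both strings minus twice the ASCII sum of that subsequence. The helper names below (`ascii_sum`, `is_subsequence`, `cost_to_reach`) are illustrative assumptions, not part of the problem statement.

```python
def ascii_sum(text: str) -> int:
    """Sum of the code points of every character in text."""
    return sum(ord(ch) for ch in text)


def is_subsequence(target: str, source: str) -> bool:
    """True if target can be obtained from source by deletions only."""
    remaining = iter(source)
    # `ch in remaining` consumes the iterator up to the first match,
    # so the characters of target must appear in order.
    return all(ch in remaining for ch in target)


def cost_to_reach(s1: str, s2: str, target: str) -> int:
    """Deletion cost of reducing both s1 and s2 to target."""
    if not (is_subsequence(target, s1) and is_subsequence(target, s2)):
        raise ValueError("target must be a subsequence of both strings")
    return ascii_sum(s1) + ascii_sum(s2) - 2 * ascii_sum(target)


print(cost_to_reach("sea", "eat", "ea"))       # 231
print(cost_to_reach("delete", "leet", "let"))  # 403
print(cost_to_reach("delete", "leet", "lee"))  # 433
print(cost_to_reach("delete", "leet", "eet"))  # 417
```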

Constraints:

  1. 1 <= s1.length, s2.length <= 1000
  2. s1 and s2 consist of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the input strings `s1` and `s2` be empty or null?
  2. What is the maximum length of the input strings `s1` and `s2`? Are there any length constraints I should consider?
  3. Are the characters in the strings guaranteed to be ASCII, or should I expect Unicode characters whose code points exceed the ASCII range?
  4. If the strings are identical, should I return 0?
  5. If the strings share no characters, should I return the sum of ASCII values of all characters in both strings? (Deleting everything always makes both strings equal to the empty string, so a solution always exists.)

Brute Force Solution

Approach

We need to find the smallest sum of character values to delete from two strings so that they become identical. The brute-force way is to consider all possible deletions from both strings, compare the results, and pick the option with the lowest sum of deleted character values.

Here's how the algorithm would work step-by-step:

  1. Think about all the ways you could remove characters from the first string.
  2. For each of those possibilities, consider all the ways you could remove characters from the second string.
  3. After deleting characters from both strings, check if the resulting strings are the same.
  4. If they are the same, calculate the total value of the characters you deleted from both original strings.
  5. Keep track of the smallest total value you've found so far.
  6. Repeat this process for every possible combination of deletions you can make from both strings.
  7. The smallest total value you kept track of is the answer.

Code Implementation

def minimum_ascii_delete_sum_brute_force(string1, string2):
    minimum_delete_sum = float('inf')

    def calculate_delete_sum(deleted_characters):
        delete_sum = 0
        for character in deleted_characters:
            delete_sum += ord(character)
        return delete_sum

    def find_minimum_delete_sum(index_string1, index_string2, current_string1, current_string2, deleted_characters):
        nonlocal minimum_delete_sum

        # If we've reached the end of both strings
        if index_string1 == len(string1) and index_string2 == len(string2):
            if current_string1 == current_string2:
                delete_sum = calculate_delete_sum(deleted_characters)
                minimum_delete_sum = min(minimum_delete_sum, delete_sum)
            return

        # If we've reached the end of the first string
        if index_string1 == len(string1):
            remaining_characters = string2[index_string2:]
            find_minimum_delete_sum(index_string1, len(string2), current_string1, current_string2, deleted_characters + list(remaining_characters))
            return

        # If we've reached the end of the second string
        if index_string2 == len(string2):
            remaining_characters = string1[index_string1:]
            find_minimum_delete_sum(len(string1), index_string2, current_string1, current_string2, deleted_characters + list(remaining_characters))
            return

        # Option 1: Delete a character from string1
        find_minimum_delete_sum(index_string1 + 1, index_string2, current_string1, current_string2, deleted_characters + [string1[index_string1]])

        # Option 2: Delete a character from string2
        find_minimum_delete_sum(index_string1, index_string2 + 1, current_string1, current_string2, deleted_characters + [string2[index_string2]])

        # Option 3: Keep both characters. A mismatched pair makes the two
        # partial strings differ, so that path is rejected by the equality
        # check once both inputs are exhausted.
        find_minimum_delete_sum(index_string1 + 1, index_string2 + 1, current_string1 + string1[index_string1], current_string2 + string2[index_string2], deleted_characters)

    find_minimum_delete_sum(0, 0, '', '', [])
    return minimum_delete_sum

Big(O) Analysis

Time Complexity
O(2^(m+n) * (m+n))
The approach considers every subset of deletions from both strings. String s1 of length m has 2^m possible subsets, and string s2 of length n has 2^n, giving 2^m * 2^n = 2^(m+n) combinations. Comparing the two resulting strings and summing the deleted characters takes O(m+n) time per combination, not constant time. The recursive implementation above is looser still: each call branches up to three ways and advances at least one index, so it explores at most 3^(m+n) paths. Either way, the running time is exponential.
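
To make the subset count concrete, here is a minimal sketch that enumerates subsets directly with `itertools.combinations`. It is a different formulation from the recursive code above, and the function names are hypothetical. It relies on the observation that the cost of reaching a given subsequence is the string's total ASCII sum minus the sum of the kept characters, so each string can be enumerated independently.

```python
from itertools import combinations


def subsequence_costs(text: str) -> dict:
    """Map each subsequence of text to the ASCII sum deleted to reach it."""
    total = sum(ord(ch) for ch in text)
    costs = {}
    for keep_count in range(len(text) + 1):
        # Every choice of kept indices is one of the 2^len(text) subsets.
        for kept_indices in combinations(range(len(text)), keep_count):
            kept = "".join(text[i] for i in kept_indices)
            costs[kept] = total - sum(ord(ch) for ch in kept)
    return costs


def brute_force_by_subsets(s1: str, s2: str) -> int:
    """Try every common subsequence and return the cheapest total cost."""
    costs1 = subsequence_costs(s1)
    costs2 = subsequence_costs(s2)
    # The empty string is always common, so the intersection is never empty.
    common = costs1.keys() & costs2.keys()
    return min(costs1[c] + costs2[c] for c in common)


print(brute_force_by_subsets("sea", "eat"))      # 231
print(brute_force_by_subsets("delete", "leet"))  # 403
```

This is only practical for very short inputs; at the constraint limit of 1000 characters the number of subsets is far too large to enumerate.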
Space Complexity
O((m+n)^2)
The recursion advances at least one index per call, so the call stack is at most m + n frames deep; the space used is not constant. As written, each call also builds fresh copies of the partial strings and the deleted-character list, so the frames on the stack can together hold on the order of (m+n)^2 characters in the worst case. Passing a running sum instead of the deleted characters themselves would reduce the list overhead to O(m+n).

Optimal Solution

Approach

The best way to solve this problem is dynamic programming: we build up the solution gradually by solving smaller subproblems (pairs of prefixes) and combining their answers, so no subproblem is computed twice. This avoids checking every possible set of deletions.

Here's how the algorithm would work step-by-step:

  1. Imagine a table where rows correspond to prefixes of the first string and columns to prefixes of the second. Each cell holds the minimum deletion cost to make those two prefixes equal.
  2. Start by filling the first row and column. This represents deleting all characters from one string to match an empty string.
  3. For each cell in the table, consider two possibilities: either the last characters of the two prefixes match, or they don't.
  4. If the last characters match, the cost is the same as the cost for the cell diagonally above and to the left.
  5. If the characters don't match, consider two options: delete the character from the first string, or delete the character from the second string. Choose the option with the minimum cost, and add the ASCII value of the deleted character to the cost from the corresponding cell.
  6. Continue filling the table until you reach the bottom right corner. This cell will contain the minimum ASCII delete sum for the two strings.
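
The steps above can be summarized as a recurrence. The notation is an assumption introduced here for illustration: dp[i][j] is the minimum cost to make the first i characters of s1 and the first j characters of s2 equal, and ord(c) is the ASCII value of character c.

```latex
\[
dp[i][j] =
\begin{cases}
0 & i = 0,\ j = 0 \\
dp[i-1][0] + \operatorname{ord}(s_1[i-1]) & i > 0,\ j = 0 \\
dp[0][j-1] + \operatorname{ord}(s_2[j-1]) & i = 0,\ j > 0 \\
dp[i-1][j-1] & i, j > 0,\ s_1[i-1] = s_2[j-1] \\
\min\bigl(dp[i-1][j] + \operatorname{ord}(s_1[i-1]),\ dp[i][j-1] + \operatorname{ord}(s_2[j-1])\bigr) & \text{otherwise}
\end{cases}
\]
```

The answer is dp[m][n], where m and n are the lengths of s1 and s2. For s1 = "sea" and s2 = "eat", filling the table this way gives dp[3][3] = 231, matching Example 1.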

Code Implementation

def minimum_delete_sum(string1, string2):
    string1_length = len(string1)
    string2_length = len(string2)

    # dp_table[i][j] = minimum delete sum for string1[:i] and string2[:j]
    dp_table = [[0 for _ in range(string2_length + 1)] for _ in range(string1_length + 1)]

    # Fill first column: string2 prefix is empty, so delete all of string1[:i]
    for string1_index in range(1, string1_length + 1):
        dp_table[string1_index][0] = dp_table[string1_index - 1][0] + ord(string1[string1_index - 1])

    # Fill first row: string1 prefix is empty, so delete all of string2[:j]
    for string2_index in range(1, string2_length + 1):
        dp_table[0][string2_index] = dp_table[0][string2_index - 1] + ord(string2[string2_index - 1])

    for string1_index in range(1, string1_length + 1):
        for string2_index in range(1, string2_length + 1):
            # If chars match, no deletion needed
            if string1[string1_index - 1] == string2[string2_index - 1]:
                dp_table[string1_index][string2_index] = dp_table[string1_index - 1][string2_index - 1]
            else:
                # Choose min cost of deleting from string1 or string2
                dp_table[string1_index][string2_index] = min(
                    dp_table[string1_index - 1][string2_index] + ord(string1[string1_index - 1]),
                    dp_table[string1_index][string2_index - 1] + ord(string2[string2_index - 1])
                )

    # Final value is in the bottom right
    return dp_table[string1_length][string2_length]

Big(O) Analysis

Time Complexity
O(m*n)
The algorithm constructs a table (dp) of size (m+1)x(n+1), where m and n are the lengths of the two input strings s1 and s2, respectively. The algorithm iterates through each cell of the table. Filling each cell involves a constant amount of operations: checking for equality of characters, or comparing two values to find the minimum. Therefore, the time complexity is determined by the number of cells in the table, which is proportional to m * n. Hence, the overall time complexity is O(m*n).
Space Complexity
O(m*n)
The algorithm stores intermediate results in a 2D table with m+1 rows and n+1 columns, where m and n are the lengths of s1 and s2. The space required for this table is proportional to the product of the two lengths, and the table is the only significant auxiliary storage. Hence, the space complexity is O(m*n).
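
If the full table is a concern at the maximum input length, one common alternative keeps only two rows at a time, since each cell depends only on the row above and the cell to its left. The sketch below is not part of the solution above; the function name is hypothetical, and it uses O(n) auxiliary space while keeping the O(m*n) running time.

```python
def minimum_delete_sum_two_rows(string1: str, string2: str) -> int:
    """Same recurrence as the full table, storing only two rows."""
    # previous_row[j] = cost to make "" and string2[:j] equal
    previous_row = [0] * (len(string2) + 1)
    for j in range(1, len(string2) + 1):
        previous_row[j] = previous_row[j - 1] + ord(string2[j - 1])

    for i in range(1, len(string1) + 1):
        # Column 0: string2 prefix is empty, so delete all of string1[:i]
        current_row = [previous_row[0] + ord(string1[i - 1])] + [0] * len(string2)
        for j in range(1, len(string2) + 1):
            if string1[i - 1] == string2[j - 1]:
                # Matching characters are kept at no cost
                current_row[j] = previous_row[j - 1]
            else:
                current_row[j] = min(
                    previous_row[j] + ord(string1[i - 1]),     # delete from string1
                    current_row[j - 1] + ord(string2[j - 1]),  # delete from string2
                )
        previous_row = current_row

    return previous_row[len(string2)]


print(minimum_delete_sum_two_rows("sea", "eat"))      # 231
print(minimum_delete_sum_two_rows("delete", "leet"))  # 403
```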

Edge Cases

| Case | How to Handle |
| --- | --- |
| Both strings are empty | Return 0, as no deletions are needed. |
| One string is empty, the other is not | Return the sum of ASCII values of the non-empty string. |
| Strings are identical | Return 0, no deletions are necessary. |
| Strings have one character each and they are the same | Return 0 since no deletion is necessary. |
| Strings have one character each and they are different | Return the sum of the ASCII values of the two characters. |
| Strings have maximum allowed length (e.g. 1000 for typical constraints) | Ensure the dynamic programming table doesn't exceed memory limits and the algorithm's time complexity is efficient. |
| Strings contain non-ASCII characters | The solution should correctly handle extended ASCII characters or throw an exception if it's designed for basic ASCII. |
| Strings contain only a single repeated character each | The DP algorithm should handle this case efficiently, potentially requiring many similar computations. |
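
Several of the cases above can be checked with a few assertions. The sketch below restates the table-based approach in compact form so that it runs on its own; `min_delete_sum` and the `edge_cases` list are names chosen here for illustration. Empty inputs are included even though the stated constraints guarantee a length of at least 1.

```python
def min_delete_sum(s1: str, s2: str) -> int:
    """Compact bottom-up DP; dp[i][j] covers s1[:i] and s2[:j]."""
    rows, cols = len(s1) + 1, len(s2) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = dp[i - 1][0] + ord(s1[i - 1])
    for j in range(1, cols):
        dp[0][j] = dp[0][j - 1] + ord(s2[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = min(dp[i - 1][j] + ord(s1[i - 1]),
                               dp[i][j - 1] + ord(s2[j - 1]))
    return dp[-1][-1]


# (s1, s2, expected minimum delete sum)
edge_cases = [
    ("", "", 0),                                  # both empty
    ("", "abc", ord("a") + ord("b") + ord("c")),  # one empty
    ("sea", "sea", 0),                            # identical
    ("a", "a", 0),                                # single matching characters
    ("a", "b", ord("a") + ord("b")),              # single differing characters
    ("aaa", "aa", ord("a")),                      # one repeated character
]
for s1, s2, expected in edge_cases:
    assert min_delete_sum(s1, s2) == expected, (s1, s2)
```

In Python, `ord` returns the Unicode code point, so non-ASCII characters are summed by code point rather than rejected; whether that is the desired behavior is a question for the interviewer.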