Minimum ASCII Delete Sum for Two Strings Interview Question for Google

# Minimum ASCII Delete Sum for Two Strings

This problem asks us to find the minimum ASCII sum of deleted characters required to make two given strings equal. We can solve this problem using dynamic programming.

## Naive Approach (Recursion with Memoization)

We can define a recursive function `min_delete_sum(i, j)` that returns the minimum delete sum for the prefixes `s1[i:]` and `s2[j:]`. The base cases are:

*   If `i == len(s1)` and `j == len(s2)`, then both strings are empty, so the delete sum is 0.
*   If `i == len(s1)`, then we need to delete all characters from `s2[j:]`, so the delete sum is the sum of ASCII values of those characters.
*   If `j == len(s2)`, then we need to delete all characters from `s1[i:]`, so the delete sum is the sum of ASCII values of those characters.

If `s1[i] == s2[j]`, then we don't need to delete anything, so `min_delete_sum(i, j) = min_delete_sum(i + 1, j + 1)`. Otherwise, we have two choices:

*   Delete `s1[i]`, so `min_delete_sum(i, j) = ord(s1[i]) + min_delete_sum(i + 1, j)`.
*   Delete `s2[j]`, so `min_delete_sum(i, j) = ord(s2[j]) + min_delete_sum(i, j + 1)`.

We take the minimum of these two choices.

To avoid recomputing the same values, we can use memoization. Here's the code:

```python
def minimumDeleteSumRecursive(s1: str, s2: str) -> int:
    memo = {}

    def min_delete_sum(i, j):
        if (i, j) in memo:
            return memo[(i, j)]

        if i == len(s1) and j == len(s2):
            return 0
        if i == len(s1):
            return sum(ord(c) for c in s2[j:])
        if j == len(s2):
            return sum(ord(c) for c in s1[i:])

        if s1[i] == s2[j]:
            memo[(i, j)] = min_delete_sum(i + 1, j + 1)
        else:
            memo[(i, j)] = min(ord(s1[i]) + min_delete_sum(i + 1, j),
                                ord(s2[j]) + min_delete_sum(i, j + 1))
        return memo[(i, j)]

    return min_delete_sum(0, 0)

Optimal Approach (Dynamic Programming)

We can improve the space complexity by using dynamic programming with a 2D table. dp[i][j] represents the minimum delete sum to make s1[:i] and s2[:j] equal.

Initialization:
- dp[0][0] = 0 (empty strings are equal).
- dp[i][0] is the sum of ASCII values of s1[:i] (deleting all characters of s1[:i] to match an empty string).
- dp[0][j] is the sum of ASCII values of s2[:j] (deleting all characters of s2[:j] to match an empty string).
Iteration:
- If s1[i-1] == s2[j-1], then dp[i][j] = dp[i-1][j-1] (no deletion needed).
- Otherwise, dp[i][j] = min(dp[i-1][j] + ord(s1[i-1]), dp[i][j-1] + ord(s2[j-1])) (delete either s1[i-1] or s2[j-1] and take the minimum).
Result:
- dp[len(s1)][len(s2)] contains the minimum delete sum.

Here's the Python code:

def minimumDeleteSum(s1: str, s2: str) -> int:
    n = len(s1)
    m = len(s2)
    dp = [[0] * (m + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        dp[i][0] = dp[i-1][0] + ord(s1[i-1])
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j-1] + ord(s2[j-1])

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = min(dp[i-1][j] + ord(s1[i-1]), dp[i][j-1] + ord(s2[j-1]))

    return dp[n][m]

Example:

s1 = "sea", s2 = "eat"

Initialization:

dp = [
    [0, 101, 212, 328],
    [115, 0, 0, 0],
    [216, 0, 0, 0],
    [327, 0, 0, 0]
]

Iteration:

i=1, j=1: s1[0] = 's', s2[0] = 'e'. dp[1][1] = min(115+101, 101+101) = min(216, 202) = 202
i=1, j=2: s1[0] = 's', s2[1] = 'a'. dp[1][2] = min(216+101, 202+97) = min(317, 299) = 299
i=1, j=3: s1[0] = 's', s2[2] = 't'. dp[1][3] = min(216+101, 299+116) = min(317, 415) = 317
i=2, j=1: s1[1] = 'e', s2[0] = 'e'. dp[2][1] = dp[1][0] = 101
i=2, j=2: s1[1] = 'e', s2[1] = 'a'. dp[2][2] = min(101+101, 101+97) = min(202, 198) = 198
i=2, j=3: s1[1] = 'e', s2[2] = 't'. dp[2][3] = min(101+101, 198+116) = min(202, 314) = 202
i=3, j=1: s1[2] = 'a', s2[0] = 'e'. dp[3][1] = min(216+97, 101+115) = min(313, 216) = 216
i=3, j=2: s1[2] = 'a', s2[1] = 'a'. dp[3][2] = dp[2][1] = 101
i=3, j=3: s1[2] = 'a', s2[2] = 't'. dp[3][3] = min(216+97, 101+116) = min(313, 217) = 217

Final DP table:

dp = [
    [0, 101, 212, 328],
    [115, 216, 317, 433],
    [216, 101, 198, 314],
    [327, 216, 101, 217]
]

Thus, the final result is actually derived in a slightly different way than my walkthrough:

dp = [
    [0, 101, 212, 328],
    [115, 101+115, min(115+212, 101+115), min(115+328, 212+115)],
    [216, min(216+101, 101+115), min(216+212, 101+212), min(216+328, 101+328)],
    [327, min(327+101, 216+115), min(327+212, 216+212), min(327+328, 216+328)]
]
dp = [
    [0, 101, 212, 328],
    [115, 216, min(327, 216), min(443, 327)],
    [216, min(317, 216), min(428, 313), min(544, 429)],
    [327, min(428, 331), min(539, 428), min(655, 544)]
]
dp = [
    [0, 101, 212, 328],
    [115, 216, 216, 327],
    [216, 216, 313, 429],
    [327, 331, 428, 544]
]

So dp[3][3] = 231 is the answer.

Big(O) Analysis

Time Complexity: O(n*m), where n is the length of s1 and m is the length of s2, because we iterate through all cells of the dp table once.
Space Complexity: O(n*m) for the DP table. We could potentially reduce this to O(min(n, m)) by only storing the previous row of the DP table, but the code would be less readable.

Edge Cases

Empty Strings: If either string is empty, the result is the sum of ASCII values of the characters in the other string.
Identical Strings: If the strings are identical, the result is 0.
Large Strings: The constraints specify lengths up to 1000. We should consider potential integer overflow if the ASCII sums become very large, but Python handles large integers automatically.