Taro Logo

Minimum Cost to Convert String II

Hard
Atlassian logo
Atlassian
1 view

You are given two 0-indexed strings source and target, both of length n and consisting of lowercase English characters. You are also given two 0-indexed string arrays original and changed, and an integer array cost, where cost[i] represents the cost of converting the string original[i] to the string changed[i].

You start with the string source. In one operation, you can pick a substring x from the string, and change it to y at a cost of z if there exists any index j such that cost[j] == z, original[j] == x, and changed[j] == y. You are allowed to do any number of operations, but any pair of operations must satisfy either of these two conditions:

  • The substrings picked in the operations are source[a..b] and source[c..d] with either b < c or d < a. In other words, the indices picked in both operations are disjoint.
  • The substrings picked in the operations are source[a..b] and source[c..d] with a == c and b == d. In other words, the indices picked in both operations are identical.

Return the minimum cost to convert the string source to the string target using any number of operations. If it is impossible to convert source to target, return -1.

Note that there may exist indices i, j such that original[j] == original[i] and changed[j] == changed[i].

Example 1:

Input: source = "abcd", target = "acbe", original = ["a","b","c","c","e","d"], changed = ["b","c","b","e","b","e"], cost = [2,5,5,1,2,20]
Output: 28
Explanation: To convert "abcd" to "acbe", do the following operations:
- Change substring source[1..1] from "b" to "c" at a cost of 5.
- Change substring source[2..2] from "c" to "e" at a cost of 1.
- Change substring source[2..2] from "e" to "b" at a cost of 2.
- Change substring source[3..3] from "d" to "e" at a cost of 20.
The total cost incurred is 5 + 1 + 2 + 20 = 28. 
It can be shown that this is the minimum possible cost.

Example 2:

Input: source = "abcdefgh", target = "acdeeghh", original = ["bcd","fgh","thh"], changed = ["cde","thh","ghh"], cost = [1,3,5]
Output: 9
Explanation: To convert "abcdefgh" to "acdeeghh", do the following operations:
- Change substring source[1..3] from "bcd" to "cde" at a cost of 1.
- Change substring source[5..7] from "fgh" to "thh" at a cost of 3. We can do this operation because indices [5,7] are disjoint with indices picked in the first operation.
- Change substring source[5..7] from "thh" to "ghh" at a cost of 5. We can do this operation because indices [5,7] are disjoint with indices picked in the first operation, and identical with indices picked in the second operation.
The total cost incurred is 1 + 3 + 5 = 9.
It can be shown that this is the minimum possible cost.

Example 3:

Input: source = "abcdefgh", target = "addddddd", original = ["bcd","defgh"], changed = ["ddd","ddddd"], cost = [100,1578]
Output: -1
Explanation: It is impossible to convert "abcdefgh" to "addddddd".
If you select substring source[1..3] as the first operation to change "abcdefgh" to "adddefgh", you cannot select substring source[3..7] as the second operation because it has a common index, 3, with the first operation.
If you select substring source[3..7] as the first operation to change "abcdefgh" to "abcddddd", you cannot select substring source[1..3] as the second operation because it has a common index, 3, with the first operation.

Constraints:

  • 1 <= source.length == target.length <= 1000
  • source, target consist only of lowercase English characters.
  • 1 <= cost.length == original.length == changed.length <= 100
  • 1 <= original[i].length == changed[i].length <= source.length
  • original[i], changed[i] consist only of lowercase English characters.
  • original[i] != changed[i]
  • 1 <= cost[i] <= 106

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of possible values for characters in `source`, `target`, and the keys/values in the `changeCost` map? Are we dealing with ASCII characters, or a broader character set?
  2. Can the `changeCost` map contain entries for converting a character to itself (e.g., 'a' to 'a')? If so, what would the associated cost be, and should I assume it's always zero?
  3. If there's no possible way to convert the `source` string to the `target` string, what should I return? Should I return -1, positive infinity, or throw an exception?
  4. What are the constraints on the lengths of `source` and `target`? Could they be empty strings or have different lengths?
  5. Does the `changeCost` map always provide a cost for converting any character that *can* be converted, or might there be implicit infinite costs if a character conversion isn't explicitly defined?

Brute Force Solution

Approach

The brute force method for finding the minimum cost to convert a string involves exploring all possible combinations of replacements. We essentially try every single valid substitution we can make at every possible starting location within the string. By trying every possibility, we guarantee we will find the one with the lowest cost.

Here's how the algorithm would work step-by-step:

  1. Look at every possible starting position within the original string.
  2. For each starting position, check if any of the words we are allowed to replace the original string with can actually fit starting at that position.
  3. If a word fits, consider making the replacement. This creates a new, slightly modified version of the original string.
  4. Calculate the cost of making that specific replacement.
  5. Now, repeat the whole process on this new, modified string. Keep doing this until there are no more possible replacements left to make.
  6. We will have many different fully transformed strings, each one obtained by making a different series of replacements.
  7. For each of these final transformed strings, calculate the total cost, which is the sum of all the individual replacement costs that it took to create it.
  8. Finally, compare the total costs of all the transformed strings, and choose the one with the lowest cost. This will be the minimum cost to convert the original string.

Code Implementation

def minimum_cost_to_convert_string_brute_force(original_string, conversion_costs, target_words):
    minimum_total_cost = float('inf')

    def solve(current_string, current_cost):
        nonlocal minimum_total_cost

        # If the current cost is already higher than the minimum, stop
        if current_cost >= minimum_total_cost:
            return

        no_replacement_possible = True

        for starting_index in range(len(current_string)):
            for word_index in range(len(target_words)):
                target_word = target_words[word_index]
                conversion_cost = conversion_costs[word_index]

                if current_string[starting_index:starting_index + len(target_word)] == target_word:
                    no_replacement_possible = False

                    # Explore the possibility of making this replacement.
                    new_string = current_string[:starting_index] + target_word + current_string[starting_index + len(target_word):]

                    solve(new_string, current_cost + conversion_cost)

        # We have processed all possible replacements
        if no_replacement_possible:
            minimum_total_cost = min(minimum_total_cost, current_cost)

    # Begin the recursive process with original string and cost 0.
    solve(original_string, 0)

    # After complete recursive calls, return lowest cost.
    return minimum_total_cost

Big(O) Analysis

Time Complexity
O(k^n)The algorithm explores all possible replacements within the original string. For each starting position in the string of length n, it checks if any of the k replacement words fit. Each successful replacement creates a new string to explore. Since each replacement opens up new possibilities, and there are multiple possible replacement locations, this can result in a branching factor of approximately k at each of the n positions in the original string, leading to an exponential number of paths. Therefore, the time complexity is O(k^n) where k is the effective average number of possible replacements at each position, acknowledging the branching nature and dependencies of replacement.
Space Complexity
O(N^2)The brute force approach explores all possible combinations of replacements by creating slightly modified versions of the original string. In the worst case, each character of the original string (of length N) could be the starting point for a replacement and since we are considering all possible series of replacements the total modified strings can grow exponentially. Storing these modified strings in memory contributes O(N^2) space in the worst case since each modified string can be of length N and there can be up to N number of branches due to trying every starting point. Thus the space complexity is O(N^2) as N is string length

Optimal Solution

Approach

The optimal solution uses dynamic programming to find the minimum cost. We build up solutions for smaller substrings to determine the best way to convert the entire string by reusing previously calculated results.

Here's how the algorithm would work step-by-step:

  1. First, create a way to quickly look up the cost to change one character to another.
  2. Think about the problem backwards. Start from the end of the string and figure out the best way to convert it.
  3. For each position in the string, check if converting the substring starting at that position results in a lower cost than already calculated.
  4. When checking, consider if you can perform one conversion, two conversions, and so on, and keep the minimum possible cost.
  5. Remember to use previously calculated costs for the remaining part of the string after the conversion, avoiding recalculations and making the entire algorithm faster.
  6. Store these costs as you move through the string. This is where dynamic programming comes in handy.
  7. Finally, the cost stored for the very beginning of the string will be the minimum cost to convert the whole string.

Code Implementation

def minimum_cost_to_convert_string(source_string, target_string, conversion_costs):
    string_length = len(source_string)
    cost_table = {}

    # Prepare a cost lookup table for character conversions.
    for start_character, end_character, cost in conversion_costs:
        cost_table[(start_character, end_character)] = cost

    dp = [0] * (string_length + 1)

    for i in range(string_length - 1, -1, -1):
        dp[i] = dp[i + 1] + 1  # Initialize cost to deleting the character

        for j in range(i, string_length):
            substring = source_string[i:j+1]
            target_substring = target_string[i:j+1]

            if substring == target_substring:
                dp[i] = min(dp[i], dp[j + 1])
            else:
                if (source_string[i], target_string[i]) in cost_table:
                     dp[i] = min(dp[i], cost_table[(source_string[i], target_string[i])] + dp[i+1])

            # Check if a valid conversion exists for this substring
            if source_string[i:j+1] != target_string[i:j+1] and len(source_string[i:j+1]) == len(target_string[i:j+1]):
                conversion_needed = True
                cost_for_conversion = 0
                for k in range(len(source_string[i:j+1])):
                    if (source_string[i:j+1][k], target_string[i:j+1][k]) not in cost_table:
                        conversion_needed = False
                        break
                    else:
                        cost_for_conversion += cost_table[(source_string[i:j+1][k], target_string[i:j+1][k])]

                if conversion_needed:
                    dp[i] = min(dp[i], cost_for_conversion + dp[j+1])

    # The minimum cost to convert the whole string.
    return dp[0]

Big(O) Analysis

Time Complexity
O(n²)The algorithm iterates through the string of length n backwards, which contributes a factor of n. Inside this loop, for each starting position, the algorithm considers all possible substring lengths up to the remaining length of the string, resulting in another loop that, in the worst case, iterates up to n times. Therefore, the dominant operations are nested loops over the string's length, leading to approximately n * n/2 operations. This simplifies to a time complexity of O(n²).
Space Complexity
O(N)The described dynamic programming solution uses an array to store the minimum costs calculated for each substring, starting from the end of the main string. This array, representing intermediate results, has a size equal to the length of the input string. Therefore, the auxiliary space required increases linearly with the length of the input string, N, where N is the length of the input string. This leads to a space complexity of O(N).

Edge Cases

CaseHow to Handle
Null or empty input string 's'Return 0 as no conversion is needed for an empty string.
Empty 'sources' or 'targets' list.Return 0 because no replacement operations are available.
Empty string in 'sources' list.The algorithm should skip empty strings as they don't contribute to replacements.
Overlapping source strings at the same index in 's'.Prioritize the longest matching source string to ensure the minimum cost.
Large input string 's' exceeding memory limitations.Optimize the algorithm to use a rolling hash or other memory-efficient data structures.
Maximum integer value cost leading to overflowUse long data type for cost variables or implement checks to avoid overflows.
No possible conversion: No source string matches any part of string 's'.Return 0 if no replacements are made, implying no conversion cost.
Source string equal to target string.Skip this replacement, as the cost is zero.