Minimum Additions to Make Valid String

Medium

Google

2 views

Topics:

StringsDynamic Programming

Given a string word to which you can insert letters "a", "b" or "c" anywhere and any number of times, return the minimum number of letters that must be inserted so that word becomes valid.

A string is called valid if it can be formed by concatenating the string "abc" several times.

For example:

If word = "b", the output is 2. We can insert the letter "a" right before "b", and the letter "c" right next to "b" to obtain the valid string "abc".
If word = "aaa", the output is 6. We can insert letters "b" and "c" next to each "a" to obtain the valid string "abcabcabc".
If word = "abc", the output is 0. The word is already valid, no modifications are needed.

Constraints:

1 <= word.length <= 50
word consists of letters "a", "b" and "c" only.

Minimum Additions to Make Valid String

Medium

Google

2 views

Topics:

StringsDynamic Programming

Given a string word to which you can insert letters "a", "b" or "c" anywhere and any number of times, return the minimum number of letters that must be inserted so that word becomes valid.

A string is called valid if it can be formed by concatenating the string "abc" several times.

For example:

If word = "b", the output is 2. We can insert the letter "a" right before "b", and the letter "c" right next to "b" to obtain the valid string "abc".
If word = "aaa", the output is 6. We can insert letters "b" and "c" next to each "a" to obtain the valid string "abcabcabc".
If word = "abc", the output is 0. The word is already valid, no modifications are needed.

Constraints:

1 <= word.length <= 50
word consists of letters "a", "b" and "c" only.

Solution

Naive Approach

The most straightforward approach is to consider all possible insertion combinations and check if the resulting string is a valid concatenation of "abc". This involves recursively trying to insert 'a', 'b', or 'c' at every possible position in the word and checking if the modified word is valid. A word is considered valid if it is composed entirely of repeating "abc" subsequences.

However, this approach is highly inefficient due to the exponential number of possible combinations of inserted characters.

Code (Python):

def is_valid(s):
    if not s:
        return True
    if len(s) % 3 != 0:
        return False
    for i in range(0, len(s), 3):
        if s[i:i+3] != "abc":
            return False
    return True

def min_insertions_naive(word):
    def solve(current_word, insertions):
        if is_valid(current_word):
            return insertions
        if len(current_word) > 50 * 3:  # Optimization: Limit recursion depth
            return float('inf')

        min_inserts = float('inf')
        for i in range(len(current_word) + 1):
            min_inserts = min(min_inserts, solve(current_word[:i] + 'a' + current_word[i:], insertions + 1))
            min_inserts = min(min_inserts, solve(current_word[:i] + 'b' + current_word[i:], insertions + 1))
            min_inserts = min(min_inserts, solve(current_word[:i] + 'c' + current_word[i:], insertions + 1))
        return min_inserts

    return solve(word, 0)

Complexity Analysis

Time Complexity: Exponential, O(3^N) where N is the final length of string.
Space Complexity: O(N), due to the recursive call stack.

Optimal Approach: Dynamic Programming

A better approach is to use dynamic programming. We can define dp[i][j] as the minimum number of insertions needed to make the first i characters of word a valid prefix ending with the j-th character of "abc".

j = 0: The prefix ends with 'a'.
j = 1: The prefix ends with 'b'.
j = 2: The prefix ends with 'c'.

Base Case: dp[0][0] = 0 (empty string needs no insertion to end with 'a'). dp[0][1] = dp[0][2] = float('inf')

Transitions:

If word[i-1] == 'a': dp[i][0] = dp[i-1][0] Else: dp[i][0] = min(dp[i-1][0], dp[i-1][1], dp[i-1][2]) + 1
If word[i-1] == 'b': dp[i][1] = dp[i-1][0] Else: dp[i][1] = dp[i-1][0] + 1
If word[i-1] == 'c': dp[i][2] = dp[i-1][1] Else: dp[i][2] = dp[i-1][1] + 1

The answer will be dp[n][2], where n is the length of word. If dp[n][2] is still infinity, then the only other way to make the entire string valid is to pad the word with "abc" chunks until we get a valid string. The padding will be equal to len(word) + 2 - (len(word) % 3 )

Edge Cases:

Empty input word: Requires 0 insertion. We should account for cases where length word % 3 != 0
Input contains invalid characters (not 'a', 'b', or 'c'): This should be handled via input validation.
Already valid word: Should return 0.

Code (Python):

def min_insertions(word):
    n = len(word)
    dp = [[float('inf')] * 3 for _ in range(n + 1)]
    dp[0][0] = 0

    for i in range(1, n + 1):
        if word[i - 1] == 'a':
            dp[i][0] = dp[i - 1][0]
        else:
            dp[i][0] = min(dp[i - 1][0], dp[i - 1][1], dp[i - 1][2]) + 1

        if word[i - 1] == 'b':
            dp[i][1] = dp[i - 1][0]
        else:
            dp[i][1] = dp[i - 1][0] + 1

        if word[i - 1] == 'c':
            dp[i][2] = dp[i - 1][1]
        else:
            dp[i][2] = dp[i - 1][1] + 1

    if dp[n][2] != float('inf'):
        return dp[n][2]
    else:
        return n + (2 - (n % 3)) % 3

Complexity Analysis

Time Complexity: O(N), where N is the length of the word.
Space Complexity: O(N), for the DP table. Can be optimized to O(1) by storing only the previous row.

Solution

Naive Approach

However, this approach is highly inefficient due to the exponential number of possible combinations of inserted characters.

Code (Python):

def is_valid(s):
    if not s:
        return True
    if len(s) % 3 != 0:
        return False
    for i in range(0, len(s), 3):
        if s[i:i+3] != "abc":
            return False
    return True

def min_insertions_naive(word):
    def solve(current_word, insertions):
        if is_valid(current_word):
            return insertions
        if len(current_word) > 50 * 3:  # Optimization: Limit recursion depth
            return float('inf')

        min_inserts = float('inf')
        for i in range(len(current_word) + 1):
            min_inserts = min(min_inserts, solve(current_word[:i] + 'a' + current_word[i:], insertions + 1))
            min_inserts = min(min_inserts, solve(current_word[:i] + 'b' + current_word[i:], insertions + 1))
            min_inserts = min(min_inserts, solve(current_word[:i] + 'c' + current_word[i:], insertions + 1))
        return min_inserts

    return solve(word, 0)

Complexity Analysis

Time Complexity: Exponential, O(3^N) where N is the final length of string.
Space Complexity: O(N), due to the recursive call stack.

Optimal Approach: Dynamic Programming

j = 0: The prefix ends with 'a'.
j = 1: The prefix ends with 'b'.
j = 2: The prefix ends with 'c'.

Base Case: dp[0][0] = 0 (empty string needs no insertion to end with 'a'). dp[0][1] = dp[0][2] = float('inf')

Transitions:

If word[i-1] == 'a': dp[i][0] = dp[i-1][0] Else: dp[i][0] = min(dp[i-1][0], dp[i-1][1], dp[i-1][2]) + 1
If word[i-1] == 'b': dp[i][1] = dp[i-1][0] Else: dp[i][1] = dp[i-1][0] + 1
If word[i-1] == 'c': dp[i][2] = dp[i-1][1] Else: dp[i][2] = dp[i-1][1] + 1

Edge Cases:

Empty input word: Requires 0 insertion. We should account for cases where length word % 3 != 0
Input contains invalid characters (not 'a', 'b', or 'c'): This should be handled via input validation.
Already valid word: Should return 0.

Code (Python):

def min_insertions(word):
    n = len(word)
    dp = [[float('inf')] * 3 for _ in range(n + 1)]
    dp[0][0] = 0

    for i in range(1, n + 1):
        if word[i - 1] == 'a':
            dp[i][0] = dp[i - 1][0]
        else:
            dp[i][0] = min(dp[i - 1][0], dp[i - 1][1], dp[i - 1][2]) + 1

        if word[i - 1] == 'b':
            dp[i][1] = dp[i - 1][0]
        else:
            dp[i][1] = dp[i - 1][0] + 1

        if word[i - 1] == 'c':
            dp[i][2] = dp[i - 1][1]
        else:
            dp[i][2] = dp[i - 1][1] + 1

    if dp[n][2] != float('inf'):
        return dp[n][2]
    else:
        return n + (2 - (n % 3)) % 3

Complexity Analysis

Time Complexity: O(N), where N is the length of the word.
Space Complexity: O(N), for the DP table. Can be optimized to O(1) by storing only the previous row.