Taro Logo

Append Characters to String to Make Subsequence

Medium
Google logo
Google
5 views
Topics:
StringsTwo Pointers

You are given two strings s and t consisting of only lowercase English letters.

Return the minimum number of characters that need to be appended to the end of s so that t becomes a subsequence of s.

A subsequence is a string that can be derived from another string by deleting some or no characters without changing the order of the remaining characters.

Example 1:

Input: s = "coaching", t = "coding"
Output: 4
Explanation: Append the characters "ding" to the end of s so that s = "coachingding".
Now, t is a subsequence of s ("coachingding").
It can be shown that appending any 3 characters to the end of s will never make t a subsequence.

Example 2:

Input: s = "abcde", t = "a"
Output: 0
Explanation: t is already a subsequence of s ("abcde").

Example 3:

Input: s = "z", t = "abcde"
Output: 5
Explanation: Append the characters "abcde" to the end of s so that s = "zabcde".
Now, t is a subsequence of s ("zabcde").
It can be shown that appending any 4 characters to the end of s will never make t a subsequence.

Constraints:

  • 1 <= s.length, t.length <= 105
  • s and t consist only of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the characters in `t` and `s` only be lowercase English letters, or can they contain other characters?
  2. Are `s` and `t` guaranteed to be non-empty?
  3. If `t` is not a subsequence of `s`, should I return the minimum number of characters that need to be appended to `s` to make `t` a subsequence, or is there another error case?
  4. Is the order of characters in `t` that are also in `s` important (i.e., should they appear in the same order as in `s` to be considered a subsequence)?
  5. What is the maximum possible length of strings `s` and `t`?

Brute Force Solution

Approach

The brute force method in this problem is all about trying every possible combination. We will try adding characters to the source string to see if we can form the target as a subsequence. The key is to explore every single possibility.

Here's how the algorithm would work step-by-step:

  1. Start by considering the first character of the target string.
  2. Check if the first character of the target already exists in the source string. If it does, move on to the next character of the target.
  3. If the first character of the target is not in the source string, then try adding that character to the source string at every single possible position.
  4. After adding a character, check if the first character of the target exists as a subsequence (in order) now.
  5. Repeat these steps for the second character of the target and so on.
  6. Continue until you have either formed the entire target as a subsequence or you have tried adding a lot of extra characters and exceeded some limit. If the limit is exceeded, then stop.
  7. Keep track of the minimum number of characters that needed to be added to the source to make the target a subsequence.
  8. Return this minimum count. If no solution is found after trying everything, then report that it is impossible.

Code Implementation

def append_characters_to_string_to_make_subsequence(
    source_string, target_string
):
    minimum_appends = float('inf')

    def solve(current_source, target_index, appends_count):
        nonlocal minimum_appends

        # If we have processed all characters of the target, update min
        if target_index == len(target_string):
            minimum_appends = min(minimum_appends, appends_count)
            return

        # If we exceed the current minimum, stop exploring
        if appends_count >= minimum_appends:
            return

        source_index = 0
        while source_index < len(current_source) and \
                target_string[target_index] != current_source[source_index]:
            source_index += 1

        # Found the next character, move to the next target char
        if source_index < len(current_source):
            solve(
                current_source, target_index + 1, appends_count
            )
        else:
            # Need to append a character
            for insert_position in range(len(current_source) + 1):

                # Create a new string with the character added
                new_source = current_source[:insert_position] + \
                    target_string[target_index] + \
                    current_source[insert_position:]

                # Recursively solve with the updated string
                solve(new_source, target_index + 1, appends_count + 1)

    solve(source_string, 0, 0)

    if minimum_appends == float('inf'):
        return -1
    else:
        return minimum_appends

Big(O) Analysis

Time Complexity
O(m * n^m)The provided brute force algorithm explores all possible ways to insert characters to form a subsequence. Let n be the length of the source string and m be the length of the target string. For each character in the target string that is not a subsequence of the source, the algorithm tries inserting it at every possible position in the source string. In the worst case, it may need to insert characters close to the target length. Inserting a character at all possible positions of the source string can take O(n) operations. Since it attempts to add characters to the source string and makes recursive calls to try combinations, in the worst case, the algorithm complexity would involve m nested loops each considering n positions, roughly O(n^m), and further multiplied by m for checking the subsequence on each modification leading to a complexity of O(m*n^m).
Space Complexity
O(1)The provided plain English explanation describes a brute force approach that focuses on modifying the source string and checking for subsequences iteratively. While the description mentions 'trying every possible combination' by adding characters, it does not explicitly state that these combinations are stored or managed using any auxiliary data structures. The algorithm primarily involves comparisons and potential modifications of the source string in place and tracking the minimum character additions. The minimum count is stored in a single variable. Therefore, the auxiliary space complexity remains constant, irrespective of the input string sizes.

Optimal Solution

Approach

The goal is to find the fewest letters we need to add to one string to make another string appear within it as a subsequence. The best way to do this is to carefully step through both strings, matching characters whenever we can and adding new ones only when absolutely necessary.

Here's how the algorithm would work step-by-step:

  1. Start by looking at the first character of both strings.
  2. If the characters match, move to the next character in both strings.
  3. If the characters don't match, it means we need to add the missing character to the first string. Count this addition and only move to the next character in the first string (the one we are building up).
  4. Keep repeating steps 2 and 3 until you have processed all the characters in the second string (the one we want to see as a subsequence).
  5. The number of characters you've added is the answer.

Code Implementation

def append_characters_to_string_to_make_subsequence(string_one, string_two):
    pointer_one = 0
    pointer_two = 0
    characters_added_count = 0

    while pointer_two < len(string_two):

        if pointer_one < len(string_one) and string_one[pointer_one] == string_two[pointer_two]:
            pointer_one += 1
            pointer_two += 1
        else:
            # We need to add a character to string_one
            characters_added_count += 1

            # Only increment pointer_one because we're adding to string_one
            pointer_one += 1

    # Return the count of added characters
    return characters_added_count

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through both input strings s and t using two index variables, i and j, respectively. The while loop continues as long as the index j is less than the length of the target string t. Inside the loop, there's a comparison between characters s[i] and t[j]. The index i increments in each iteration of the outer loop and j increments only when characters match. Therefore, the number of operations are directly proportional to the length of target string t (n). Hence the time complexity is O(n).
Space Complexity
O(1)The algorithm operates by iterating through the input strings using index variables. It does not create any auxiliary data structures such as lists, arrays, or hash maps to store intermediate results. Only a counter for the number of appends and index variables are needed. Therefore, the space complexity remains constant, irrespective of the lengths of the input strings, resulting in O(1) space complexity.

Edge Cases

CaseHow to Handle
Null or empty target stringIf target is null or empty, return 0 as no characters need to be appended.
Null or empty source stringIf source is null or empty, return the length of target, as all characters need to be appended.
Target string is longer than source string and requires many appendsThe solution should iterate efficiently through both strings, appending as needed.
Target string is already a subsequence of the source stringThe algorithm should return 0 as no characters need to be appended.
Source and target strings are identicalThe algorithm should return 0 as no characters need to be appended.
All characters in the target string are not present in the source stringThe solution should append all characters in the target string, resulting in appending target.length characters.
Very long source and target strings exceeding memory limitationsConsider using an iterative approach with constant space if memory becomes a bottleneck, instead of recursive calls or large auxiliary data structures.
Characters in target string appear in source string but in different orderThe algorithm should correctly identify the characters that need appending based on the subsequence relationship.