Taro Logo

Match Substring After Replacement

Hard
Asked by:
Profile picture
8 views
Topics:
ArraysTwo PointersStringsSliding Windows

You are given two strings s and sub. You are also given a 2D character array mappings where mappings[i] = [oldi, newi] indicates that you may perform the following operation any number of times:

  • Replace a character oldi of sub with newi.

Each character in sub cannot be replaced more than once.

Return true if it is possible to make sub a substring of s by replacing zero or more characters according to mappings. Otherwise, return false.

A substring is a contiguous non-empty sequence of characters within a string.

Example 1:

Input: s = "fool3e7bar", sub = "leet", mappings = [["e","3"],["t","7"],["t","8"]]
Output: true
Explanation: Replace the first 'e' in sub with '3' and 't' in sub with '7'.
Now sub = "l3e7" is a substring of s, so we return true.

Example 2:

Input: s = "fooleetbar", sub = "f00l", mappings = [["o","0"]]
Output: false
Explanation: The string "f00l" is not a substring of s and no replacements can be made.
Note that we cannot replace '0' with 'o'.

Example 3:

Input: s = "Fool33tbaR", sub = "leetd", mappings = [["e","3"],["t","7"],["t","8"],["d","b"],["p","b"]]
Output: true
Explanation: Replace the first and second 'e' in sub with '3' and 'd' in sub with 'b'.
Now sub = "l33tb" is a substring of s, so we return true.

Constraints:

  • 1 <= sub.length <= s.length <= 5000
  • 0 <= mappings.length <= 1000
  • mappings[i].length == 2
  • oldi != newi
  • s and sub consist of uppercase and lowercase English letters and digits.
  • oldi and newi are either uppercase or lowercase English letters or digits.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can a character in `sub` have multiple possible replacements in `mappings`?
  2. If a character in `sub` is not present as an `oldChar` in any mapping, does it have to match the corresponding character in `s` exactly?
  3. Can `s` or `sub` be empty strings or null? If so, what should I return?
  4. What is the maximum length of the strings `s` and `sub`, and what is the maximum size of the `mappings` array?
  5. Are `oldChar` and `newChar` in the mappings guaranteed to be different, or could there be a mapping like ['a', 'a']?

Brute Force Solution

Approach

The brute force strategy for this problem is like trying every single possible replacement allowed by the given rules and then checking if, after those replacements, the substring matches within the main string. It's about exploring all options, even if it takes a long time.

Here's how the algorithm would work step-by-step:

  1. Consider all the possible combinations of replacements we can make, as the replacement rules suggest.
  2. For each possible combination of replacements, change the main string according to those replacements.
  3. Once the replacements are made, see if the substring exists within the modified main string.
  4. If the substring is found, we have a match and we can confirm the answer as yes.
  5. If after trying all possible replacement combinations, we never find the substring, then the answer is no.

Code Implementation

def match_substring_after_replacement_brute_force(
    main_string,
    substring,
    replacements
):
    number_of_replacements = len(replacements)
    
    # Generate all possible combinations of replacements
    replacement_combinations = [[]]
    for replacement_index in range(number_of_replacements):
        new_combinations = []
        for combination in replacement_combinations:
            new_combinations.append(combination)
            new_combinations.append(combination + [replacement_index])
        replacement_combinations = new_combinations

    # Iterate through all combinations of replacements
    for combination in replacement_combinations:
        modified_string = list(main_string)

        # Apply the current combination of replacements
        for replacement_index in combination:
            character_to_replace = replacements[replacement_index][0]
            replacement_character = replacements[replacement_index][1]
            for index, character in enumerate(modified_string):
                if character == character_to_replace:
                    modified_string[index] = replacement_character
        modified_string = "".join(modified_string)

        # Check if the substring exists in the modified string
        if substring in modified_string:
            # Substring found after replacements
            return True

    # If substring not found after all replacements
    return False

Big(O) Analysis

Time Complexity
O(3^r * n * m)Let n be the length of the string s, m be the length of the sub string sub, and r be the number of possible replacements. The brute force approach iterates through all possible combinations of replacements. For each character in the string s, there may be up to 3 replacements, so checking all possibilities gives a factor of approximately 3^r, where r is the number of possible replacements. For each of these combinations, we check for the existence of the substring of length m in the string s of length n using a naive string matching algorithm which requires O(n * m). Combining the replacement possibilities and the string matching results in a time complexity of O(3^r * n * m).
Space Complexity
O(K * L)The brute force approach explores all possible replacement combinations. If there are K possible replacements for a character in the string, and the substring's length is L, then in the worst-case, the algorithm might need to store K raised to the power of L possible modified strings to test if the substring exists. Each such string has a length proportional to the length of the main string which we will consider to be of the same order as the substring length L. The algorithm also stores a modified main string when exploring possible combinations for comparisons; the length of this modified string is bounded by the length of the input string, which is L. Thus the space complexity is O(K^L * L). We simplify K^L * L to K*L, because usually, brute force approaches in an interview setting are focused on simpler rules and may not attempt to replace many characters in the input string. This yields O(K*L) where K represents the number of replacements for a character in a substring of length L. Therefore we can represent the space complexity as O(K * L).

Optimal Solution

Approach

The fastest way to solve this involves transforming the 'mappings' into a more usable form. Then, instead of checking all possible combinations, we efficiently slide a 'window' across the main text, checking for possible matches based on the mappings.

Here's how the algorithm would work step-by-step:

  1. First, convert the replacement possibilities into a simple lookup table. This lets you quickly see what characters each original character can change into.
  2. Imagine a window that's exactly the size of the substring you're looking for. Slide this window across the main text, one character at a time.
  3. For each window position, compare each character in the window with the corresponding character in the substring.
  4. Use the lookup table to see if the characters match directly or if the character in the window can be changed into the character in the substring.
  5. If all characters in the window match (either directly or through a possible replacement), you've found a match!
  6. Keep sliding the window until you've checked every possible position in the main text.

Code Implementation

def match_substring_after_replacement(string, sub, mappings):
    replacement_map = {}
    for original_char, replacements in mappings:
        replacement_map[original_char] = set(replacements)

    string_length = len(string)
    substring_length = len(sub)

    for i in range(string_length - substring_length + 1):
        # Iterate through possible start positions of the substring.
        window = string[i:i + substring_length]
        match = True

        for j in range(substring_length):
            string_char = window[j]
            substring_char = sub[j]

            # Check if characters match directly or via replacement.
            if string_char != substring_char:
                if string_char not in replacement_map or substring_char not in replacement_map[string_char]:
                    match = False
                    break

        # If all characters match, return True.
        if match:
            return True

    return False

Big(O) Analysis

Time Complexity
O(m*k)Let 'm' be the length of the text and 'k' be the length of the substring. We iterate through the text with a sliding window of size 'k', resulting in 'm-k+1' iterations, which is approximately 'm' when m > k. For each window position, we compare each character in the window to the corresponding character in the substring. This character-by-character comparison takes O(k) time. Therefore, the overall time complexity is O(m*k) as we perform 'm' window shifts, each taking 'k' time to compare the characters.
Space Complexity
O(M)The dominant space usage comes from converting the 'mappings' into a lookup table, which could be a hash map or a similar data structure. This lookup table stores replacement possibilities for each character and in the worst case, could store data for all characters in mappings. If we consider M to be the total number of replacement mappings, the lookup table would require O(M) space. The sliding window approach itself uses constant space, only using variables to track the window's position. Therefore, the overall space complexity is O(M).

Edge Cases

CaseHow to Handle
s is null or emptyReturn false immediately as sub cannot be found in an empty string.
sub is null or emptyReturn true immediately, as an empty substring is always found.
mappings is null or emptyPerform a direct string comparison between s and sub without replacements.
s.length < sub.lengthReturn false immediately since sub cannot be contained in a shorter string.
mappings contains mappings to the same character (e.g., a -> b, b -> a)The algorithm should still function correctly, and potentially lead to longer computation, but should not error out.
sub contains characters not present in mappings at allThe algorithm will treat these characters as requiring an exact match.
Long s and sub strings to test performanceEnsure the solution's time complexity is considered for large inputs, avoiding brute force approaches if possible.
Mappings include characters not in s or subThe algorithm should handle these cases gracefully and simply ignore such mappings during the matching process.