Taro Logo

Word Pattern II

Medium
Dropbox logo
Dropbox
1 view
Topics:
StringsRecursionBacktracking

Given a pattern and a string s, find if s follows the same pattern.

Here follow means a full match, such that there is a bijection between a letter in pattern and a non-empty substring in s.

Example 1:

Input: pattern = "abab", s = "redblueredblue"
Output: true
Explanation: One possible mapping is: 
'a' -> "red"
'b' -> "blue"

Example 2:

Input: pattern = "aaaa", s = "asdasdasdasd"
Output: true
Explanation: One possible mapping is:
'a' -> "asd"

Example 3:

Input: pattern = "aabb", s = "xyzabcxzyabc"
Output: false

Constraints:

  • 1 <= pattern.length <= 20
  • 1 <= s.length <= 50
  • pattern and s contain only lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the pattern contain duplicate characters, and can the string 's' contain repeated substrings?
  2. Can the pattern or the string 's' be empty or null?
  3. Is there a maximum length for the pattern and the string 's'?
  4. If multiple valid mappings exist, do I need to return all of them, or is returning 'true' sufficient?
  5. Is a character in the pattern guaranteed to map to a unique substring, or can it map to an empty substring (i.e., can a pattern character be 'skipped')?

Brute Force Solution

Approach

The brute force strategy for this puzzle involves trying every possible mapping between letters and parts of the string. It's like trying all the different ways to decode a secret message until you find the right one. If a mapping doesn't work, we simply discard it and try something else.

Here's how the algorithm would work step-by-step:

  1. Start by choosing what part of the string the first letter in the pattern represents.
  2. Then, choose what part of the string the second letter in the pattern represents.
  3. Continue choosing parts of the string for each letter in the pattern.
  4. As you choose, keep track of which parts of the string have already been used so you don't accidentally use the same part for two different letters.
  5. If at any point you realize the string doesn't match the pattern (for example, a letter maps to two different parts of the string, or the chosen parts don't add up to the whole string), stop and go back to try a different choice.
  6. If you manage to assign a part of the string to every letter in the pattern and the whole string is used up, check if the mapping is consistent. That means making sure each letter always maps to the same part of the string and no part of the string maps to two different letters.
  7. If the mapping is consistent, you've found a solution! If not, keep trying different combinations until you either find a solution or run out of possibilities.

Code Implementation

def word_pattern_match(pattern, input_string):
    pattern_to_string = {}
    string_to_pattern = {}

    def backtrack(pattern_index, string_index):
        # Base case: Both pattern and string are fully matched.
        if pattern_index == len(pattern) and string_index == len(input_string):
            return True
        # If either pattern or string is exhausted while the other is not.
        if pattern_index == len(pattern) or string_index == len(input_string):
            return False

        current_pattern_character = pattern[pattern_index]

        # Try all possible substrings for the current pattern character.
        for i in range(string_index + 1, len(input_string) + 1):
            substring = input_string[string_index:i]

            # Check if the pattern character already has a mapping.
            if current_pattern_character in pattern_to_string:
                # If the existing mapping doesn't match the current substring, skip.
                if pattern_to_string[current_pattern_character] != substring:
                    continue
                # Continue with the next character in the pattern.
                if backtrack(pattern_index + 1, i):
                    return True
            else:
                # Ensure the substring is not already mapped to another pattern.
                if substring in string_to_pattern:
                    continue

                # Map the current pattern character to the current substring
                pattern_to_string[current_pattern_character] = substring
                string_to_pattern[substring] = current_pattern_character

                # Attempt to match the rest of the pattern and string.
                if backtrack(pattern_index + 1, i):
                    return True

                # Backtrack: Remove the mapping to explore other possibilities.
                del pattern_to_string[current_pattern_character]
                del string_to_pattern[substring]

        return False

    return backtrack(0, 0)

Big(O) Analysis

Time Complexity
O(m^n)The algorithm explores all possible mappings between characters in the pattern (of length n) and non-empty substrings of the string (of length m). In the worst case, each character in the pattern could map to any substring of the string. This leads to a branching factor of 'm' (the length of the string) for each character in the pattern. Since there are 'n' characters in the pattern, the total number of possibilities explored can grow up to m * m * ... * m (n times), resulting in a time complexity of O(m^n), where 'm' is the length of the string and 'n' is the length of the pattern.
Space Complexity
O(2^N)The algorithm uses a hash map to store the mappings between characters in the pattern and substrings of the string, and another set to track which substrings have already been used. In the worst case, the number of possible substrings can be exponential in relation to the length of the string (N). Furthermore, due to the recursive nature of the brute force search, the call stack could also reach a depth proportional to the number of characters in the pattern, where each level stores the mappings. Considering that each substring consumes space proportional to its length, and there are potentially an exponential number of substrings to consider, the overall auxiliary space complexity is O(2^N).

Optimal Solution

Approach

The goal is to determine if a pattern of letters can be matched to segments of a string. We solve this using a backtracking approach, where we explore possible mappings between letters and string segments, pruning branches when they don't lead to a valid solution.

Here's how the algorithm would work step-by-step:

  1. Start by trying to match the first letter of the pattern to different possible starting substrings of the given string.
  2. If a match is found, tentatively store this mapping (letter to substring).
  3. Move to the next letter in the pattern and try to find a corresponding substring in the remaining part of the input string.
  4. Before confirming any new mapping, check if either the current letter or the candidate substring are already used in another mapping. If there's a conflict, then abandon this matching attempt.
  5. If everything's consistent, temporarily confirm the mapping and proceed recursively to the next letter and remaining string.
  6. If the entire pattern and string are consumed successfully, a valid match is found, return true.
  7. If at any point, you cannot find a valid mapping (no substrings match or there's a conflict), undo the previous tentative mappings (backtrack) and try different substring lengths for the earlier letters.
  8. If all possibilities have been exhausted and no valid matching is found, return false.

Code Implementation

def word_pattern_match(pattern, input_string):
    pattern_map = {}
    string_map = {}

    def backtrack(pattern_index, string_index):
        # Base case: both pattern and string are exhausted.
        if pattern_index == len(pattern) and string_index == len(input_string):
            return True

        # If either pattern or string is exhausted while the other is not, it's a mismatch.
        if pattern_index == len(pattern) or string_index == len(input_string):
            return False

        current_pattern_char = pattern[pattern_index]

        # Try all possible substrings for the current character.
        for string_end_index in range(string_index + 1, len(input_string) + 1):
            substring = input_string[string_index:string_end_index]

            # Check if the mapping is consistent.
            if current_pattern_char in pattern_map:
                if pattern_map[current_pattern_char] != substring:
                    continue
            elif substring in string_map:
                continue

            # Tentatively add the mapping and proceed recursively.
            pattern_map[current_pattern_char] = substring
            string_map[substring] = current_pattern_char

            # Recursively check if the rest of the pattern matches the rest of the string.
            if backtrack(pattern_index + 1, string_end_index):
                return True

            # Backtrack: remove the mapping to explore other possibilities.
            del string_map[substring]
            del pattern_map[current_pattern_char]

        return False

    return backtrack(0, 0)

Big(O) Analysis

Time Complexity
O(n!)The backtracking algorithm explores all possible mappings between characters in the pattern and non-overlapping substrings in the input string. In the worst case, each character in the pattern could potentially map to any substring of the input string. The number of possible substrings of a string of length n is O(n^2), and the algorithm might need to try nearly all possible combinations. This leads to a complexity of approximately O(n!), where n is the length of the input string because for each pattern character, we may have to explore all substrings of the input string, and the number of possible mappings grows factorially. The algorithm prunes invalid branches early which can improve practical performance, but the worst-case complexity remains exponential.
Space Complexity
O(N)The primary space complexity comes from the recursion depth of the backtracking algorithm. In the worst-case scenario, the pattern might consist of distinct characters, and each recursive call attempts to map a pattern character to a single character substring. This leads to a maximum recursion depth proportional to the length of the string, denoted as N. Additionally, the algorithm uses two hash maps to store the mapping between characters in the pattern and substrings in the string, along with a set for reverse mapping validation, each potentially storing at most N entries in the worst case where each character in the pattern maps to a unique one-character substring. Therefore, the space complexity is dominated by the recursion depth and hash map sizes, leading to O(N) auxiliary space.

Edge Cases

CaseHow to Handle
pattern is null or empty stringReturn true if s is also empty; otherwise, return false.
s is null or empty stringReturn true if pattern is also empty; otherwise, return false.
pattern length is greater than s lengthReturn false, as there's no way to map pattern characters to non-empty substrings.
pattern and s have length 1Check if the single character matches; return true if they do, false if not.
All characters in pattern are the same and all characters in s are the sameDetermine if s can be divided into segments of equal characters to match length of the pattern.
Two different pattern characters map to the same substring in 's'The algorithm should detect conflicting mappings during the backtracking search and reject that branch.
A single pattern character maps to different substrings in 's'The algorithm should detect conflicting mappings during the backtracking search and reject that branch.
The pattern contains repeated characters and s does not have corresponding repeating segments.The backtracking algorithm will exhaust all possibilities and return false if no valid mapping is found.