Taro Logo

Word Pattern

Easy
Dropbox logo
Dropbox
3 views
Topics:
Strings

Given a pattern and a string s, find if s follows the same pattern.

Here follow means a full match, such that there is a bijection between a letter in pattern and a non-empty word in s. Specifically:

  • Each letter in pattern maps to exactly one unique word in s.
  • Each unique word in s maps to exactly one letter in pattern.
  • No two letters map to the same word, and no two words map to the same letter.

Example 1:

Input: pattern = "abba", s = "dog cat cat dog"

Output: true

Explanation:

The bijection can be established as:

  • 'a' maps to "dog".
  • 'b' maps to "cat".

Example 2:

Input: pattern = "abba", s = "dog cat cat fish"

Output: false

Example 3:

Input: pattern = "aaaa", s = "dog cat cat dog"

Output: false

Constraints:

  • 1 <= pattern.length <= 300
  • pattern contains only lower-case English letters.
  • 1 <= s.length <= 3000
  • s contains only lowercase English letters and spaces ' '.
  • s does not contain any leading or trailing spaces.
  • All the words in s are separated by a single space.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the pattern string contain characters other than lowercase English letters?
  2. Can the input string `s` contain leading or trailing spaces, and how should I handle them?
  3. Is `s` guaranteed to contain only words separated by single spaces, or could there be multiple spaces between words?
  4. If the length of the pattern and the number of words in `s` don't match, should I return false immediately?
  5. Is the pattern and the words in `s` case sensitive?

Brute Force Solution

Approach

The brute force approach to the Word Pattern problem involves trying every possible association between letters in the pattern and words in the string. We aim to find one valid mapping where each letter corresponds to exactly one word and each word corresponds to exactly one letter. We exhaustively check each possible pairing until we either find a match or run out of possibilities.

Here's how the algorithm would work step-by-step:

  1. Start by picking the first letter in the pattern and trying to match it with the first word in the string.
  2. Then, pick the second letter in the pattern and try matching it with the next available word in the string, making sure this new match doesn't conflict with any previous matches.
  3. Continue matching letters and words, one at a time, always checking for conflicts with existing matches.
  4. If you reach a point where you can't find a valid word for a letter or a letter for a word without causing a conflict, backtrack. This means you undo the last match you made and try a different word for that letter.
  5. If you reach the end of both the pattern and the string without any conflicts, you've found a valid match! Return that this is a match.
  6. If you have exhausted all possible combinations of letters and words without finding a valid match, then it's not a pattern match.

Code Implementation

def word_pattern(pattern, string):
    words = string.split()
    if len(pattern) != len(words):
        return False

    letter_to_word = {}
    word_to_letter = {}

    # Iterate through each letter/word to check for conflicts
    for i, letter in enumerate(pattern):
        word = words[i]
        # Check if the letter already maps to a different word
        if letter in letter_to_word:
            if letter_to_word[letter] != word:
                return False
        # Check if the word already maps to a different letter
        elif word in word_to_letter:
            if word_to_letter[word] != letter:
                return False
        # If no conflict, create the new mapping
        else:
            letter_to_word[letter] = word
            word_to_letter[word] = letter

    return True

Big(O) Analysis

Time Complexity
O((W^P) * P)The brute force approach explores every possible mapping between pattern letters and words, which essentially tries assigning each of the P pattern letters to one of W words, where W is the number of words. This leads to W choices for the first pattern letter, W for the second, and so on, giving W^P possible mappings. For each mapping, we verify if it's consistent, which involves at most comparing all the letter-word mappings, taking O(P) time. Therefore, in the worst case, the time complexity is O((W^P) * P).
Space Complexity
O(N)The brute force approach, as described, relies on backtracking. The primary driver of space complexity is the recursion depth, which can grow proportionally to the length of the pattern (or the number of words, whichever is smaller). In the worst-case scenario, where each match attempt leads to a conflict requiring backtracking, the recursion stack will hold at most N frames, where N represents the length of the pattern. Additionally, hash maps (or similar data structures) are implicitly used to track the letter-to-word and word-to-letter mappings, potentially storing up to N mappings. Therefore, the auxiliary space is O(N).

Optimal Solution

Approach

The goal is to check if a pattern (like 'abba') matches the structure of a string of words (like 'dog cat cat dog'). We want to ensure each letter in the pattern consistently corresponds to a specific word, and vice versa. This is done by keeping track of the relationships between letters and words, ensuring there are no contradictions.

Here's how the algorithm would work step-by-step:

  1. Split the string of words into individual words.
  2. Check if the number of letters in the pattern matches the number of words. If they don't match, it's impossible to have a valid pattern.
  3. Create two dictionaries (or similar lookup structures). One will map letters to words, and the other will map words to letters.
  4. Go through the pattern and the words together, one by one.
  5. For each letter, check if it already has a corresponding word in the letter-to-word dictionary. If not, add the letter and the corresponding word to the dictionary.
  6. If the letter *does* already have a word in the dictionary, make sure it's the *same* word as the current word we're looking at. If not, the pattern doesn't match.
  7. Do the same thing in reverse: for each word, check if it already has a corresponding letter in the word-to-letter dictionary. If not, add the word and the corresponding letter to the dictionary.
  8. If the word *does* already have a letter in the dictionary, make sure it's the *same* letter as the current letter we're looking at. If not, the pattern doesn't match.
  9. If you make it through the entire pattern and word list without finding any mismatches, then the pattern matches the string of words!

Code Implementation

def word_pattern(pattern, input_string): 
    words = input_string.split()

    if len(pattern) != len(words):
        return False

    letter_to_word = {}
    word_to_letter = {}

    for i in range(len(pattern)): 
        letter = pattern[i]
        word = words[i]

        # Ensures pattern and string are consistent
        if letter in letter_to_word: 
            if letter_to_word[letter] != word:
                return False
        else:
            letter_to_word[letter] = word

        # Ensures string and pattern are consistent
        if word in word_to_letter:
            if word_to_letter[word] != letter:
                return False
        else:
            word_to_letter[word] = letter

    return True

Big(O) Analysis

Time Complexity
O(n)The primary operation involves iterating through the pattern and the array of words once. Let n be the length of the pattern (which is equal to the number of words after splitting the input string). Inside the loop, dictionary lookups and insertions take O(1) time on average. Therefore, the dominant cost is determined by the single iteration over n elements, leading to a time complexity of O(n).
Space Complexity
O(M)The primary space complexity comes from the two dictionaries (or similar lookup structures) used to map letters to words and words to letters. In the worst-case scenario, where each letter in the pattern maps to a unique word and each word maps to a unique letter, the dictionaries can store up to M key-value pairs, where M is the number of unique words in the input string. Therefore, the space used by these dictionaries is proportional to the number of unique words. Other variables used have constant space usage.

Edge Cases

CaseHow to Handle
Empty pattern or empty string sReturn true if both pattern and s are empty, and false if only one is empty, as that means they cannot match the defined pattern.
Pattern is longer than the number of words in sReturn false because there aren't enough words in s to correspond to each character in the pattern.
Number of words in s is greater than the length of the patternReturn false because there are too many words in s, and they can't possibly match the pattern exactly.
One pattern character maps to multiple different words in sThe hash map should prevent this; if we try to map the same pattern character to a different word, return false.
One word in s maps to multiple different characters in patternUse a second hash map for reverse mapping to ensure that each word maps back to a unique character, otherwise, return false.
Pattern contains duplicate characters, but s contains unique words.The solution must allow the same character in the pattern to map to the same word in s, while maintaining the one-to-one correspondence.
s contains duplicate words, but pattern contains unique characters.The solution must allow different characters in the pattern to map to different words in s while ensuring words map back to unique characters.
Very long pattern and string (scaling concerns)The hash map approach has a time complexity of O(N) where N is the maximum length of the pattern or s, making it scalable even with large inputs.