Taro Logo

Number of Ways to Form a Target String Given a Dictionary

Hard
Google logo
Google
2 views
Topics:
Dynamic ProgrammingStrings

You are given a list of strings of the same length words and a string target. Your task is to form target using the given words under the following rules:

  1. target should be formed from left to right.
  2. To form the i<sup>th</sup> character (0-indexed) of target, you can choose the k<sup>th</sup> character of the j<sup>th</sup> string in words if target[i] = words[j][k].
  3. Once you use the k<sup>th</sup> character of the j<sup>th</sup> string of words, you can no longer use the x<sup>th</sup> character of any string in words where x <= k. In other words, all characters to the left of or at index k become unusable for every string.
  4. Repeat the process until you form the string target.

Notice that you can use multiple characters from the same string in words provided the conditions above are met.

Return the number of ways to form target from words. Since the answer may be too large, return it modulo 109 + 7.

For example:

words = [“acca”,“bbbb”,“caca”], target = “aba”

Output: 6

words = [“abba”,“baab”], target = “bab”

Output: 4

Explain an efficient algorithm to solve this problem. What are the time and space complexities? Consider edge cases.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the length of the `target` string and the words in the `words` array? Specifically, what is the maximum length for each?
  2. Can the `words` array be empty? Can any word in `words` be an empty string? What should I return if the `target` string is empty?
  3. Are the strings in `words` and `target` guaranteed to contain only lowercase English letters, or can they contain other characters?
  4. If there are multiple ways to form the target string, and the number of ways is very large, is there a maximum value I should return, or a specific modulo to use to prevent integer overflow?
  5. If a word in `words` can be used multiple times to form the `target`, is that allowed?

Brute Force Solution

Approach

The brute force approach involves trying every possible combination of characters from the given word list to build the target string. We explore all paths, one character at a time, to see if we can construct the target. We count the number of successful paths.

Here's how the algorithm would work step-by-step:

  1. Start with the first character of the target string.
  2. For that character, go through each word in the dictionary.
  3. If a word has the character in its first column, consider that word to be the starting point for one possible way to create the string.
  4. If a word has the character in its second column, consider that word to be the starting point for one possible way to create the string.
  5. Repeat this for all columns of each word in the dictionary.
  6. Now, move to the second character of the target string.
  7. For each starting word, repeat the same process of finding words that have the second character in the correct columns.
  8. Keep doing this for each character in the target string, building all possible paths.
  9. If a path successfully builds the entire target string, increase the count.
  10. After exploring all paths, the final count represents the total number of ways to form the target string.

Code Implementation

def number_of_ways_brute_force(
    word_list, target_string
):
    number_of_ways = 0

    def backtrack(
        target_index,
        current_path,
    ):
        nonlocal number_of_ways

        # If we've built the entire target string,
        # increment the count
        if target_index == len(target_string):
            number_of_ways += 1
            return

        for word in word_list:
            # Iterate through each column of the word
            for column_index in range(len(word)):
                # Check if the character at
                # current column matches
                if (
                    column_index < len(word)
                    and target_string[target_index] == word[column_index]
                ):

                    # Explore the next character
                    backtrack(
                        target_index + 1,
                        current_path + [word[column_index]],
                    )

    backtrack(0, [])

    return number_of_ways

Big(O) Analysis

Time Complexity
O(M * N * L)The brute force approach explores all possible paths to construct the target string. The primary driver of complexity is the nested iteration through the target string's characters (N), the words in the dictionary (M), and the columns in each word (L). For each character in the target, we examine each word in the dictionary and each column within that word to see if it matches. Therefore, the time complexity is proportional to M * N * L, where M is the number of words in the dictionary, N is the length of the target string, and L is the maximum length of a word in the dictionary.
Space Complexity
O(M*T)The dominant space complexity comes from the implicit recursion stack. The maximum depth of the recursion is the length of the target string (T). At each level of recursion, we are iterating through the columns of the words in the dictionary. Given the word list, the number of columns we can visit is at most the length of the longest word. Therefore, the maximum number of active stack frames can grow to the target string length T, and at each level, we potentially have space to consider all columns M. This leads to a space complexity of O(M*T), where M is the maximum length of a word in the dictionary, and T is the length of the target string. Therefore the space is bounded by O(M*T).

Optimal Solution

Approach

We want to count how many ways we can build our target string by picking characters from the given dictionary words. The clever trick is to use a system where we remember results we've already calculated, so we don't repeat the same work over and over. This makes the whole process much faster.

Here's how the algorithm would work step-by-step:

  1. First, figure out how many times each letter appears at each position across all the dictionary words. This will help us quickly see our options.
  2. Now, we'll build the target string one character at a time. Start with the first character.
  3. For each character in the target string, go through the dictionary words and see if that character appears at the right position.
  4. If it does, that means we can use that dictionary word's character to build the target string. We need to remember how many ways we can build the rest of the target string starting from the next character, after using this dictionary word's character.
  5. This is the key part: if we've already calculated the number of ways to build the rest of the target string from a certain point, we just look up that answer instead of recalculating it. This is called remembering intermediate results.
  6. We keep doing this until we reach the end of the target string. The final result is the total number of ways we found to build the whole target string from the dictionary words.

Code Implementation

def number_of_ways_to_form_target(
    dictionary_words, target_string):

    word_length = len(dictionary_words[0])
    target_length = len(target_string)
    modulo = 10**9 + 7

    # Stores char frequencies at each index.
    character_counts = [([0] * 26) for _ in range(word_length)]

    for word in dictionary_words:
        for index, char in enumerate(word):
            character_counts[index][ord(char) - ord('a')] += 1

    memo = {}

    def calculate_ways(index_in_target, index_in_word):
        if index_in_target == target_length:
            return 1

        if index_in_word == word_length:
            return 0

        if (index_in_target, index_in_word) in memo:
            return memo[(index_in_target, index_in_word)]

        ways = calculate_ways(index_in_target,
                              index_in_word + 1) % modulo

        target_char_index =
        ord(target_string[index_in_target]) - ord('a')

        #If character can form target.
        if character_counts[index_in_word][target_char_index] > 0:

            ways += (calculate_ways(index_in_target + 1,
                                     index_in_word + 1) *
                     character_counts[index_in_word][target_char_index])
            ways %= modulo

        memo[(index_in_target, index_in_word)] = ways
        return ways

    # Initiate recursive calls.
    result = calculate_ways(0, 0)
    return result

Big(O) Analysis

Time Complexity
O(m*n)The solution involves calculating character counts, which takes O(L*W) where L is the number of words in the dictionary and W is the length of the words (assumed to be bounded by the total number of chars). The core of the algorithm uses dynamic programming where we have a table of size m*n (target length * word length). Filling each cell involves a constant time lookup of letter counts. Thus, the overall complexity is dominated by the dynamic programming which is O(m*n). The term L*W is less significant when compared to m*n in most cases, so the overall runtime can be approximated to O(m*n).
Space Complexity
O(M*L)The auxiliary space is dominated by the memoization table used to store intermediate results. This table has dimensions corresponding to the length of the target string (L) and the maximum length of the words in the dictionary plus one (M), where the plus one is to deal with the indexing issues when we finish considering characters from our target string. Thus, we have a table of size approximately M*L. Other variables used have constant space and do not affect the overall space complexity.

Edge Cases

CaseHow to Handle
Empty words arrayIf the words array is empty, there are zero ways to form the target string; return 0.
Empty target stringIf the target string is empty, there is one way to form it (by selecting nothing); return 1.
Target string longer than the longest word multiplied by number of wordsIf the target string is longer than the maximum possible length that can be formed, return 0.
Words array contains empty stringsEmpty strings in words should be ignored or handled as characters that cannot contribute to forming the target.
Characters in target string not present in any of the wordsIf a character in the target string is not present in any of the given words at appropriate indices, the result should be 0.
Integer overflow when calculating number of waysUse modulo operation throughout the calculation to prevent integer overflow (e.g., modulo 10^9 + 7).
Words with varying lengthsHandle words of different lengths by ensuring only valid indices are accessed when matching characters.
Very large input sizes for words and target that can cause stack overflow during recursionUse dynamic programming (tabulation) instead of recursion to avoid stack overflow errors.