
Word Ladder II

Hard
Uber
Topics:
Graphs, Strings

A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that:

  1. Every adjacent pair of words differs by a single letter.
  2. Every si for 1 <= i <= k is in wordList. Note that beginWord does not need to be in wordList.
  3. sk == endWord

Given two words, beginWord and endWord, and a dictionary wordList, return all the shortest transformation sequences from beginWord to endWord, or an empty list if no such sequence exists. Each sequence should be returned as a list of the words [beginWord, s1, s2, ..., sk].

For example:

Example 1:

Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]
Output: [["hit","hot","dot","dog","cog"],["hit","hot","lot","log","cog"]]

Explanation: There are 2 shortest transformation sequences:

"hit" -> "hot" -> "dot" -> "dog" -> "cog"

"hit" -> "hot" -> "lot" -> "log" -> "cog"

Example 2:

Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]
Output: []

Explanation: The endWord "cog" is not in wordList, therefore there is no valid transformation sequence.

Explain your approach and provide the code.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Are all words in the wordList and the beginWord/endWord the same length? Are they all lowercase?
  2. What should I return if no transformation sequence exists between beginWord and endWord?
  3. Is the wordList guaranteed to contain only unique words, or can there be duplicates?
  4. Are beginWord and endWord guaranteed to be different, or could they be the same? Should I return an empty list in that case?
  5. If multiple shortest transformation sequences exist, can I return them in any order?

Brute Force Solution

Approach

The goal is to find all the shortest sequences of words that transform a starting word into an ending word, changing only one letter at a time. The brute force approach explores every possible word sequence until all valid transformations are found.

Here's how the algorithm would work step-by-step:

  1. Start with the start word.
  2. Consider every possible word from the given word list that differs from the current word by only one letter.
  3. For each of those words, check if it is the end word. If so, you've found a possible transformation sequence.
  4. If it's not the end word, add it to the current sequence and repeat the process by finding every possible one-letter-different word from the word list, again checking if any of these new words are the end word.
  5. Continue this process, exploring every possible sequence of words until all possible paths to the end word are found.
  6. Keep track of all the valid sequences from the start word to the end word.
  7. From all the valid sequences, select only those with the shortest length. These are your final solutions.

Code Implementation

def find_ladders_brute_force(start_word, end_word, word_list):
    all_sequences = []
    shortest_length = float('inf')

    def find_sequences(current_word, current_sequence):
        nonlocal shortest_length, all_sequences

        # If we've found the end word, add the sequence
        if current_word == end_word:
            if len(current_sequence) <= shortest_length:
                if len(current_sequence) < shortest_length:
                    all_sequences = []
                    shortest_length = len(current_sequence)
                all_sequences.append(current_sequence.copy())
            return

        # Explore neighbor words.
        for next_word in word_list:
            if differ_by_one_letter(current_word, next_word):

                # Avoid cycles by only exploring unvisited words
                if next_word not in current_sequence:
                    current_sequence.append(next_word)
                    find_sequences(next_word, current_sequence)
                    current_sequence.pop()

    def differ_by_one_letter(word1, word2):
        if len(word1) != len(word2):
            return False
        difference_count = 0
        for i in range(len(word1)):
            if word1[i] != word2[i]:
                difference_count += 1
        return difference_count == 1

    find_sequences(start_word, [start_word])
    return all_sequences

Big(O) Analysis

Time Complexity
O(N! * N * L)

The brute force recursion enumerates every simple path through the word list. In the worst case, where many words are mutually one letter apart, the first step can choose up to N words, the next up to N - 1, and so on, so the number of simple paths grows factorially in N. At each step the algorithm also scans the entire word list and compares each candidate letter by letter in O(L) time, adding an O(N * L) factor per step. The result is an exponential bound on the order of O(N! * N * L), where N is the number of words and L is the word length.
Space Complexity
O(N)

Excluding the output, the auxiliary space is the recursion stack and the current sequence, each of which holds at most N words, giving O(N) auxiliary space where N is the number of words in the word list. The collected result can be far larger: each stored sequence holds up to N words and there can be exponentially many shortest sequences, so output space dominates in pathological inputs. The word list itself is input space, not auxiliary space.

Optimal Solution

Approach

The Word Ladder II problem asks for all shortest transformation sequences between two words, changing one letter at a time, using a provided word list. The optimal approach leverages Breadth-First Search (BFS) to find shortest paths and keeps track of all possible paths that lead to a solution. This ensures that we only explore paths that are potentially the shortest, avoiding longer paths that would lead to unnecessary computation.

Here's how the algorithm would work step-by-step:

  1. Start by exploring outward from the beginning word, one letter change at a time.
  2. Imagine each word as a node in a graph and each possible one-letter change as a connection between words.
  3. Use a Breadth-First Search to explore this graph layer by layer.
  4. As you explore, keep track of the shortest distance from the beginning word to each word you encounter.
  5. If you find the ending word, stop exploring further away than that distance because those paths cannot be shortest.
  6. Also, store the path (or paths) that led you to each word, allowing you to reconstruct the transformation sequences later.
  7. During BFS, when expanding a word, consider only words from the word list that differ by one letter and have not already been reached at a strictly shorter distance. Allowing a word to be reached by several parents at the same distance is what preserves every shortest path, while the distance check prevents cycles and longer detours.
  8. After the BFS is complete, trace back from the ending word to the beginning word using the stored paths to generate all the shortest transformation sequences.

Code Implementation

from collections import deque

def find_ladders(begin_word, end_word, word_list):
    word_set = set(word_list)
    if end_word not in word_set:
        return []

    # Precompute every one-letter neighbor for each word, including begin_word.
    neighbors = {}
    word_set.add(begin_word)
    for word in word_set:
        neighbors[word] = []

        for i in range(len(word)):
            for char_code in range(ord('a'), ord('z') + 1):
                new_char = chr(char_code)
                new_word = word[:i] + new_char + word[i+1:]
                if new_word in word_set and new_word != word:
                    neighbors[word].append(new_word)

    queue = deque([(begin_word, [begin_word])])
    visited = {begin_word: 1}
    result = []
    min_length = float('inf')

    while queue:
        current_word, current_path = queue.popleft()

        if len(current_path) > min_length:
            continue

        if current_word == end_word:
            # BFS explores paths in nondecreasing length, so every path
            # that reaches here is a shortest transformation sequence.
            min_length = len(current_path)
            result.append(current_path)
            continue

        # Explore neighbors; allow a word to be re-reached at the same
        # depth so that every distinct shortest path is preserved.
        for neighbor in neighbors[current_word]:
            if neighbor not in visited or visited[neighbor] >= len(current_path) + 1:
                visited[neighbor] = len(current_path) + 1
                queue.append((neighbor, current_path + [neighbor]))

    return result
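Building the neighbor map above tries all 26 letters at every position of every word. A common alternative groups words by wildcard patterns such as "h*t", so that any two distinct words sharing a pattern are one-letter neighbors. This is a sketch, not part of the solution above; the function name build_neighbors_by_pattern is illustrative, and it assumes equal-length lowercase words:

```python
from collections import defaultdict

def build_neighbors_by_pattern(words):
    # Bucket words by each wildcard pattern, e.g. "hot" -> "*ot", "h*t", "ho*".
    buckets = defaultdict(list)
    for word in words:
        for i in range(len(word)):
            buckets[word[:i] + '*' + word[i + 1:]].append(word)

    # Two distinct words are one-letter neighbors iff they share a bucket.
    neighbors = defaultdict(set)
    for bucket_words in buckets.values():
        for word in bucket_words:
            neighbors[word].update(w for w in bucket_words if w != word)
    return neighbors
```

For example, build_neighbors_by_pattern(["hot", "dot", "dog", "lot"]) puts "hot", "dot", and "lot" into the shared "*ot" bucket, so each becomes a neighbor of the other two.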

Big(O) Analysis

Time Complexity
O(M * L^2 + P * M)

Let M be the size of wordList and L the word length. Building the neighbor map tries 26 alternatives at each of the L positions for every word, and constructing and hashing each candidate string costs O(L), giving O(M * L^2) overall (the alphabet size is a constant factor). The BFS itself copies the current path on every enqueue; a word can be enqueued once per distinct shortest path that reaches it, so with P shortest-path prefixes the path copying adds up to O(P * M). P can be exponential in the worst case, so the output size dominates the running time.
Space Complexity
O(P * M)

Each queue entry stores a full copy of its path, up to M words long, and a word can be enqueued once per distinct shortest path that reaches it, so the queue can hold O(P * M) words, where M is the size of wordList and P is the number of shortest-path prefixes; P can be exponential in the worst case. The neighbor map adds up to 25 * L neighbor references per word and the visited dictionary adds O(M). Recording only each word's predecessors instead of full paths, as described in the approach, reduces the path storage to O(M * L) plus the size of the output.
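The implementation above carries a full path copy in every queue entry. Steps 6 and 8 of the approach can also be implemented more literally: run the BFS level by level, record every shortest-path predecessor of each word, then backtrack from endWord. A sketch under the same assumptions (equal-length lowercase words); find_ladders_backtrack is an illustrative name:

```python
from collections import defaultdict

def find_ladders_backtrack(begin_word, end_word, word_list):
    word_set = set(word_list)
    if end_word not in word_set:
        return []
    word_set.discard(begin_word)

    # BFS level by level, recording every shortest-path predecessor.
    parents = defaultdict(set)
    level = {begin_word}
    found = False
    while level and not found:
        next_level = defaultdict(set)
        for word in level:
            for i in range(len(word)):
                for char in 'abcdefghijklmnopqrstuvwxyz':
                    candidate = word[:i] + char + word[i + 1:]
                    if candidate in word_set:
                        next_level[candidate].add(word)
        # Words reached at this level cannot appear in deeper levels.
        word_set -= set(next_level)
        for child, preds in next_level.items():
            parents[child] |= preds
        found = end_word in next_level
        level = set(next_level)

    # Walk the parent links backwards from end_word to begin_word.
    results = []

    def backtrack(word, suffix):
        if word == begin_word:
            results.append([begin_word] + suffix)
            return
        for parent in parents[word]:
            backtrack(parent, [word] + suffix)

    if found:
        backtrack(end_word, [])
    return results
```

Because only predecessor links are stored during the search, the exponential blow-up of path copies is deferred to the final backtracking, which touches only paths that actually appear in the output.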

Edge Cases

  1. beginWord or endWord is null or empty: Return an empty list, since no transformation is possible.
  2. wordList is null or empty: Return an empty list, indicating no possible transformations.
  3. beginWord is not in the wordList: Add beginWord to the wordList so it can start a valid sequence, or return an empty list if the requirements explicitly disallow this.
  4. endWord is not in the wordList: Return an empty list, since endWord must be present to reach a valid word ladder.
  5. beginWord and endWord are the same: If wordList contains beginWord, return a list containing only beginWord; otherwise, return an empty list.
  6. No possible word ladder exists: The BFS explores all reachable words, and if endWord is never reached, an empty list is returned.
  7. Very large wordList causing memory issues: Consider an iterative deepening depth-first search (IDDFS) to limit memory consumption if BFS becomes problematic.
  8. Word ladder length exceeds a maximum allowed length: If a maximum length is specified, prune BFS paths exceeding it to improve performance.
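The IDDFS mitigation mentioned in the edge cases can be sketched as follows. This is a memory-light alternative, not the recommended optimal solution; find_ladders_iddfs and the max_depth cap are illustrative. The depth limit grows until the first (hence shortest) sequences appear, so only the current path lives on the stack at any time:

```python
def find_ladders_iddfs(begin_word, end_word, word_list, max_depth=20):
    word_set = set(word_list)
    if end_word not in word_set:
        return []

    def neighbors(word):
        # Generate all one-letter variants of word that are in the list.
        for i in range(len(word)):
            for char in 'abcdefghijklmnopqrstuvwxyz':
                candidate = word[:i] + char + word[i + 1:]
                if candidate != word and candidate in word_set:
                    yield candidate

    # Deepen the limit until the first (shortest) sequences appear.
    for limit in range(1, max_depth + 1):
        results = []

        def dfs(word, path):
            if word == end_word:
                results.append(path[:])
                return
            if len(path) > limit:
                return
            for nxt in neighbors(word):
                if nxt not in path:
                    path.append(nxt)
                    dfs(nxt, path)
                    path.pop()

        dfs(begin_word, [begin_word])
        if results:
            return results
    return []
```

The trade-off is time for memory: each deepening pass re-explores the shallower levels, but peak memory is bounded by one path of at most limit + 1 words.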