Taro Logo

Word Ladder

Hard
Reddit logo
Reddit
7 views
Topics:
ArraysGraphs

A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that:

  • Every adjacent pair of words differs by a single letter.
  • Every si for 1 <= i <= k is in wordList. Note that beginWord does not need to be in wordList.
  • sk == endWord

Given two words, beginWord and endWord, and a dictionary wordList, return the number of words in the shortest transformation sequence from beginWord to endWord, or 0 if no such sequence exists.

Example 1:

Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]
Output: 5
Explanation: One shortest transformation sequence is "hit" -> "hot" -> "dot" -> "dog" -> cog", which is 5 words long.

Example 2:

Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]
Output: 0
Explanation: The endWord "cog" is not in wordList, therefore there is no valid transformation sequence.

Constraints:

  • 1 <= beginWord.length <= 10
  • endWord.length == beginWord.length
  • 1 <= wordList.length <= 5000
  • wordList[i].length == beginWord.length
  • beginWord, endWord, and wordList[i] consist of lowercase English letters.
  • beginWord != endWord
  • All the words in wordList are unique.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the words in `startWord`, `endWord`, and `wordList`, and what is the maximum number of words in `wordList`?
  2. Is the `wordList` guaranteed to contain only words of the same length as `startWord` and `endWord`?
  3. Are all words in lowercase, or can they contain uppercase letters or other characters?
  4. If there is no transformation sequence, should I return 0, or is there a different value I should return to indicate no path exists?
  5. Is it possible for `startWord` and `endWord` to be the same, and if so, what should the output be?

Brute Force Solution

Approach

The brute force strategy for the Word Ladder problem involves exploring every possible sequence of words to find the shortest transformation path. We start with the beginning word and try every possible one-letter change. We continue this process until we arrive at the ending word.

Here's how the algorithm would work step-by-step:

  1. Start with the beginning word.
  2. Consider every word in the given word list.
  3. For each word, check if it differs from the current word by only one letter.
  4. If it does, consider this a possible next word in our sequence.
  5. Keep trying every single word until you find the ending word, or you run out of possibilities.
  6. Keep a record of every path you tried, and how long each one was.
  7. Once you've tried every path, pick the shortest one to the ending word. If no paths exist, then no solution exists.

Code Implementation

def word_ladder_brute_force(begin_word, end_word, word_list):
    shortest_path_length = float('inf')
    all_paths = []

    def find_paths(current_word, current_path):
        nonlocal shortest_path_length

        current_path = current_path + [current_word]

        # If we found the end word, record path.
        if current_word == end_word:
            all_paths.append(current_path)
            shortest_path_length = min(shortest_path_length, len(current_path))
            return

        # Iterate over available words
        for next_word in word_list:
            if next_word not in current_path:
                difference_count = 0
                for i in range(len(current_word)):
                    if current_word[i] != next_word[i]:
                        difference_count += 1

                # Check if the next word is only one letter different
                if difference_count == 1:

                    find_paths(next_word, current_path)

    find_paths(begin_word, [])

    # If no path was found, the shortest path length will be infinity
    if shortest_path_length == float('inf'):
        return 0
    else:
        # Subtract one to get number of transformations instead of word count
        return shortest_path_length

Big(O) Analysis

Time Complexity
O(n!)The described brute force approach explores every possible sequence of words. In the worst-case scenario, we might consider every permutation of the word list. Starting with the beginWord, we explore all possible one-letter changes from the word list. This branching continues recursively for each valid word, potentially visiting all words in the list in different orders. The number of possible paths grows factorially with the number of words in the word list (n), leading to a time complexity of O(n!).
Space Complexity
O(N!)The brute force approach involves exploring every possible sequence of words, which can lead to storing a large number of paths. The explanation describes keeping a record of every path tried. In the worst-case scenario, the algorithm might explore all permutations of the word list to find a path to the ending word. Therefore, the space required to store these paths grows factorially with the number of words (N) in the word list, leading to a space complexity of O(N!).

Optimal Solution

Approach

The most efficient way to solve this puzzle is to explore words layer by layer, starting from the beginning word. We want to discover the shortest path to the end word by only changing one letter at a time. This is done by exploring words close to the current word before moving further away.

Here's how the algorithm would work step-by-step:

  1. Begin with the starting word and consider all words in the word list that differ by only one letter.
  2. Place these 'one-letter-different' words in a queue to explore later.
  3. Now, take the first word from the queue. Find *its* one-letter-different words, and add them to the queue if they haven't been explored yet.
  4. Repeat this process, step by step, like ripples spreading across water. Each ripple represents words one letter different from the previous ripple.
  5. Keep track of how many steps it takes to reach each word from the starting word.
  6. When you find the target word, the number of steps it took represents the shortest word ladder length.
  7. Also, make sure you don't revisit the same word twice; that would lead you down longer, unnecessary paths. Remove words from the original word list as you visit them to avoid this.

Code Implementation

from collections import deque

def word_ladder(begin_word, end_word, word_list):
    word_set = set(word_list)
    if end_word not in word_set:
        return 0

    queue = deque([(begin_word, 1)])
    visited = {begin_word}

    while queue:
        current_word, path_length = queue.popleft()

        if current_word == end_word:
            return path_length

        for i in range(len(current_word)):
            for char_code in range(ord('a'), ord('z') + 1):
                new_char = chr(char_code)
                new_word = current_word[:i] + new_char + current_word[i+1:]

                # Prevents cycles and redundant paths
                if new_word in word_set and new_word not in visited:

                    queue.append((new_word, path_length + 1))
                    visited.add(new_word)

    return 0

Big(O) Analysis

Time Complexity
O(M²N)The time complexity is driven by the breadth-first search (BFS). In the worst case, we might need to explore all possible words. Let N be the number of words in the wordList and M be the length of each word. For each word in the queue (up to N words), we iterate through the entire wordList again to find one-letter-different words, which takes O(N) time. Finding one-letter-different words involves comparing each word which takes O(M) time. Thus, within the BFS loop, finding the neighbors dominates as O(M*N) in the worst case. Since we potentially explore each word, the overall complexity becomes O(M*N * N) or O(M*N²). The average word length (M) can be significant, so O(M * N^2) is a reasonable reflection of what drives the cost.
Space Complexity
O(N)The algorithm uses a queue to store words to explore, and in the worst-case scenario, the queue could contain all the words in the word list. The algorithm also modifies the original word list by removing visited words to prevent revisits which means, at worst, we iterate through all words. Let N be the number of words in the word list; therefore, the space complexity is dominated by the queue and the modified word list, leading to O(N) auxiliary space.

Edge Cases

CaseHow to Handle
startWord is null or emptyReturn 0 immediately as no transformation can start without a start word.
endWord is null or emptyReturn 0 immediately as no transformation can end without an end word.
wordList is null or emptyReturn 0 immediately if wordList is empty and startWord != endWord, as there's nothing to transform to; if startWord == endWord, return 1.
startWord is equal to endWordReturn 1 immediately as no transformation is needed.
endWord is not present in the wordListReturn 0 immediately as the transformation is impossible.
wordList contains duplicate wordsThe algorithm treats duplicates as single entries, which doesn't affect correctness since we only need to know if a word is reachable, and duplicates don't change reachability.
No transformation sequence exists between startWord and endWordThe BFS will exhaust all possible paths and return 0 when the endWord is not found.
Integer overflow if wordList is extremely large and transformation sequence is longUse appropriate data types (e.g., long) for tracking transformation length to avoid potential integer overflow.