Taro Logo

Word Ladder

Hard
Amazon logo
Amazon
2 views
Topics:
GraphsStrings

Given two words, beginWord and endWord, and a dictionary wordList, determine the shortest transformation sequence from beginWord to endWord. The transformation must adhere to these rules:

  1. Only one letter can be changed at a time.
  2. Each transformed word must exist in the wordList (except for the beginWord).

Return the number of words in the shortest transformation sequence, or 0 if no such sequence exists.

For example:

  • beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"] Output: 5 Explanation: One shortest transformation sequence is "hit" -> "hot" -> "dot" -> "dog" -> "cog", which is 5 words long.

  • beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"] Output: 0 Explanation: The endWord "cog" is not in wordList, therefore there is no valid transformation sequence.

Write a function to efficiently solve this problem.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Are the words in the dictionary all the same length as the startWord and endWord?
  2. Is the dictionary guaranteed to contain only lowercase alphabetic characters?
  3. If no transformation sequence exists, should I return 0, or is there another specified return value?
  4. Is the startWord guaranteed to be different from the endWord?
  5. Does the dictionary contain the startWord and/or the endWord initially?

Brute Force Solution

Approach

The brute force approach to the Word Ladder problem is like trying every single possible path between the start and end word. We explore all possible word sequences, checking if each word differs by only one letter from the previous word. We continue until we either find the end word or have exhausted all possibilities.

Here's how the algorithm would work step-by-step:

  1. Start with the start word.
  2. Look at all the words in the dictionary and see which ones differ from the current word by only one letter.
  3. For each of these one-letter-different words, consider it as the next word in a possible sequence.
  4. Repeat the process for each of these 'next' words, finding all the one-letter-different words from them, and so on.
  5. Keep going until you find a sequence that leads to the end word. If you find it, you have a possible solution.
  6. Because we want the shortest sequence, also remember to check all the other possible paths and keep the shortest one.
  7. If you've checked all possible word sequences and haven't found the end word, there's no solution.

Code Implementation

def word_ladder_brute_force(begin_word, end_word, word_list):
    word_set = set(word_list)
    if end_word not in word_set:
        return 0

    queue = [(begin_word, 1)]
    visited = {begin_word}

    while queue:
        current_word, current_length = queue.pop(0)

        if current_word == end_word:
            return current_length

        # Iterate through all possible next words
        for next_word in word_list:
            if next_word not in word_set:
                continue

            if next_word in visited:
                continue

            difference_count = 0
            for i in range(len(current_word)):
                if current_word[i] != next_word[i]:
                    difference_count += 1

            # Only proceed if words are 1 character different
            if difference_count == 1:
                queue.append((next_word, current_length + 1))
                visited.add(next_word)

    return 0

Big(O) Analysis

Time Complexity
O((26^L) * N)In the worst-case brute force scenario, where L is the length of the words and N is the number of words in the wordList, we explore every possible word of length L formed by changing one letter at a time. Each word position has 26 possibilities which result in 26^L maximum possibilities. Then, for each generated word, we must check if it exists in the wordList. That membership check is O(N). This results in a time complexity of O((26^L) * N). Because it does not account for the best case where we may find a solution earlier, the worst-case is that we traverse all possibilities.
Space Complexity
O(N)The brute force approach to Word Ladder described maintains a queue (implicitly in steps 2-4) to explore possible word sequences. In the worst-case scenario, this queue could contain all the words in the dictionary except the start word. Therefore, the auxiliary space used by the queue is proportional to the number of words in the dictionary, denoted as N. Additionally, a data structure (such as a set) may be needed to track visited words to avoid cycles, which could also grow up to size N in the worst case. Hence the space complexity is O(N).

Optimal Solution

Approach

The problem is to find the shortest sequence of words that transforms one word into another, changing only one letter at a time. We'll use a method that explores all possible paths simultaneously, branching out from the starting word until we find the target word, ensuring the shortest route is found first.

Here's how the algorithm would work step-by-step:

  1. Start with the initial word and envision it as the starting point of our search.
  2. Consider all possible one-letter changes that can be made to the current word.
  3. Check if these changed words are valid words in our word list and haven't already been explored.
  4. If a valid, unexplored word is found, consider it as the next step in a potential sequence.
  5. Repeat this process for each newly discovered word, exploring all its possible one-letter transformations.
  6. Continue until the target word is found, or until all possible paths have been explored.
  7. The first sequence that reaches the target word is the shortest possible sequence.
  8. If no sequence is found, it means no such transformation is possible.

Code Implementation

from collections import deque

def word_ladder(begin_word, end_word, word_list):
    word_set = set(word_list)
    if end_word not in word_set:
        return 0

    queue = deque([(begin_word, 1)])
    visited = {begin_word}

    while queue:
        current_word, path_length = queue.popleft()

        if current_word == end_word:
            return path_length

        # Iterate through all possible one-letter variations
        for i in range(len(current_word)):
            for char_code in range(ord('a'), ord('z') + 1):
                new_char = chr(char_code)
                new_word = current_word[:i] + new_char + current_word[i+1:]

                # Check for valid words that can be enqueued
                if new_word in word_set and new_word not in visited:
                    # Prevent cycles by tracking visited words
                    visited.add(new_word)

                    queue.append((new_word, path_length + 1))

    return 0

Big(O) Analysis

Time Complexity
O(M² * N)Let M be the length of each word and N be the size of the word list. For each word visited in the breadth-first search (BFS), we iterate through all possible one-letter changes, which takes O(M) time (trying M positions, each with 26 possible characters). For each of these M possible words, we must check if it's in the word list, which could take O(N) time using a linear search or O(log N) with a set. However, within the BFS, in the worst case, we might enqueue all N words. In the worst case scenario the algorithm could iterate through all possible words in the dictionary multiple times while exploring different paths, but the single one-letter changes within each word takes O(M) time and checking against the dictionary takes O(N). Thus, the total time complexity can be expressed as O(M² * N), which is due to creating all possible one-letter changes and verifying their existence in the wordList.
Space Complexity
O(N)The algorithm utilizes a queue to store words to explore, and in the worst case, this queue could contain all the words in the word list. We also need a set or similar data structure (implicitly described as 'keeping track of visited locations') to store the visited words to avoid cycles, which, in the worst case, will also contain all the words. Therefore, the auxiliary space used is proportional to the number of words in the word list, N, where N is the total number of words in the input dictionary. This means the algorithm has a space complexity of O(N).

Edge Cases

CaseHow to Handle
startWord or endWord is null or emptyReturn 0 immediately as no valid transformation is possible.
dictionary is null or emptyIf endWord equals startWord, return 1, otherwise return 0 as no transformation is possible.
startWord and endWord are the sameReturn 1 if startWord equals endWord as no transformation is needed.
endWord is not present in the dictionaryReturn 0 as a transformation to a non-existent word is impossible.
dictionary contains words of different lengthsThe one-letter-difference check will effectively ignore words of differing length; continue search with valid length words only.
No transformation sequence exists between startWord and endWordBFS search will exhaust all possibilities and return 0 when the queue is empty and the endWord hasn't been reached.
Dictionary contains duplicate wordsTreat duplicates as a single entry as they offer no benefit to the shortest path.
Large dictionary size could lead to memory or time complexity issues with BFS.Ensure algorithm uses efficient data structures and early exit conditions to minimize the search space.