Taro Logo

Word Ladder II

Medium
4 views
2 months ago

A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that:

  1. Every adjacent pair of words differs by a single letter.
  2. Every si for 1 <= i <= k is in wordList. Note that beginWord does not need to be in wordList.
  3. sk == endWord

Given two words, beginWord and endWord, and a dictionary wordList, return all the shortest transformation sequences from beginWord to endWord, or an empty list if no such sequence exists. Each sequence should be returned as a list of the words [beginWord, s1, s2, ..., sk].

For example:

Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"] Output: [["hit","hot","dot","dog","cog"],["hit","hot","lot","log","cog"]]

There are 2 shortest transformation sequences: "hit" -> "hot" -> "dot" -> "dog" -> "cog" "hit" -> "hot" -> "lot" -> "log" -> "cog"

Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"] Output: []

Can you provide a Python function that implements this and explain the time and space complexity?

Sample Answer
from collections import defaultdict

def findLadders(beginWord: str, endWord: str, wordList: list[str]) -> list[list[str]]:
    # Convert wordList to a set for faster lookup
    wordList = set(wordList)
    if endWord not in wordList:
        return []

    # Build the adjacency list (graph) for words that differ by one letter
    adj = defaultdict(list)
    all_words = wordList | {beginWord}
    for word1 in all_words:
        for word2 in all_words:
            if word1 != word2 and differ_by_one_letter(word1, word2):
                adj[word1].append(word2)

    # BFS to find the shortest paths
    queue = [(beginWord, [beginWord])]
    visited = {beginWord}
    shortest_length = float('inf')
    result = []

    while queue:
        curr_word, path = queue.pop(0)

        if len(path) > shortest_length:
            break

        if curr_word == endWord:
            shortest_length = len(path)
            result.append(path)
            continue

        for neighbor in adj[curr_word]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [neighbor]))

    # Filter result to only include shortest paths
    shortest_paths = []
    for path in result:
        if len(path) == shortest_length:
            shortest_paths.append(path)

    return shortest_paths


def differ_by_one_letter(word1: str, word2: str) -> bool:
    if len(word1) != len(word2):
        return False
    diff_count = 0
    for i in range(len(word1)):
        if word1[i] != word2[i]:
            diff_count += 1
    return diff_count == 1

# Example usage
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]
print(findLadders(beginWord, endWord, wordList))

beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log"]
print(findLadders(beginWord, endWord, wordList))

Naive Approach

The most straightforward approach would be to perform a Breadth-First Search (BFS) on the graph of words. Each word represents a node, and an edge exists between two words if they differ by only one letter. The BFS would start at the beginWord and explore layer by layer until it finds the endWord. This approach finds all possible paths but may not be efficient in finding the shortest paths without further optimization.

Optimal Approach

The optimized approach still uses BFS but incorporates several key improvements:

  1. Precompute Adjacency List: Build an adjacency list (graph) representing connections between words that differ by one letter. This avoids redundant computations during the BFS.
  2. Early Termination: During BFS, maintain the shortest length found so far. If the current path's length exceeds the shortest length, terminate the exploration of that path.
  3. Track Visited Words: Keep track of visited words to avoid cycles and redundant paths, but ensure that words can be revisited if they lead to a shorter path.
  4. Filter Shortest Paths: After the BFS, filter the results to only include paths that have the shortest length.

Big(O) Run-time Analysis

  • Building Adjacency List: For each pair of words, we compare them, which takes O(L) time, where L is the length of the word. If there are N words in the wordList, this step takes O(N^2 * L) time.
  • BFS: In the worst case, the BFS might visit all possible paths. The time complexity of BFS is O(V + E), where V is the number of vertices (words) and E is the number of edges (connections between words). In the worst case, E can be O(N^2), so BFS is O(N + N^2) which simplifies to O(N^2). Each path construction takes O(L) since we're copying each word. The total time is therefore bounded by O(N^2 * L) in the worst case, if we consider that we extract all shortest paths.
  • Total: The overall time complexity is dominated by building the adjacency list and the BFS, which is O(N^2 * L).

Big(O) Space Usage Analysis

  • Adjacency List: The adjacency list stores connections between words. In the worst case, each word might be connected to every other word, leading to O(N^2) space.
  • BFS Queue: The queue used in BFS can, in the worst case, contain all the words, leading to O(N) space.
  • Visited Set: The visited set stores words that have been visited, requiring O(N) space.
  • Result List: The result list stores the shortest paths. In the worst case, there might be multiple shortest paths, and each path can be of length N (number of words). This could lead to O(N^2) space if we have N paths of length N.
  • Total: The overall space complexity is O(N^2) due to the adjacency list and the potential size of the result list.

Edge Cases

  1. Empty Word List: If the word list is empty, there can be no transformation, so return an empty list.
  2. endWord Not in wordList: If the endWord is not in the wordList, there can be no transformation, so return an empty list.
  3. No Transformation Possible: If there is no possible transformation from beginWord to endWord, the BFS will terminate without finding any path, and an empty list should be returned.
  4. beginWord and endWord are the Same: If beginWord and endWord are identical, and the word list contains this word, return a list containing a list with just the beginWord (or endWord). If the word list does not contain the beginWord, there's still no transformation, return an empty list.
  5. Large Word List: Handle cases where the word list is very large, and memory or time optimizations become critical. The current solution is reasonably efficient but can be further optimized by using bidirectional BFS or more memory-efficient data structures if needed.