Given a string s
and a dictionary of strings wordDict
, add spaces in s
to construct a sentence where each word is a valid dictionary word. Return all such possible sentences in any order.
Note that the same word in the dictionary may be reused multiple times in the segmentation.
Example 1:
Input: s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"] Output: ["cats and dog","cat sand dog"]
Example 2:
Input: s = "pineapplepenapple", wordDict = ["apple","pen","applepen","pine","pineapple"] Output: ["pine apple pen apple","pineapple pen apple","pine applepen apple"] Explanation: Note that you are allowed to reuse a dictionary word.
Example 3:
Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"] Output: []
Constraints:
1 <= s.length <= 20
1 <= wordDict.length <= 1000
1 <= wordDict[i].length <= 10
s
and wordDict[i]
consist of only lowercase English letters.wordDict
are unique.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force strategy explores every possible way to break the given string into valid words from the dictionary. We will recursively try all combinations, building sentences until we either find a valid sentence or exhaust all possibilities. It's like trying every single path until we stumble upon the correct one(s).
Here's how the algorithm would work step-by-step:
def word_break_brute_force(input_string, word_dictionary):
results = []
def backtrack(start_index, current_sentence):
# If we've reached the end of the string, we've found a valid sentence
if start_index == len(input_string):
results.append(" ".join(current_sentence))
return
for end_index in range(start_index + 1, len(input_string) + 1):
word = input_string[start_index:end_index]
# Check if the substring is a valid word.
if word in word_dictionary:
# Add the valid word to the current sentence
current_sentence.append(word)
# Recursively call backtrack to explore the remaining string
backtrack(end_index, current_sentence)
# Backtrack: Remove the last word to explore other possibilities
current_sentence.pop()
backtrack(0, [])
return results
The goal is to break a long string into a sequence of valid words from a dictionary and return all possible sentences. The optimal solution uses a technique that avoids recomputing results for overlapping subproblems, significantly speeding up the process. It efficiently builds valid sentences by remembering which parts of the string can be broken down further.
Here's how the algorithm would work step-by-step:
def word_break_ii(long_string, word_dictionary):
memo = {len(long_string): ['']}
def word_break_recursive(start_index):
if start_index in memo:
return memo[start_index]
sentences = []
# Iterate through possible word endings.
for end_index in range(start_index + 1, len(long_string) + 1):
word = long_string[start_index:end_index]
# Check if the current substring is a valid word.
if word in word_dictionary:
# Recursively break down the remaining string.
remaining_sentences = word_break_recursive(end_index)
# Build sentences by combining the current word with the remaining sentences.
for sentence in remaining_sentences:
if sentence:
sentences.append(word + ' ' + sentence)
else:
sentences.append(word)
memo[start_index] = sentences
return sentences
# Initiate the recursive process starting from index 0.
result = word_break_recursive(0)
return result
Case | How to Handle |
---|---|
Empty string s | If s is empty and wordDict is empty return empty list, if s is empty and wordDict is not empty, return empty list, if s is empty and wordDict contains the empty string, return list containing the empty string. |
Empty wordDict | If wordDict is empty, and s is non-empty, return empty list, as no segmentation is possible. |
s is a single character not in wordDict | Return an empty list as no segmentation is possible. |
s is a single word in wordDict | Return a list containing only s. |
s is a long string and no valid segmentation exists | The algorithm should terminate gracefully and return an empty list without excessive recursion depth. |
wordDict contains very long words not found in s | The algorithm should efficiently skip these words during segmentation attempts. |
wordDict contains duplicate words | Duplicates in wordDict should not affect the segmentation result; the algorithm should still find all valid segmentations. |
s is very long, and there are many valid segmentations (potential for stack overflow with naive recursion) | Use memoization (dynamic programming) to avoid recomputing solutions for substrings and improve efficiency, limiting recursion depth. |