Given a string s
and a dictionary of strings wordDict
, add spaces in s
to construct a sentence where each word is a valid dictionary word. Return all such possible sentences in any order.
Note that the same word in the dictionary may be reused multiple times in the segmentation.
For example:
Example 1:
Input: s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"]
Output: ["cats and dog","cat sand dog"]
Example 2:
Input: s = "pineapplepenapple", wordDict = ["apple","pen","applepen","pine","pineapple"]
Output: ["pine apple pen apple","pineapple pen apple","pine applepen apple"]
Explanation: Note that you are allowed to reuse a dictionary word.
Example 3:
Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output: []
Write a function to solve this problem efficiently. Consider the time and space complexity of your solution. Also, discuss edge cases like an empty input string or a dictionary that doesn't allow for any valid sentence.
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force approach attempts to find all possible ways to break a large piece of text into a sequence of valid words. We explore every single potential combination of word breaks. The main idea is to exhaustively try every possible split and check if the resulting sequence is made up of valid words from the given dictionary.
Here's how the algorithm would work step-by-step:
def word_break_brute_force(text, word_dictionary):
result_sentences = []
def backtrack(current_sentence, remaining_text):
# If no text remaining, the sentence is valid
if not remaining_text:
result_sentences.append(" ".join(current_sentence))
return
for index in range(1, len(remaining_text) + 1):
potential_word = remaining_text[:index]
# Check if the segment is a valid word
if potential_word in word_dictionary:
# Add to current sentence and recurse
backtrack(current_sentence + [potential_word], remaining_text[index:])
backtrack([], text)
return result_sentences
The core idea is to break down the problem into smaller, overlapping subproblems. We use a memory technique to avoid recalculating results we've already found, making the solution much faster. This allows us to explore the possibilities efficiently, remembering previous good choices.
Here's how the algorithm would work step-by-step:
def word_break(input_string, word_dictionary): def is_breakable(string_to_check, word_set): breakable = [False] * (len(string_to_check) + 1)
breakable[0] = True
for i in range(1, len(string_to_check) + 1):
for j in range(i):
if breakable[j] and string_to_check[j:i] in word_set:
breakable[i] = True
break
return breakable[-1]
if not is_breakable(input_string, set(word_dictionary)): return [] memo = {}
def word_break_helper(remaining_string, word_set): if not remaining_string:
return [""]
if remaining_string in memo:
return memo[remaining_string]
sentences = []
for word in word_set:
if remaining_string.startswith(word):
suffix = remaining_string[len(word):]
suffix_sentences = word_break_helper(suffix, word_set)
for sentence in suffix_sentences:
sentences.append(word + (" " + sentence if sentence else ""))
memo[remaining_string] = sentences
return sentences # Use a helper function with memoization. result = word_break_helper(input_string, set(word_dictionary))
# Return the sentences constructed. return result
Case | How to Handle |
---|---|
Empty input string s | Return an empty list if the input string is empty, as there are no segmentations possible. |
Null input string s or null wordDict | Throw an IllegalArgumentException or return an empty list to handle null inputs gracefully. |
Empty wordDict | Return an empty list if the dictionary is empty, as no words can be formed. |
No valid segmentation possible | Return an empty list when the input string cannot be segmented using the given dictionary. |
WordDict contains an empty string | Handle carefully to avoid infinite recursion or incorrect segmentations; typically it is excluded from being added into the data structure. |
Very long input string s (scalability) | Memoization/dynamic programming is essential for performance with long strings to avoid exponential time complexity. |
WordDict contains very long words | Consider the memory and time cost of checking very long words against substrings; prefiltering based on length may improve performance. |
Input string s is a very long repetition of a single character | Ensure the solution handles this gracefully, avoiding stack overflow with deep recursion by using memoization or dynamic programming. |