Given a string s
and a dictionary of strings wordDict
, return true
if s
can be segmented into a space-separated sequence of one or more dictionary words. Note that the same word in the dictionary may be reused multiple times in the segmentation. Consider the following examples:
s = "leetcode", wordDict = ["leet","code"]
. The expected output is true
because "leetcode" can be segmented as "leet code".s = "applepenapple", wordDict = ["apple","pen"]
. The expected output is true
because "applepenapple" can be segmented as "apple pen apple". Note that you are allowed to reuse a dictionary word.s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
. The expected output is false
.Provide an efficient algorithm to solve this problem. Detail the runtime and space complexity of your approach.
## Word Break Problem
This problem asks whether a given string `s` can be segmented into a space-separated sequence of one or more words from a given dictionary `wordDict`. The same word in the dictionary may be reused multiple times.
### 1. Brute Force Solution (with Recursion)
We can solve this problem recursively. For each prefix of the string `s`, we check if it's in the dictionary. If it is, we recursively check if the remaining suffix can also be segmented. This approach explores all possible segmentations.
```python
def word_break_recursive(s: str, word_dict: set[str]) -> bool:
if not s:
return True
for i in range(1, len(s) + 1):
prefix = s[:i]
if prefix in word_dict and word_break_recursive(s[i:], word_dict):
return True
return False
def word_break_brute_force(s: str, wordDict: list[str]) -> bool:
return word_break_recursive(s, set(wordDict))
# Example Usage
s = "leetcode"
wordDict = ["leet", "code"]
print(f"Brute Force: {word_break_brute_force(s, wordDict)}") # Output: True
s = "applepenapple"
wordDict = ["apple", "pen"]
print(f"Brute Force: {word_break_brute_force(s, wordDict)}") # Output: True
s = "catsandog"
wordDict = ["cats", "dog", "sand", "and", "cat"]
print(f"Brute Force: {word_break_brute_force(s, wordDict)}") # Output: False
This recursive approach has exponential time complexity due to overlapping subproblems. For example, if s = 'aaaaab'
and wordDict = ['a', 'aa', 'aaa']
, the function will recalculate the possibility of breaking down 'a', 'aa', 'aaa', 'aaaa', 'aaaaa' repeatedly.
To avoid redundant calculations, we can use dynamic programming. We create a boolean array dp
where dp[i]
represents whether the substring s[:i]
can be segmented into words from the dictionary.
dp[0]
to True
(an empty string can be segmented).s
from index 1 to len(s)
. For each index i
, iterate through all possible starting indices j
(from 0 to i-1
).dp[j]
is True
(meaning the substring s[:j]
can be segmented) and the substring s[j:i]
is in the dictionary, then dp[i]
is also True
.def word_break_dp(s: str, wordDict: list[str]) -> bool:
word_set = set(wordDict)
dp = [False] * (len(s) + 1)
dp[0] = True
for i in range(1, len(s) + 1):
for j in range(i):
if dp[j] and s[j:i] in word_set:
dp[i] = True
break
return dp[len(s)]
# Example Usage
s = "leetcode"
wordDict = ["leet", "code"]
print(f"Dynamic Programming: {word_break_dp(s, wordDict)}") # Output: True
s = "applepenapple"
wordDict = ["apple", "pen"]
print(f"Dynamic Programming: {word_break_dp(s, wordDict)}") # Output: True
s = "catsandog"
wordDict = ["cats", "dog", "sand", "and", "cat"]
print(f"Dynamic Programming: {word_break_dp(s, wordDict)}") # Output: False
s
. In the worst case, each character could potentially be the start of a new word, leading to an exponential number of possible segmentations.s
. The outer loop iterates n
times, and the inner loop iterates up to n
times, resulting in O(n2) complexity. Checking if a substring exists in the dictionary word_set
takes O(1) on average (if word_set
is a hash set).s
. The dp
array stores boolean values for each prefix of s
, requiring linear space.s
is empty, the function should return True
(because an empty string can be considered segmented).wordDict
is empty, the function should return False
(unless the input string s
is also empty).s
is very long and the words in wordDict
are very short, the dynamic programming approach becomes much more efficient than the brute force approach.s
cannot be segmented into words from wordDict
, the function should return False
(e.g., s = "aaaaab", wordDict = ["a", "aa", "aaa"]
).wordDict
contains an empty string, the dynamic programming approach can lead to an infinite loop or incorrect results. It's best to handle this case by ensuring the dictionary does not contain an empty string.