You are given a string s
and an array of strings words
. All the strings of words
are of the same length. A concatenated string is a string that exactly contains all the strings of any permutation of words
concatenated.
words = ["ab","cd","ef"]
, then "abcdef"
, "abefcd"
, "cdabef"
, "cdefab"
, "efabcd"
, and "efcdab"
are all concatenated strings. "acdbef"
is not a concatenated string because it is not the concatenation of any permutation of words
.Return an array of the starting indices of all the concatenated substrings in s
. You can return the answer in any order.
Example 1:
Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]
Explanation:
The substring starting at 0 is "barfoo"
. It is the concatenation of ["bar","foo"]
which is a permutation of words
.
The substring starting at 9 is "foobar"
. It is the concatenation of ["foo","bar"]
which is a permutation of words
.
Example 2:
Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []
Explanation: There is no concatenated substring.
Example 3:
Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
Output: [6,9,12]
Explanation:
The substring starting at 6 is "foobarthe"
. It is the concatenation of ["foo","bar","the"]
.
The substring starting at 9 is "barthefoo"
. It is the concatenation of ["bar","the","foo"]
.
The substring starting at 12 is "thefoobar"
. It is the concatenation of ["the","foo","bar"]
.
from collections import defaultdict
def find_substring(s, words):
if not s or not words:
return []
word_len = len(words[0])
num_words = len(words)
total_len = word_len * num_words
result = []
# Create a dictionary to store the frequency of each word in the 'words' array
word_count = defaultdict(int)
for word in words:
word_count[word] += 1
# Iterate through the string 's'
for i in range(len(s) - total_len + 1):
seen_words = defaultdict(int)
words_found = 0
# Check if the substring is a concatenation of words in the 'words' array
for j in range(num_words):
start = i + j * word_len
word = s[start:start + word_len]
if word in word_count:
seen_words[word] += 1
if seen_words[word] <= word_count[word]:
words_found += 1
else:
break # Current word exceeds its count in 'words'
else:
break # Current word is not in 'words'
# If all words were found in the correct frequency, add the starting index to the result
if words_found == num_words:
result.append(i)
return result
# Example Usage:
s = "barfoothefoobarman"
words = ["foo","bar"]
print(find_substring(s, words)) # Output: [0, 9]
s = "wordgoodgoodgoodbestword"
words = ["word","good","best","word"]
print(find_substring(s, words)) # Output: []
s = "barfoofoobarthefoobarman"
words = ["bar","foo","the"]
print(find_substring(s, words)) # Output: [6, 9, 12]
word_len
: Length of each word in the words
array.num_words
: Number of words in the words
array.total_len
: Total length of the concatenated string.result
: List to store the starting indices of the concatenated substrings.word_count
: Dictionary to store the frequency of each word in the words
array.words
array and store the frequency of each word in the word_count
dictionary.s
from index 0
to len(s) - total_len + 1
.i
, check if the substring starting at i
is a concatenation of words in the words
array.seen_words
: Dictionary to store the frequency of each word encountered in the current substring.words_found
: Counter to track the number of words found in the correct frequency.num_words
words in the current substring.word_len
from the substring.word_count
dictionary, increment its count in the seen_words
dictionary.seen_words
dictionary is less than or equal to its count in the word_count
dictionary, increment the words_found
counter.seen_words
dictionary exceeds its count in the word_count
dictionary, break the inner loop.word_count
dictionary, break the inner loop.words_found == num_words
), add the starting index i
to the result
list.result
list.The time complexity of the find_substring
function can be analyzed as follows:
s
from index 0
to len(s) - total_len + 1
. In the worst case, this loop runs O(n)
times, where n
is the length of the string s
.num_words
words in the current substring. This loop runs O(m)
times, where m
is the number of words in the words
array.word_count
dictionary takes O(m)
time, where m
is the number of words in the words
array.word in word_count
, seen_words[word] += 1
) take O(1)
time on average.Therefore, the overall time complexity of the find_substring
function is O(n * m)
, where n
is the length of the string s
and m
is the number of words in the words
array.
The space complexity of the find_substring
function can be analyzed as follows:
word_count
Dictionary: The word_count
dictionary stores the frequency of each word in the words
array. In the worst case, the dictionary contains all the words in the words
array. Therefore, the space complexity of the word_count
dictionary is O(m)
, where m
is the number of words in the words
array.seen_words
Dictionary: The seen_words
dictionary stores the frequency of each word encountered in the current substring. In the worst case, the dictionary contains all the words in the words
array. Therefore, the space complexity of the seen_words
dictionary is O(m)
, where m
is the number of words in the words
array.result
List: The result
list stores the starting indices of the concatenated substrings. In the worst case, all the substrings in s
are concatenated strings. Therefore, the space complexity of the result
list is O(n)
, where n
is the length of the string s
.Therefore, the overall space complexity of the find_substring
function is O(m + n)
, where m
is the number of words in the words
array and n
is the length of the string s
.