Given a string s and an array of strings words, return the number of words[i] that is a subsequence of s.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
"ace" is a subsequence of "abcde".Example 1:
Input: s = "abcde", words = ["a","bb","acd","ace"] Output: 3 Explanation: There are three strings in words that are a subsequence of s: "a", "acd", "ace".
Example 2:
Input: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"] Output: 2
Constraints:
1 <= s.length <= 5 * 1041 <= words.length <= 50001 <= words[i].length <= 50s and words[i] consist of only lowercase English letters.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force method to find matching subsequences involves checking each word in a list to see if it can be formed by picking characters from another bigger word in the correct order. We'll try every possible way to match the characters.
Here's how the algorithm would work step-by-step:
def number_of_matching_subsequences_brute_force(big_word, list_of_smaller_words):
matching_subsequence_count = 0
for small_word in list_of_smaller_words:
small_word_index = 0
big_word_index = 0
while small_word_index < len(small_word) and big_word_index < len(big_word):
# We are checking if the characters match.
if small_word[small_word_index] == big_word[big_word_index]:
small_word_index += 1
big_word_index += 1
# Increment count if small_word is subsequence.
if small_word_index == len(small_word):
matching_subsequence_count += 1
return matching_subsequence_countInstead of checking if each word in the word list is a subsequence of the big string individually, we process the big string only once. We keep track of where we are in each word of the word list as we go through the big string. This way we only process each character of the big string once.
Here's how the algorithm would work step-by-step:
def number_of_matching_subsequences(big_string, words):
word_pointers = [0] * len(words)
matching_subsequence_count = 0
# Iterate through the big string
for big_string_character in big_string:
for word_index, word in enumerate(words):
# Check if the current character matches the character pointed to by the pointer
if word_pointers[word_index] < len(word) and big_string_character == word[word_pointers[word_index]]:
# Advance the pointer if there is a match
word_pointers[word_index] += 1
# Check if the word has been fully matched
if word_pointers[word_index] == len(word):
matching_subsequence_count += 1
# Reset the pointer so the word is only counted once
word_pointers[word_index] = float('inf')
return matching_subsequence_count| Case | How to Handle |
|---|---|
| Empty string S or empty words array | If S is empty, return 0. If words is empty, return 0. |
| String S is very long | Pre-compute indices for each character of S to improve efficiency for repeated subsequence checks. |
| A word in words is longer than S | A word longer than S cannot be a subsequence, so skip it. |
| The words array contains duplicate words | Count each unique word and its frequency to avoid redundant subsequence checks, incrementing result by frequency. |
| All words are subsequences of S | The solution should correctly count all words as subsequences if they are indeed subsequences. |
| No words are subsequences of S | The solution should return 0 if no words are subsequences of S. |
| Word contains characters not in S | The solution should identify that the word cannot be a subsequence and skip counting it. |
| Extremely long words array with many distinct words | Efficient data structures are needed to quickly check and count subsequences to avoid exceeding time limit. |