Given a wordlist
, we want to implement a spellchecker that converts a query word into a correct word.
For a given query
word, the spell checker handles two categories of spelling mistakes:
wordlist = ["yellow"]
, query = "YellOw"
: correct = "yellow"
wordlist = ["Yellow"]
, query = "yellow"
: correct = "Yellow"
wordlist = ["yellow"]
, query = "yellow"
: correct = "yellow"
('a', 'e', 'i', 'o', 'u')
of the query word with any vowel individually, it matches a word in the wordlist (case-insensitive), then the query word is returned with the same case as the match in the wordlist.
wordlist = ["YellOw"]
, query = "yollow"
: correct = "YellOw"
wordlist = ["YellOw"]
, query = "yeellow"
: correct = ""
(no match)wordlist = ["YellOw"]
, query = "yllw"
: correct = ""
(no match)In addition, the spell checker operates under the following precedence rules:
Given some queries
, return a list of words answer
, where answer[i]
is the correct word for query = queries[i]
.
Example 1:
Input: wordlist = ["KiTe","kite","hare","Hare"], queries = ["kite","Kite","KiTe","Hare","HARE","Hear","hear","keti","keet","keto"] Output: ["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]
Example 2:
Input: wordlist = ["yellow"], queries = ["YellOw"] Output: ["yellow"]
Constraints:
1 <= wordlist.length, queries.length <= 5000
1 <= wordlist[i].length, queries[i].length <= 7
wordlist[i]
and queries[i]
consist only of only English letters.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
To find the correct spelling for each query word, we methodically check it against every single word in our dictionary. We do this multiple times, applying one spelling rule at a time in order of priority, and we stop as soon as we find any valid match.
Here's how the algorithm would work step-by-step:
def vowel_spellchecker_brute_force(wordlist, queries):
corrected_queries_list = []
vowels = {'a', 'e', 'i', 'o', 'u'}
for query_word in queries:
match_found_for_query = False
# We check for a perfect, case-sensitive match first because it's the highest priority rule.
for dictionary_word in wordlist:
if dictionary_word == query_word:
corrected_queries_list.append(dictionary_word)
match_found_for_query = True
break
if match_found_for_query:
continue
# If no exact match exists, we try a case-insensitive match as the next priority.
for dictionary_word in wordlist:
if dictionary_word.lower() == query_word.lower():
corrected_queries_list.append(dictionary_word)
match_found_for_query = True
break
if match_found_for_query:
continue
# Finally, we handle vowel errors by comparing a 'devoweled' version of the words.
query_chars_devoweled = []
for character in query_word.lower():
if character in vowels:
query_chars_devoweled.append('*')
else:
query_chars_devoweled.append(character)
devoweled_query = "".join(query_chars_devoweled)
for dictionary_word in wordlist:
dict_word_chars_devoweled = []
for character in dictionary_word.lower():
if character in vowels:
dict_word_chars_devoweled.append('*')
else:
dict_word_chars_devoweled.append(character)
devoweled_dictionary_word = "".join(dict_word_chars_devoweled)
if devoweled_dictionary_word == devoweled_query:
corrected_queries_list.append(dictionary_word)
match_found_for_query = True
break
if match_found_for_query:
continue
corrected_queries_list.append("")
return corrected_queries_list
The best approach is to avoid repeatedly searching the entire word list for each query. Instead, we do a one-time setup by creating a few specialized lookup guides from the word list, which lets us find the correct spelling for any query almost instantly by checking these guides in a specific order.
Here's how the algorithm would work step-by-step:
def vowel_spellchecker(word_list, queries_list):
exact_match_words = set(word_list)
capitalization_map = {}
vowel_error_map = {}
# Pre-processing builds lookup maps to make query time near-instant, avoiding N*M complexity.
for original_word in word_list:
lowercase_version = original_word.lower()
# A 'devoweled' key generalizes words to handle any vowel mistake, like 'kite' and 'kote'.
devoweled_key = "".join('*' if char in 'aeiou' else char for char in lowercase_version)
if lowercase_version not in capitalization_map:
capitalization_map[lowercase_version] = original_word
if devoweled_key not in vowel_error_map:
vowel_error_map[devoweled_key] = original_word
result_list = []
# Queries are checked in a strict priority order: exact, case-insensitive, then vowel error.
for current_query in queries_list:
if current_query in exact_match_words:
result_list.append(current_query)
continue
lowercase_query = current_query.lower()
if lowercase_query in capitalization_map:
result_list.append(capitalization_map[lowercase_query])
continue
devoweled_query_key = "".join('*' if char in 'aeiou' else char for char in lowercase_query)
if devoweled_query_key in vowel_error_map:
result_list.append(vowel_error_map[devoweled_query_key])
continue
# If a query fails all matching rules, it has no valid correction in the word list.
result_list.append("")
return result_list
Case | How to Handle |
---|---|
The input `wordlist` or `queries` array is empty. | The solution should return an empty array, correctly handling the base case of no work to be done. |
The `wordlist` and `queries` arrays are very large, potentially causing a timeout. | Using hash maps to pre-process the `wordlist` allows for constant-time lookups for each query, ensuring an efficient O(N+M) solution. |
A query has multiple potential matches with different priorities (e.g., one exact, one case-insensitive). | The solution must perform lookups in the strict order of priority: exact match first, then case-insensitive, then vowel-error. |
A query has multiple case-insensitive or vowel-error matches in the `wordlist`. | When building lookup maps, the solution only stores the first occurrence for any given key, correctly implementing the 'first match' rule. |
A query does not match any word in the `wordlist` by any of the three rules. | If all lookup attempts fail, the solution correctly returns an empty string for that query as specified. |
The `wordlist` contains the same word with different casings, like 'Apple' and 'apple'. | Separate data structures for exact and case-insensitive matches ensure that an exact query ('apple') is correctly found over a case-insensitive one ('APPLE'). |
Words in the inputs contain non-alphabetic characters such as punctuation or numbers. | The vowel-replacement logic only transforms the five English vowels, treating all other characters as immutable consonants. |
The `wordlist` or `queries` contain empty strings. | An empty query correctly finds an exact match if an empty string exists in the wordlist, otherwise it correctly results in no match. |