Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#'
). For each character they type except '#'
, you need to return the top 3 historical hot sentences that match the prefix of the part sentence before the current character. For each input sentence, when the character is '#'
, it means the sentence ends, and you need to store the complete sentence into the historical data.
Implement the AutocompleteSystem
class:
AutocompleteSystem(String[] sentences, int[] times)
Initializes the object with a list of historical sentences
and their corresponding number of times
each sentence was typed.List<String> input(char c)
c == '#'
, it means the sentence ends, and you need to store the complete sentence into the historical data.
Example 1:
Input ["AutocompleteSystem", "input", "input", "input", "input"] [[["i love you", "island", "iroman", "i love leetcode"], [5, 3, 2, 2]], ["i"], [" "], ["a"], ["#"]] Output [null, ["i love you", "island", "i love leetcode"], [], [], []] Explanation AutocompleteSystem obj = new AutocompleteSystem(["i love you", "island", "iroman", "i love leetcode"], [5, 3, 2, 2]); obj.input("i"); // return ["i love you", "island", "i love leetcode"], because "i love you" has highest times. obj.input(" "); // return []. There's no match. obj.input("a"); // return []. There's no match. obj.input("#"); // return []. The sentence "i a" was typed. Store the string to the record.
Example 2:
Input ["AutocompleteSystem", "input", "input", "input", "input"] [[["i love you", "island", "iroman", "i love leetcode"], [5, 3, 2, 2]], ["i"], [" "], ["a"], ["#"]] Output [null, ["i love you", "island", "i love leetcode"], [], [], []] Explanation AutocompleteSystem obj = new AutocompleteSystem(["i love you", "island", "iroman", "i love leetcode"], [5, 3, 2, 2]); obj.input("i"); // return ["i love you", "island", "i love leetcode"], because "i love you" has highest times. obj.input(" "); // return []. There's no match. obj.input("a"); // return []. There's no match. obj.input("#"); // return []. The sentence "i a" was typed. Store the string to the record.
Constraints:
1 <= sentences.length <= 100
1 <= sentences[i].length <= 100
1 <= times.length <= 100
times[i] >= 1
c
is a lowercase English letter, space ' '
or special character '#'
'#'
.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force approach to search autocomplete is like trying every possible search query starting with the given input. We check each of these queries against all available search terms to see which ones match. This ensures we find all relevant suggestions, even if it's slow.
Here's how the algorithm would work step-by-step:
def search_autocomplete_brute_force(search_terms, user_input):
suggested_terms = []
# Iterate through all possible extensions of the user's input
for search_term in search_terms:
if search_term.startswith(user_input):
# Add the search term to suggested terms if it matches.
suggested_terms.append(search_term)
return suggested_terms
def search_autocomplete_with_ranking(
search_terms,
user_input,
popularity_scores
):
suggested_terms = search_autocomplete_brute_force(
search_terms,
user_input
)
# Filter for terms where popularity scores are known
available_suggestions = [
term for term in suggested_terms if term in popularity_scores
]
# Sort by popularity.
ranked_suggestions = sorted(
available_suggestions,
key=lambda term: popularity_scores[term],
reverse=True
)
# Only show the top 5 suggestions.
return ranked_suggestions[:5]
To design an efficient search autocomplete system, we'll build a special data structure that organizes all possible search terms. This structure will let us quickly find and rank relevant suggestions as the user types, without needing to search through every single term each time.
Here's how the algorithm would work step-by-step:
class TrieNode:
def __init__(self):
self.children = {}
self.suggestions = []
class AutocompleteSystem:
def __init__(self, sentences, times):
self.root = TrieNode()
self.current_prefix = ""
for i in range(len(sentences)):
self.add_sentence(sentences[i], times[i])
def add_sentence(self, sentence, hot):
node = self.root
for char in sentence:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.suggestions.append((sentence, hot))
node.suggestions.sort(key=lambda item: (-item[1], item[0]))
if len(node.suggestions) > 3:
node.suggestions.pop()
def get_suggestions(self, prefix):
node = self.root
for char in prefix:
if char not in node.children:
return []
node = node.children[char]
return [item[0] for item in node.suggestions]
def input(self, character):
if character == '#':
self.add_sentence(self.current_prefix, 1)
self.current_prefix = ""
return []
self.current_prefix += character
return self.get_suggestions(self.current_prefix)
# The Trie data structure is initialized.
# Each node represents a character in a potential search term.
def example_usage():
sentences = ["i love you", "island", "iroman", "i love leetcode"]
times = [5, 3, 2, 2]
auto_complete_system = AutocompleteSystem(sentences, times)
print(auto_complete_system.input('i'))
print(auto_complete_system.input(' '))
print(auto_complete_system.input('l'))
print(auto_complete_system.input('o'))
print(auto_complete_system.input('v'))
print(auto_complete_system.input('e'))
print(auto_complete_system.input(' '))
print(auto_complete_system.input('y'))
print(auto_complete_system.input('o'))
print(auto_complete_system.input('u'))
print(auto_complete_system.input('#'))
print(auto_complete_system.input('i'))
# Adding a new sentence or incrementing the 'hot' score
# updates the Trie's suggestion lists based on popularity.
# Every typed character updates current_prefix, which then
# queries the trie and returns top suggestions.
# This function demonstrates the AutocompleteSystem in action
# It initializes the system, processes input, and prints the output.
if __name__ == '__main__':
example_usage()
Case | How to Handle |
---|---|
sentences is null or empty | Initialize the AutocompleteSystem with an empty trie and an empty frequency map. |
times is null or empty or has different length from sentences | Throw an IllegalArgumentException or return an error code indicating invalid input. |
Input character 'c' is non-ASCII or contains special control characters | Handle only ASCII characters or sanitize the input to prevent unexpected behavior. |
sentences contains empty strings | Treat empty strings as valid sentences, but their frequency should be considered accordingly when suggesting top sentences. |
times contains negative frequencies | Treat negative frequencies as zero, or throw an IllegalArgumentException, ensuring frequency is non-negative. |
Prefix search returns more than 3 matches. | Sort the matching sentences based on frequency and lexicographical order, and return only the top 3. |
All sentences start with the same prefix | Ensure that sorting is still done correctly with all sentences matching, returning top 3 based on frequency and lexicographical order. |
The same sentence is added multiple times with # | Update the sentence's frequency correctly each time it's entered with #, adding to the total frequency. |