A Trie (also known as a prefix tree) is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.
Implement the Trie class:
Trie() Initializes the trie object.void insert(String word) Inserts the string word into the trie.int countWordsEqualTo(String word) Returns the number of instances of the string word in the trie.int countWordsStartingWith(String prefix) Returns the number of strings in the trie that have the string prefix as a prefix.void erase(String word) Erases the string word from the trie.Example 1:
Input
["Trie", "insert", "insert", "countWordsEqualTo", "countWordsStartingWith", "erase", "countWordsEqualTo", "countWordsStartingWith", "erase", "countWordsStartingWith"]
[[], ["apple"], ["apple"], ["apple"], ["app"], ["apple"], ["apple"], ["app"], ["apple"], ["app"]]
Output
[null, null, null, 2, 2, null, 1, 1, null, 0]
Explanation
Trie trie = new Trie();
trie.insert("apple"); // Inserts "apple".
trie.insert("apple"); // Inserts another "apple".
trie.countWordsEqualTo("apple"); // Returns 2, because there are two "apple" strings in the trie.
trie.countWordsStartingWith("app"); // Returns 2, because there are two strings which start with "app".
trie.erase("apple"); // Erases one "apple".
trie.countWordsEqualTo("apple"); // Returns 1, because there is still one "apple" in the trie.
trie.countWordsStartingWith("app"); // Returns 1, because there is still one string which starts with "app".
trie.erase("apple"); // Erases "apple". Now the trie is empty.
trie.countWordsStartingWith("app"); // Returns 0, because there are no strings which start with "app".
Constraints:
1 <= word.length, prefix.length <= 2000word and prefix consist only of lowercase English letters.3 * 104 calls in total will be made to insert, countWordsEqualTo, countWordsStartingWith, and erase.word will exist in the trie when it is called.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The goal is to build a system to store words so we can quickly find out how many words start with a particular sequence of letters, and how many times a specific word was inserted. A brute force approach directly uses the words themselves for these operations, without any special structure.
Here's how the algorithm would work step-by-step:
class Trie:
def __init__(self):
self.word_list = []
def insert(self, word: str) -> None:
# Simply append the word to our word list.
self.word_list.append(word)
def countWordsEqualTo(self, word: str) -> int:
word_count = 0
# Iterate to count exact matches.
for existing_word in self.word_list:
if existing_word == word:
word_count += 1
return word_count
def countWordsStartingWith(self, prefix: str) -> int:
prefix_count = 0
# We need to check all stored words.
for existing_word in self.word_list:
# Check if word starts with prefix.
if existing_word.startswith(prefix):
prefix_count += 1
return prefix_count
def erase(self, word: str) -> None:
#Remove the first occurrence of the word.
if word in self.word_list:
self.word_list.remove(word)We'll use a special tree-like structure to store words. This structure helps us quickly find words with the same starting parts and also keep track of how many times a word was added or if it's a real word.
Here's how the algorithm would work step-by-step:
class TrieNode:
def __init__(self):
self.children = {}
self.word_count = 0
self.prefix_count = 0
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word: str) -> None:
current_node = self.root
for char in word:
if char not in current_node.children:
current_node.children[char] = TrieNode()
current_node = current_node.children[char]
current_node.prefix_count += 1
# Mark the end of the word and increase its count
current_node.word_count += 1
def countWordsEqualTo(self, word: str) -> int:
current_node = self.root
for char in word:
if char not in current_node.children:
return 0
current_node = current_node.children[char]
# Return the number of times the word was inserted
return current_node.word_count
def countWordsStartingWith(self, prefix: str) -> int:
current_node = self.root
for char in prefix:
if char not in current_node.children:
return 0
current_node = current_node.children[char]
# Return the number of words starting with the prefix
return current_node.prefix_count
def erase(self, word: str) -> None:
current_node = self.root
for char in word:
if char not in current_node.children:
return
current_node = current_node.children[char]
current_node.prefix_count -= 1
# Only decrement if word exists
if current_node.word_count > 0:
current_node.word_count -= 1| Case | How to Handle |
|---|---|
| Null or empty word inserted. | Treat as a valid insertion (empty word has its own presence) or raise IllegalArgumentException based on requirements. |
| Inserting the same word multiple times. | Increment the count of the word at the terminal node to reflect its multiple occurrences. |
| Searching for an empty prefix. | Return the total number of words in the trie (count of root node occurrences) which will effectively give words starting with empty prefix. |
| Erasing a word that doesn't exist. | Do nothing or raise an exception; ensure no underflow of counts at Trie nodes, and count does not dip below 0. |
| Erasing a word with count > 1. | Decrement the count of the word at the terminal node; also update the count in the path (prefix words). |
| Very long words are inserted, impacting memory usage. | Consider limiting the maximum word length or switching to a more memory-efficient Trie implementation if memory becomes a bottleneck (e.g., compressed trie). |
| Inserting words with non-alphabetic characters (e.g., numbers, symbols). | Define a character set (alphabet) and handle characters outside that set, possibly raising an error or mapping them to a default character. |
| Large number of insert/delete operations potentially leading to integer overflow of word counts. | Use a larger integer type (e.g., long) to store word counts to prevent overflow, and consider handling potential maximum count limits. |