Taro Logo

Prefix and Suffix Search

Hard
Meta logo
Meta
1 view
Topics:
ArraysStrings

Design a WordFilter class that supports searching words by prefix and suffix.

The class should have the following methods:

  1. WordFilter(string[] words): Initializes the object with a list of words. Each word words[i] consists of lowercase English letters only and has a length between 1 and 7 characters (inclusive). The number of words in the list will be between 1 and 10,000 (inclusive).
  2. f(string pref, string suff): Returns the index of the word in the dictionary that has the prefix pref and the suffix suff. Both pref and suff consist of lowercase English letters only and have a length between 1 and 7 characters (inclusive). If there is more than one valid index, return the largest one. If no such word exists in the dictionary, return -1. The method f will be called at most 10,000 times.

Example:

WordFilter wordFilter = new WordFilter(["apple", "banana", "applet"]);
wordFilter.f("app", "e"); // Returns 0 because "apple" has index 0, prefix "app", and suffix "e"
wordFilter.f("b", "a");   // Returns 1 because "banana" has index 1, prefix "b", and suffix "a"
wordFilter.f("apple", ""); // Returns 0 because "apple" has index 0, prefix "apple", and suffix "" (empty string)
wordFilter.f("a", "t");   // Returns 2 because "applet" has index 2, prefix "a", and suffix "t"
wordFilter.f("z", "z");   // Returns -1 because no word has both prefix "z" and suffix "z"

Considerations:

  • How can you efficiently search for words based on both prefix and suffix?
  • What data structures can be used to optimize the search?
  • How does your solution handle multiple matching words, ensuring the largest index is returned?
  • How can you handle edge cases, such as empty prefixes or suffixes, or when no matching word exists?

Solution


WordFilter Problem Solution

Problem Description

Design a special dictionary that searches the words in it by a prefix and a suffix.

Implement the WordFilter class:

  • WordFilter(string[] words) Initializes the object with the words in the dictionary.
  • f(string pref, string suff) Returns the index of the word in the dictionary, which has the prefix pref and the suffix suff. If there is more than one valid index, return the largest of them. If there is no such word in the dictionary, return -1.

Example:

Input
["WordFilter", "f"]
[[["apple"]], ["a", "e"]]
Output
[null, 0]
Explanation
WordFilter wordFilter = new WordFilter(["apple"]);
wordFilter.f("a", "e"); // return 0, because the word at index 0 has prefix = "a" and suffix = "e".

Brute Force Solution

A naive approach is to iterate through the entire word list for each query f(pref, suff). For each word, check if it starts with the given prefix and ends with the given suffix. If it does, store its index. After iterating through all words, return the largest index found, or -1 if no word matches.

Code (Java)

class WordFilter {
    String[] words;

    public WordFilter(String[] words) {
        this.words = words;
    }

    public int f(String pref, String suff) {
        int index = -1;
        for (int i = 0; i < words.length; i++) {
            if (words[i].startsWith(pref) && words[i].endsWith(suff)) {
                index = i;
            }
        }
        return index;
    }
}

Time Complexity

O(N * K), where N is the number of words in the input list and K is the average length of the words, pref, and suff. We iterate through all the words, and for each word, we perform startsWith and endsWith operations which take O(K) time.

Space Complexity

O(1). We only use a constant amount of extra space.

Optimal Solution (Using Trie)

To optimize, we can use a Trie data structure. The idea is to combine the prefix and suffix into a single string and store it in the Trie. We concatenate the suffix, a separator (e.g., '{'), and the prefix, and then store this combined string in the Trie along with the index of the word. For each query, we create the combined string and search for it in the Trie.

Code (Java)

class WordFilter {

    class TrieNode {
        TrieNode[] children;
        int index;

        TrieNode() {
            children = new TrieNode[27]; // 26 for letters + 1 for '{'
            index = -1;
        }
    }

    TrieNode trie;

    public WordFilter(String[] words) {
        trie = new TrieNode();
        for (int i = 0; i < words.length; i++) {
            String word = words[i];
            for (int j = 0; j <= word.length(); j++) {
                String combined = word.substring(j) + "{" + word;
                TrieNode node = trie;
                for (char c : combined.toCharArray()) {
                    int charIndex = c == '{' ? 26 : c - 'a';
                    if (node.children[charIndex] == null) {
                        node.children[charIndex] = new TrieNode();
                    }
                    node = node.children[charIndex];
                    node.index = i;
                }
            }
        }
    }

    public int f(String pref, String suff) {
        String combined = suff + "{" + pref;
        TrieNode node = trie;
        for (char c : combined.toCharArray()) {
            int charIndex = c == '{' ? 26 : c - 'a';
            if (node.children[charIndex] == null) {
                return -1;
            }
            node = node.children[charIndex];
        }
        return node.index;
    }
}

Time Complexity

  • Constructor: O(N * K^2), where N is the number of words and K is the maximum length of a word. For each word, we generate all possible combined strings and insert them into the Trie.
  • f(pref, suff): O(P + S), where P is the length of the prefix and S is the length of the suffix. We traverse the Trie based on the combined string of the prefix and suffix.

Space Complexity

O(N * K^2), where N is the number of words and K is the maximum length of a word. The Trie stores all possible combined strings, which can take up significant space.

Edge Cases

  • Empty words array.
  • Empty pref or suff strings.
  • pref or suff containing characters other than lowercase English letters.
  • Multiple words matching both the prefix and suffix - should return the largest index.
  • No word matching the prefix and suffix - should return -1.