Taro Logo

Longest Common Suffix Queries

Hard
Asked by:
Profile picture
Profile picture
14 views
Topics:
ArraysStringsTwo Pointers

You are given two arrays of strings wordsContainer and wordsQuery.

For each wordsQuery[i], you need to find a string from wordsContainer that has the longest common suffix with wordsQuery[i]. If there are two or more strings in wordsContainer that share the longest common suffix, find the string that is the smallest in length. If there are two or more such strings that have the same smallest length, find the one that occurred earlier in wordsContainer.

Return an array of integers ans, where ans[i] is the index of the string in wordsContainer that has the longest common suffix with wordsQuery[i].

Example 1:

Input: wordsContainer = ["abcd","bcd","xbcd"], wordsQuery = ["cd","bcd","xyz"]

Output: [1,1,1]

Explanation:

Let's look at each wordsQuery[i] separately:

  • For wordsQuery[0] = "cd", strings from wordsContainer that share the longest common suffix "cd" are at indices 0, 1, and 2. Among these, the answer is the string at index 1 because it has the shortest length of 3.
  • For wordsQuery[1] = "bcd", strings from wordsContainer that share the longest common suffix "bcd" are at indices 0, 1, and 2. Among these, the answer is the string at index 1 because it has the shortest length of 3.
  • For wordsQuery[2] = "xyz", there is no string from wordsContainer that shares a common suffix. Hence the longest common suffix is "", that is shared with strings at index 0, 1, and 2. Among these, the answer is the string at index 1 because it has the shortest length of 3.

Example 2:

Input: wordsContainer = ["abcdefgh","poiuygh","ghghgh"], wordsQuery = ["gh","acbfgh","acbfegh"]

Output: [2,0,2]

Explanation:

Let's look at each wordsQuery[i] separately:

  • For wordsQuery[0] = "gh", strings from wordsContainer that share the longest common suffix "gh" are at indices 0, 1, and 2. Among these, the answer is the string at index 2 because it has the shortest length of 6.
  • For wordsQuery[1] = "acbfgh", only the string at index 0 shares the longest common suffix "fgh". Hence it is the answer, even though the string at index 2 is shorter.
  • For wordsQuery[2] = "acbfegh", strings from wordsContainer that share the longest common suffix "gh" are at indices 0, 1, and 2. Among these, the answer is the string at index 2 because it has the shortest length of 6.

Constraints:

  • 1 <= wordsContainer.length, wordsQuery.length <= 104
  • 1 <= wordsContainer[i].length <= 5 * 103
  • 1 <= wordsQuery[i].length <= 5 * 103
  • wordsContainer[i] consists only of lowercase English letters.
  • wordsQuery[i] consists only of lowercase English letters.
  • Sum of wordsContainer[i].length is at most 5 * 105.
  • Sum of wordsQuery[i].length is at most 5 * 105.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the strings within the input array, and what is the maximum number of strings in the array?
  2. Can the input strings contain characters other than lowercase English letters, or can I assume only 'a' to 'z'?
  3. If there is no common suffix among the input strings, what should the function return? Should it return an empty string or null?
  4. In the case of multiple queries, are the strings used in previous queries preserved or are the strings different each time?
  5. Can I assume that the query indices are always within the bounds of the provided array of strings?

Brute Force Solution

Approach

The brute force approach to finding the longest common suffix between multiple words involves checking all possible suffix lengths. We start from the shortest possible length and gradually increase it, checking for a match across all words at each step. We stop once we've exhausted all possibilities.

Here's how the algorithm would work step-by-step:

  1. Consider the shortest word among the words given. The longest common suffix cannot be longer than this.
  2. Start by checking if the very last letter of all words is the same. This would be a suffix of length 1.
  3. Then, check if the last two letters of all words are the same. This would be a suffix of length 2.
  4. Continue this process, increasing the length of the suffix you are comparing by one letter at a time.
  5. At each step, if the suffix is the same across all the words, store the suffix.
  6. Once you reach the maximum possible suffix length (the length of the shortest word), stop checking.
  7. If you found any common suffixes, return the longest one you saved. If not, then there is no common suffix.

Code Implementation

def longest_common_suffix_brute_force(words):
    shortest_word_length = min(len(word) for word in words)
    longest_common_suffix = ""

    # The suffix cannot be longer than the shortest word.
    for suffix_length in range(1, shortest_word_length + 1):
        suffix = words[0][-suffix_length:]

        # Compare the current suffix against all words.
        is_common = True
        for word in words:
            if not word.endswith(suffix):
                is_common = False
                break

        if is_common:
            # Update the longest common suffix.
            longest_common_suffix = suffix

    # Return the longest common suffix or an empty string if none exists.
    return longest_common_suffix

Big(O) Analysis

Time Complexity
O(N*M)Let N be the number of words and M be the length of the shortest word. The outer loop iterates up to the length of the shortest word, which is M. In each iteration of the outer loop, we iterate through all the words, which takes N steps. Thus, the overall time complexity is O(N*M) since we are checking N words up to M times.
Space Complexity
O(S)The space complexity is determined by the storage of the longest common suffix found so far. In the described approach, a variable is used to store the longest common suffix discovered as the algorithm iterates through progressively longer suffixes. The length of this string variable cannot exceed the length of the shortest word among the inputs, which we can denote as S. Therefore, the auxiliary space is O(S).

Optimal Solution

Approach

The key to efficiently solving this problem is to avoid redundant comparisons. We'll construct a special data structure to quickly find the longest shared ending between any two words, dramatically speeding up the query process.

Here's how the algorithm would work step-by-step:

  1. Create a structure that holds the words and helps you quickly find their shared endings. Think of it as organizing the words by their shared last letters.
  2. When a query asks for the longest shared ending of two words, use this structure to efficiently jump to possible shared endings. Start checking for matches at the end of the words.
  3. Instead of comparing the whole word every time, the structure guides you to only compare the parts that matter, starting from the ends.
  4. Once you find the longest common ending, you're done! No need to keep checking the rest of the words.

Code Implementation

def longest_common_suffix(strings, queries):
    results = []
    for index_one, index_two in queries:
        string_one = strings[index_one]
        string_two = strings[index_two]
        
        # Determine minimum length to avoid index errors
        minimum_length = min(len(string_one), len(string_two))
        common_suffix_length = 0
        
        # Iterate backwards to find the longest common suffix
        for i in range(1, minimum_length + 1):
            if string_one[-i] == string_two[-i]:
                common_suffix_length += 1
            else:
                break
        results.append(common_suffix_length)

    return results

Big(O) Analysis

Time Complexity
O(N + Q * M)Building the data structure (presumably a suffix tree or trie-like structure) from the input words takes O(N) time, where N is the total length of all words. Each query involves traversing the data structure to find the longest common suffix, and in the worst case, this traversal could take O(M) time, where M is the length of the shortest of the two queried words. With Q queries, the overall query time becomes O(Q * M). Therefore, the total time complexity is O(N + Q * M).
Space Complexity
O(N)The structure described for holding the words and efficiently finding shared endings, conceptually an organization of words by their shared last letters, could potentially store all characters of all input words. This suggests a data structure, like a trie or a set of hash maps, that could require space proportional to the total number of characters across all input words. Therefore, if N represents the total number of characters in all input strings, the auxiliary space would be O(N).

Edge Cases

Empty list of strings
How to Handle:
Return an empty list of results indicating no common suffixes.
Queries list is empty
How to Handle:
Return an empty list since there are no suffix queries to perform.
One or more strings in the string list are empty
How to Handle:
An empty string has an empty common suffix with other strings; handle gracefully in comparison.
One or more strings are null
How to Handle:
Treat null strings as empty strings or throw an IllegalArgumentException if nulls are not permitted.
Queries with the same string indices
How to Handle:
Ensure each query is processed correctly even if it asks for the same pair of strings multiple times.
Long strings causing excessive memory usage
How to Handle:
Consider limiting the maximum length of strings allowed and allocate memory efficiently.
Large number of strings causing excessive memory usage or time complexity
How to Handle:
Optimize string comparisons to minimize time complexity and consider using a trie-like structure.
String indices in queries are out of bounds
How to Handle:
Check query indices against the size of the input string list and return an error or a specific value like -1.