Taro Logo

Longest Uncommon Subsequence II

Medium
Google logo
Google
1 view
Topics:
ArraysStringsTwo Pointers

Given an array of strings strs, return the length of the longest uncommon subsequence between them. If the longest uncommon subsequence does not exist, return -1.

An uncommon subsequence between an array of strings is a string that is a subsequence of one string but not the others.

A subsequence of a string s is a string that can be obtained after deleting any number of characters from s.

  • For example, "abc" is a subsequence of "aebdc" because you can delete the underlined characters in "a<u>e</u>b<u>d</u>c" to get "abc". Other subsequences of "aebdc" include "aebdc", "aeb", and "" (empty string).

Example 1:

Input: strs = ["aba","cdc","eae"]
Output: 3

Example 2:

Input: strs = ["aaa","aaa","aa"]
Output: -1

Constraints:

  • 2 <= strs.length <= 50
  • 1 <= strs[i].length <= 10
  • strs[i] consists of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of each string in the input array?
  2. Can the input array contain empty strings or null values?
  3. If there are multiple longest uncommon subsequences, can I return any one of them?
  4. If no uncommon subsequence exists, what value should I return?
  5. Are the input strings case-sensitive, or should I treat them as case-insensitive?

Brute Force Solution

Approach

The goal is to find the longest word that is not a subsequence of any other word in the given list. The brute force strategy involves checking every word against every other word, one by one, to see if it fits the criteria.

Here's how the algorithm would work step-by-step:

  1. Take the first word in the list.
  2. Compare it against every other word in the list to see if it's a subsequence of any of them.
  3. If the first word is a subsequence of any other word, then it's not a candidate for the longest uncommon subsequence. Discard it and move to the next word.
  4. If the first word is NOT a subsequence of any other word, keep it in mind as a potential candidate.
  5. Now, repeat this process for every word in the list.
  6. After checking every word, you will have a list of potential candidates (words that are not subsequences of any other words).
  7. From this list of candidates, find the longest one. This is the longest uncommon subsequence.

Code Implementation

def longest_uncommon_subsequence_brute_force(words):
    potential_candidates = []

    for current_word in words:
        is_subsequence = False
        for other_word in words:
            if current_word != other_word:

                # Check if current word is a subsequence of any other word
                if is_subsequence_check(current_word, other_word):
                    is_subsequence = True
                    break
        
        # Only add to the candidate list if it's not a subsequence
        if not is_subsequence:
            potential_candidates.append(current_word)

    if not potential_candidates:
        return -1

    longest_length = -1
    for candidate in potential_candidates:

        # Find the maximum length subsequence
        longest_length = max(longest_length, len(candidate))

    return longest_length

def is_subsequence_check(subsequence, sequence):
    subsequence_index = 0
    sequence_index = 0

    while subsequence_index < len(subsequence) and sequence_index < len(sequence):
        if subsequence[subsequence_index] == sequence[sequence_index]:
            subsequence_index += 1

        sequence_index += 1

    # If we've reached the end of subsequence, it's a subsequence
    return subsequence_index == len(subsequence)

Big(O) Analysis

Time Complexity
O(n² * m)The provided approach iterates through each of the n words in the input list. For each word, it compares it against every other word (n-1 other words) to check if it is a subsequence. The subsequence check itself takes O(m) time, where m is the average length of the strings, as it potentially iterates through the characters of both strings being compared. Thus, the overall time complexity is O(n * n * m), which simplifies to O(n² * m).
Space Complexity
O(1)The provided plain English solution primarily involves iterating through the input list and performing comparisons. It doesn't explicitly mention the creation of auxiliary data structures like temporary lists, hash maps, or recursive function calls to store intermediate results. Only a few variables are used for comparisons and keeping track of the longest uncommon subsequence, irrespective of the input size N (the number of words). Thus, the space complexity remains constant, or O(1).

Optimal Solution

Approach

The key idea is to find the longest string that is not a subsequence of any other string in the list. We will iterate through each string and check if it's an uncommon subsequence. To avoid redundant checks, we'll sort the strings by length in descending order, as a longer string is more likely to be the longest uncommon subsequence.

Here's how the algorithm would work step-by-step:

  1. Sort the list of strings from longest to shortest.
  2. For each string in the sorted list, check if it's a subsequence of any other longer string that comes before it in the list.
  3. If a string is not a subsequence of any longer string, then it's a candidate for the longest uncommon subsequence.
  4. If a string is an uncommon subsequence (i.e., not a subsequence of any other string), then we've found our answer, and we can stop.
  5. If no string meets the criteria after checking the entire list, then no uncommon subsequence exists, and the answer is -1.

Code Implementation

def is_subsequence(subsequence, full_string):
    subsequence_index = 0
    full_string_index = 0

    while subsequence_index < len(subsequence) and full_string_index < len(full_string):
        if subsequence[subsequence_index] == full_string[full_string_index]:
            subsequence_index += 1
        full_string_index += 1

    return subsequence_index == len(subsequence)

def longest_uncommon_subsequence(strings):
    strings.sort(key=len, reverse=True)

    # Iterate through sorted strings
    for i in range(len(strings)):
        is_uncommon = True

        # Check against longer strings before it
        for j in range(len(strings)):
            if i == j:
                continue

            if is_subsequence(strings[i], strings[j]) and len(strings[j]) >= len(strings[i]):
                is_uncommon = False
                break

        # Found the longest uncommon subsequence
        if is_uncommon:
            return len(strings[i])

    # No uncommon subsequence found
    return -1

Big(O) Analysis

Time Complexity
O(n*m*x)The algorithm first sorts the input array of strings, which takes O(n log n) time, where n is the number of strings. The core logic involves iterating through the sorted array (outer loop: n strings). For each string, it checks if it's a subsequence of any preceding (longer) string (inner loop: up to n strings). The subsequence check itself can take O(m*x) time, where m is the average length of the longer strings and x is the length of current string in the outer loop and we can approximate that as the average length of strings in the input. So, subsequence check takes O(m*x) on average. Therefore the dominant factor is the nested loops with subsequence check leading to O(n * n * m*x). Because the length 'm*x' is not related to n, we can simply it to O(n^2 * m * x) which further can be simplified to O(n*m*x) where n is the count of input strings, m and x are the lengths of the strings. The sorting cost, O(n log n), is dominated by the subsequence checks within nested loops.
Space Complexity
O(1)The primary space complexity stems from the sorting operation. While the plain English explanation doesn't specify the sorting algorithm used, a typical in-place sorting algorithm could be employed (e.g., heapsort), modifying the input list directly and using minimal extra space. The algorithm also uses a few constant space variables to iterate through the sorted list and perform subsequence checks. Therefore, the auxiliary space used remains constant regardless of the number of strings in the input list.

Edge Cases

CaseHow to Handle
Null or empty input arrayReturn -1 immediately as there are no strings to compare.
Array with a single stringReturn the length of the single string as it's trivially the longest uncommon subsequence.
Array with all identical stringsReturn -1 as no string can be an uncommon subsequence of the others.
Long strings potentially causing integer overflow when calculating lengths or comparing stringsEnsure that lengths are stored in a data type large enough to accommodate the maximum string length and that the subsequence check doesn't have quadratic or worse performance.
Input array with some strings that are subsequences of others, but none are strictly uncommonThe algorithm should correctly identify that no uncommon subsequence exists and return -1.
Maximum input size (large number of strings and each string has max length)Ensure the time complexity of the subsequence check does not lead to time limit exceeded errors, potentially requiring optimization or a more efficient algorithm.
Strings with Unicode charactersThe subsequence check must correctly handle Unicode characters, ensuring accurate comparison and length calculations.
Strings of vastly different lengths within the input arrayThe algorithm must efficiently compare strings of highly varying lengths without unnecessary computations.