You are given an array arr
of size n
consisting of non-empty strings.
Find a string array answer
of size n
such that:
answer[i]
is the shortest substring of arr[i]
that does not occur as a substring in any other string in arr
. If multiple such substrings exist, answer[i]
should be the lexicographically smallest. And if no such substring exists, answer[i]
should be an empty string.Return the array answer
.
Example 1:
Input: arr = ["cab","ad","bad","c"]
Output: ["ab","","ba",""]
Explanation:
- For the string "cab", the shortest substring that does not occur in any other string is either "ca" or "ab", we choose the lexicographically smaller substring, which is "ab".
- For the string "ad", there is no substring that does not occur in any other string.
- For the string "bad", the shortest substring that does not occur in any other string is "ba".
- For the string "c", there is no substring that does not occur in any other string.
Example 2:
Input: arr = ["abc","bcd","abcd"]
Output: ["","","abcd"]
Explanation:
- For the string "abc", there is no substring that does not occur in any other string.
- For the string "bcd", there is no substring that does not occur in any other string.
- For the string "abcd", the shortest substring that does not occur in any other string is "abcd".
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
We need to find the shortest string that doesn't appear as part of any of the other strings in our collection. The brute force approach involves checking every possible string, starting with the shortest, until we find one that works.
Here's how the algorithm would work step-by-step:
def find_shortest_uncommon_substring(strings):
alphabet = 'abcdefghijklmnopqrstuvwxyz'
max_length = max(len(string) for string in strings)
for substring_length in range(1, max_length + 2):
# Iterate through possible substring lengths.
for substring in generate_all_strings(substring_length, alphabet):
is_substring_common = False
for string_in_list in strings:
# Check if the generated substring
# exists inside the current string.
if substring in string_in_list:
is_substring_common = True
break
if not is_substring_common:
# If substring isn't in
# any of the strings, return it.
return substring
return None
def generate_all_strings(string_length, alphabet):
if string_length == 0:
yield ''
return
for character in alphabet:
for suffix in generate_all_strings(string_length - 1, alphabet):
yield character + suffix
The trick is to find the shortest sequence of characters that's not present in any of the given words. We'll start by checking short sequences and gradually increase the length until we find one that works, avoiding unnecessary work.
Here's how the algorithm would work step-by-step:
def find_shortest_uncommon_substring(words):
character_set = 'abcdefghijklmnopqrstuvwxyz'
max_word_length = max(len(word) for word in words)
for substring_length in range(1, max_word_length + 2):
# Start checking substrings of increasing length.
for start_index in range(len(character_set) ** substring_length):
substring = ""
temp_index = start_index
for _ in range(substring_length):
substring = character_set[temp_index % len(character_set)] + substring
temp_index //= len(character_set)
is_substring_present = False
# Check if the current substring is present in any of the words.
for word in words:
if substring in word:
is_substring_present = True
break
# If the substring is not present in any word, we found our answer.
if not is_substring_present:
return substring
return None
Case | How to Handle |
---|---|
Input array is null or empty | Return an empty string as there are no strings to compare, or throw an IllegalArgumentException based on requirements. |
Array contains an empty string | Treat the empty string as a valid string for comparison purposes; shortest uncommon substring will never be ''. |
All strings in the array are identical | The shortest uncommon substring will be the first character of the string, or an empty string if the string is empty. |
No uncommon substring exists (all substrings are common) | Return an empty string or a specific indicator (e.g., null, -1) to signal the absence of a solution based on the spec. |
Array contains very long strings (approaching memory limits) | Be mindful of memory usage when generating substrings and consider a streaming approach if memory is constrained. |
Array contains strings with special characters or Unicode | Ensure that the substring comparison and generation logic correctly handles these characters. |
Multiple shortest uncommon substrings exist | Return any one of the shortest uncommon substrings or all of them depending on the specification. |
Integer overflow when calculating substring lengths | Use appropriate data types (e.g., long) to prevent potential overflow when comparing lengths, or ensure strings are limited in length. |