Taro Logo

Find Longest Awesome Substring

Hard
Google logo
Google
2 views
Topics:
StringsBit Manipulation

You are given a string s consisting of digits. An awesome substring is a non-empty substring of s such that we can make any number of swaps in order to make it a palindrome. Return the length of the maximum length awesome substring of s.

For example:

  • s = "3242415". The output is 5 because "24241" is the longest awesome substring, and we can form the palindrome "24142" with some swaps.
  • s = "12345678". The output is 1, since no substring has all even counts except single digits.
  • s = "213123". The output is 6 because "213123" is the longest awesome substring, and we can form the palindrome "231132" with some swaps.

Write an algorithm to efficiently find the length of the maximum length awesome substring.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string?
  2. Is the input string guaranteed to contain only digits?
  3. What should I return if no awesome substring is found?
  4. By 'awesome substring', do you mean a substring where at most one digit appears an odd number of times?
  5. If there are multiple awesome substrings of the same maximum length, can I return any one of them?

Brute Force Solution

Approach

The brute force approach to this problem is all about trying every possible substring within the given string. We check each substring to see if it meets the 'awesome' criteria, and then keep track of the longest one we've found so far.

Here's how the algorithm would work step-by-step:

  1. Start by considering every possible substring of the input string.
  2. This means checking substrings of length one, then length two, then length three, and so on, all the way up to the full length of the string.
  3. For each substring, determine if it is an 'awesome' substring based on some given definition of what makes a string 'awesome'.
  4. If the substring is awesome, compare its length to the length of the longest awesome substring found so far.
  5. If the current awesome substring is longer than the longest one found so far, update the longest awesome substring to be the current substring.
  6. Once all possible substrings have been examined, the longest awesome substring that was found is the result.

Code Implementation

def find_longest_awesome_substring_brute_force(input_string):
    longest_awesome_substring = ""

    for substring_length in range(1, len(input_string) + 1):
        for start_index in range(len(input_string) - substring_length + 1):
            end_index = start_index + substring_length
            current_substring = input_string[start_index:end_index]

            # The substring has to be checked for being awesome
            if is_awesome_substring(current_substring):

                # We must update if current is larger
                if len(current_substring) > len(longest_awesome_substring):

                    longest_awesome_substring = current_substring

    return longest_awesome_substring

def is_awesome_substring(substring):
    digit_counts = [0] * 10
    for character in substring:
        digit = int(character)
        digit_counts[digit] += 1

    odd_count = 0
    for count in digit_counts:
        if count % 2 != 0:
            odd_count += 1

    # An awesome string can have at most one digit with an odd count.
    return odd_count <= 1

Big(O) Analysis

Time Complexity
O(n^3)The algorithm iterates through all possible substrings of the input string. Generating all substrings takes O(n^2) time because for each starting position (n possibilities), there are approximately n ending positions. For each substring, we need to check if it's 'awesome', which, based on the problem definition (not provided but implied by the strategy), involves examining the substring's characters. Assuming this check takes O(n) time (in the worst case where we need to iterate through the entire substring), the overall time complexity becomes O(n^2) * O(n) = O(n^3).
Space Complexity
O(1)The brute force approach described only iterates through substrings using loop indices. No auxiliary data structures like arrays, hash maps, or sets are created to store intermediate results or visited substrings. The algorithm only needs a few constant-size variables to track loop indices and the longest awesome substring found so far. Thus, the space used does not depend on the size of the input string, N, and remains constant.

Optimal Solution

Approach

The key to finding the longest awesome substring efficiently lies in understanding that an awesome substring has at most one digit appearing an odd number of times. We use a clever trick to keep track of which digits appear an odd number of times so far and quickly check if a substring is awesome.

Here's how the algorithm would work step-by-step:

  1. Imagine each digit from 0 to 9 represents a light switch. We start with all switches off.
  2. As we go through the string, for each digit we see, we flip the corresponding switch. If a digit appears an even number of times, its switch will be off. If it appears an odd number of times, its switch will be on.
  3. We use a special number (a bitmask) to represent the state of all the switches. Each digit corresponds to a specific position in this number.
  4. We keep track of all the bitmask values we've seen so far, along with where we saw them.
  5. For each point in the string, we look back to see if there's a previous point where either the bitmask is the same (meaning all digits have appeared an even number of times in between), or the bitmask differs by only one switch being flipped (meaning only one digit has appeared an odd number of times in between).
  6. If we find such a point, the substring between those points is an awesome substring. We remember the longest awesome substring we've found so far.

Code Implementation

def find_longest_awesome_substring(input_string):
    string_length = len(input_string)
    first_occurrence = {0: -1}
    current_mask = 0
    max_length = 0

    # We iterate to generate bitmasks
    for index in range(string_length):
        digit = int(input_string[index])
        current_mask ^= (1 << digit)

        # If a mask has been seen before, update the max length
        if current_mask in first_occurrence:
            max_length = max(max_length, index - first_occurrence[current_mask])
        else:
            first_occurrence[current_mask] = index

        # Check all possible one-digit flips
        for digit_to_flip in range(10):
            flipped_mask = current_mask ^ (1 << digit_to_flip)
            if flipped_mask in first_occurrence:
                max_length = max(max_length, index - first_occurrence[flipped_mask])

    return max_length

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the input string of length n once. Inside the loop, a constant amount of work is performed: updating the bitmask, checking the hashtable for previous occurrences of the current bitmask and variations with one bit flipped, and updating the maximum length. Because the work done inside the loop is constant and the loop runs n times, the overall time complexity is O(n).
Space Complexity
O(1)The algorithm uses a hash map (or similar data structure) to store bitmask values and their first occurrences. In the worst case, it might store a bitmask for each index in the string. Since each digit from 0 to 9 are represented by a bit and there is at most one odd occurring digit allowed in each substring, the maximum number of distinct bitmasks is limited to 2^10 (1024). Thus, the space used by the hash map is independent of the input string's length N. Therefore the space complexity is O(1).

Edge Cases

CaseHow to Handle
Null or empty input stringReturn 0, as there's no substring to evaluate.
Input string with only one characterReturn 1, as a single character is trivially an awesome substring.
Input string with maximum allowed length (scalability)The bitmask approach with prefix XOR handles large inputs efficiently due to its O(N) time complexity, avoiding timeouts.
Input string containing all identical charactersThe algorithm correctly identifies awesome substrings even with repeated characters, as it depends on odd/even counts.
Input string where no awesome substring existsThe algorithm correctly calculates distances between odd-count states and returns 0 if no awesome substring exists, which happens when no two prefixes have same bitmask.
Input with multiple longest awesome substringsThe algorithm finds the *longest* awesome substring, not necessarily all of them, but correctly returns the length of one longest substring.
Integer overflow in length calculations (language-specific)Since the problem states maximum string length of 10^5, integer overflow is not a concern when calculating length differences.
Input string contains non-numeric charactersThe problem states input string consists of digits, so non-numeric characters are not considered, and solution directly casts each character to integer.