Taro Logo

Find the Longest Substring Containing Vowels in Even Counts

Medium
Meta logo
Meta
1 view
Topics:
StringsBit ManipulationArrays

Given the string s, return the size of the longest substring containing each vowel an even number of times. That is, 'a', 'e', 'i', 'o', and 'u' must appear an even number of times.

For example:

s = "eleetminicoworoep" Output: 13 Explanation: The longest substring is "leetminicowor" which contains two each of the vowels: e, i and o and zero of the vowels: a and u.

s = "leetcodeisgreat" Output: 5 Explanation: The longest substring is "leetc" which contains two e's.

s = "bcbcbc" Output: 6 Explanation: In this case, the given string "bcbcbc" is the longest because all vowels: a, e, i, o and u appear zero times.

Constraints:

  • 1 <= s.length <= 5 x 10^5
  • s contains only lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum possible length of the input string `s`?
  2. What should I return if the input string `s` is empty?
  3. What should I return if no substring exists within `s` that contains each vowel an even number of times?
  4. Does the input string `s` only contain lowercase English letters, or can it contain other characters?
  5. By substring, do you mean a contiguous sequence of characters within the string `s`?

Brute Force Solution

Approach

The brute force method for this problem involves examining every possible substring within the given string. For each substring, we count the number of times each vowel appears. We then check if all vowels appear an even number of times, and if so, we compare the length of that substring with the longest valid substring found so far.

Here's how the algorithm would work step-by-step:

  1. Consider every possible section of the main string, starting from short sections and gradually increasing their length.
  2. For each of these sections, count how many times each vowel (a, e, i, o, u) appears.
  3. Check if each vowel appears an even number of times in that section. For example, zero, two, or four times are all even.
  4. If all vowels appear an even number of times, compare the length of this section with the length of the longest section we've found so far that also meets this even-vowel count criteria.
  5. If the current section is longer, remember its length (and optionally the section itself) as the new longest valid section.
  6. After checking all possible sections, the section we remembered as the longest one is the answer.

Code Implementation

def find_longest_substring_with_even_vowel_counts_brute_force(input_string):
    longest_substring_length = 0

    for substring_start_index in range(len(input_string)):
        for substring_end_index in range(substring_start_index, len(input_string)):
            substring = input_string[substring_start_index:substring_end_index + 1]

            # Initialize vowel counts for this substring
            vowel_counts = {
                'a': 0,
                'e': 0,
                'i': 0,
                'o': 0,
                'u': 0
            }

            for char in substring:
                if char in 'aeiou':
                    vowel_counts[char] += 1

            # Check if all vowel counts are even
            all_vowels_even = True

            for vowel_count in vowel_counts.values():
                if vowel_count % 2 != 0:
                    all_vowels_even = False
                    break

            # If all vowels appear an even number of times,
            # update the longest substring length if needed
            if all_vowels_even:

                current_substring_length = len(substring)

                # Check if this is the longest valid substring we've seen
                if current_substring_length > longest_substring_length:
                    longest_substring_length = current_substring_length

    return longest_substring_length

Big(O) Analysis

Time Complexity
O(n³)The described brute force algorithm iterates through all possible substrings of the input string of length n. There are approximately n * (n+1) / 2 possible substrings, which is O(n²). For each substring, we count the occurrences of the five vowels, which takes O(n) time in the worst case (when the substring is the whole string). Thus, the total time complexity is O(n² * n), which simplifies to O(n³).
Space Complexity
O(1)The provided brute force solution examines substrings and counts vowel occurrences. To count the vowels within each substring, it uses a fixed number of integer variables (one for each vowel) irrespective of the input string's length. Since the number of vowels is constant (five), the space required for the vowel counts remains constant, regardless of the input size N (the length of the string). No additional data structures that scale with the input size are used. Thus, the space complexity is O(1).

Optimal Solution

Approach

The efficient approach focuses on tracking the parity (even or odd) of vowel counts within the substring. By cleverly using this information, we can identify potential longest substrings that satisfy the even count requirement, avoiding the need to check every possible substring.

Here's how the algorithm would work step-by-step:

  1. Imagine you have a record of whether each vowel has appeared an even or odd number of times so far in the string. Think of this record as a unique identifier.
  2. Start at the beginning of the string and move character by character. Update your vowel count parity record as you go.
  3. If you ever encounter the same vowel count parity record twice, it means the substring between those two occurrences has an even number of each vowel.
  4. Keep track of the longest such substring you find as you move through the string.
  5. To make it fast, store where you first saw each unique vowel count parity record so that you can quickly calculate the length of a valid substring when you see the same record again.
  6. The longest valid substring is your answer.

Code Implementation

def find_longest_substring_containing_vowels_in_even_counts(input_string):
    vowel_indices = {'a': 0, 'e': 1, 'i': 2, 'o': 3, 'u': 4}
    string_length = len(input_string)
    max_length = 0

    # Store the first occurrence of each vowel parity record.
    first_occurrence = {0: -1}
    vowel_parity = 0

    for index in range(string_length):
        character = input_string[index]

        if character in vowel_indices:
            vowel_index = vowel_indices[character]
            vowel_parity ^= (1 << vowel_index)

            # If this parity has been seen before
            if vowel_parity in first_occurrence:
                max_length = max(max_length, index - first_occurrence[vowel_parity])
            else:
                # Store the index of first occurance
                first_occurrence[vowel_parity] = index

        else:
            #if parity not seen store the index
            if vowel_parity not in first_occurrence:
                first_occurrence[vowel_parity] = index

            if vowel_parity in first_occurrence:
                max_length = max(max_length, index - first_occurrence[vowel_parity])

    return max_length

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string of length n once to update the vowel count parity record. A hash map (or similar data structure) is used to store the first occurrence of each unique vowel count parity record. Hash map operations (lookup, insertion) take O(1) time on average. Since the string is traversed only once and each character is processed in constant time, the overall time complexity is O(n).
Space Complexity
O(1)The algorithm uses a hash map (or an array acting as a hash map) to store the first occurrence of each vowel count parity record. Each vowel can be either even or odd, so there are 2 possible states for each vowel. With 5 vowels, there are 2^5 = 32 possible parity records. The hash map therefore stores at most 32 entries, regardless of the length N of the input string. Thus, the auxiliary space is constant.

Edge Cases

CaseHow to Handle
Null or empty input stringReturn 0 immediately, as an empty string trivially satisfies the condition of even vowel counts.
Input string with no vowelsThe entire string is a valid substring since all vowels have a count of 0 (even), so return the string's length.
Input string with all vowels having even countsThe entire string is a valid substring, so return the string's length.
Input string of length 1If the single character is a vowel, return 0; otherwise, return 1.
Input string with only one vowel appearing an odd number of timesThe longest valid substring will exclude that single vowel, requiring us to search for valid substrings within the string.
Input string with all vowels appearing an odd number of timesThe longest valid substring would need to exclude at least one occurrence of each vowel to make the counts even; our algorithm needs to efficiently explore all substrings.
Long input string to ensure the algorithm has O(n) time complexity.Prefix sum and bit manipulation allows us to determine if a given substring has even counts of all vowels, so the runtime should be O(n).
Input string with a vowel appearing many times consecutively at the beginning or end.The algorithm should handle consecutive vowels correctly, updating the bitmask based on the vowel counts so the substring with the longest valid length is correctly computed.