Find the Longest Substring Containing Vowels in Even Counts

Medium
12 days ago

Given the string s, return the size of the longest substring containing each vowel an even number of times. That is, 'a', 'e', 'i', 'o', and 'u' must appear an even number of times.

 

Example 1:

Input: s = "eleetminicoworoep"
Output: 13
Explanation: The longest substring is "leetminicowor" which contains two each of the vowels: e, i and o and zero of the vowels: a and u.

Example 2:

Input: s = "leetcodeisgreat"
Output: 5
Explanation: The longest substring is "leetc" which contains two e's.

Example 3:

Input: s = "bcbcbc"
Output: 6
Explanation: In this case, the given string "bcbcbc" is the longest because all vowels: a, e, i, o and u appear zero times.

 

Constraints:

  • 1 <= s.length <= 5 x 10^5
  • s contains only lowercase English letters.
Sample Answer
def longest_substring_with_even_vowels(s: str) -> int:
    """Calculates the length of the longest substring containing each vowel an even number of times.

    Args:
        s: The input string consisting of lowercase English letters.

    Returns:
        The length of the longest substring that satisfies the condition.
    """
    vowels = "aeiou"
    vowel_indices = {vowel: i for i, vowel in enumerate(vowels)}
    mask_map = {0: -1}
    mask = 0
    max_len = 0

    for i, char in enumerate(s):
        if char in vowel_indices:
            vowel_index = vowel_indices[char]
            mask ^= (1 << vowel_index)

        if mask in mask_map:
            max_len = max(max_len, i - mask_map[mask])
        else:
            mask_map[mask] = i

    return max_len


# Example Usage and Tests:
if __name__ == '__main__':
    # Example 1
    s1 = "eleetminicoworoep"
    result1 = longest_substring_with_even_vowels(s1)
    print(f"Input: {s1}, Output: {result1}")  # Output: 13

    # Example 2
    s2 = "leetcodeisgreat"
    result2 = longest_substring_with_even_vowels(s2)
    print(f"Input: {s2}, Output: {result2}")  # Output: 5

    # Example 3
    s3 = "bcbcbc"
    result3 = longest_substring_with_even_vowels(s3)
    print(f"Input: {s3}, Output: {result3}")  # Output: 6

    # Additional test cases
    s4 = "aeiouaeiou"
    result4 = longest_substring_with_even_vowels(s4)
    print(f"Input: {s4}, Output: {result4}")  # Output: 10

    s5 = "aeeiou"
    result5 = longest_substring_with_even_vowels(s5)
    print(f"Input: {s5}, Output: {result5}")  # Output: 2 (the substring "ee")

    s6 = ""
    result6 = longest_substring_with_even_vowels(s6)
    print(f"Input: {s6}, Output: {result6}")  # Output: 0

    s7 = "abacaba"
    result7 = longest_substring_with_even_vowels(s7)
    print(f"Input: {s7}, Output: {result7}")  # Output: 7 (four a's, so the whole string qualifies)

Naive Approach

A brute-force approach checks every substring of the given string against the even-vowel-count condition. There are O(n^2) substrings, and counting the vowels of each one from scratch costs O(n), for O(n^3) time overall; updating the counts incrementally while extending a substring reduces this to O(n^2). With n up to 5 x 10^5, either bound is far too slow.
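As a rough baseline for comparison, the brute-force idea can be sketched as follows. The function name longest_even_vowels_bruteforce and the vowel_bit table are illustrative assumptions, not part of the sample answer, and this sketch uses the incremental O(n^2) variant rather than recounting each substring.

```python
def longest_even_vowels_bruteforce(s: str) -> int:
    """Hypothetical baseline: try every start index and extend one character at a time."""
    # Assumed encoding: one bit per vowel.
    vowel_bit = {"a": 1, "e": 2, "i": 4, "o": 8, "u": 16}
    best = 0
    for start in range(len(s)):
        parity = 0  # a set bit means that vowel has occurred an odd number of times so far
        for end in range(start, len(s)):
            parity ^= vowel_bit.get(s[end], 0)  # consonants leave the parity unchanged
            if parity == 0:  # every vowel count in s[start:end + 1] is even
                best = max(best, end - start + 1)
    return best


print(longest_even_vowels_bruteforce("leetcodeisgreat"))
```

A baseline like this is mainly useful for cross-checking a faster solution on short inputs.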

Optimal Approach

To solve this problem efficiently, we can combine bit manipulation with prefix parity (a prefix sum of vowel counts taken modulo 2). Here's the breakdown:

  1. Vowel Mapping: Assign each vowel a unique bit position (e.g., a=0, e=1, i=2, o=3, u=4).
  2. Mask Creation: Iterate through the string. If a vowel is encountered, toggle the corresponding bit in a mask. The mask represents the parity (even or odd) of vowel counts.
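  3. Hash Map: Store each mask with the first index at which it occurs, seeding the map with mask 0 at index -1 to represent the empty prefix. If the same mask appears again at a later index, every vowel occurs an even number of times between the two indices, so the substring that starts just after the first index and ends at the current index is valid.
  4. Maximum Length: Whenever a mask repeats, the valid substring has length (current index - first index of that mask). Keep track of the maximum such length found so far.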
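To make the steps above concrete, here is a small trace of the prefix masks for Example 2. The helper prefix_masks is a hypothetical name introduced only for this illustration; it uses the same bit assignment as step 1 (a=0, e=1, i=2, o=3, u=4).

```python
from typing import List


def prefix_masks(s: str) -> List[int]:
    """Return the vowel-parity mask of every prefix; masks[0] is the empty prefix."""
    bit = {v: 1 << i for i, v in enumerate("aeiou")}
    masks = [0]
    for ch in s:
        masks.append(masks[-1] ^ bit.get(ch, 0))  # consonants repeat the previous mask
    return masks


masks = prefix_masks("leetcodeisgreat")
for length, m in enumerate(masks):
    print(f"prefix length {length:2d}: {m:05b}")

# Two equal masks at prefix lengths p < q mean s[p:q] has even counts of every vowel.
# list.index returns the first occurrence, which gives the longest span ending at q.
longest = max(q - masks.index(m) for q, m in enumerate(masks))
print(longest)
```

In this trace the empty prefix and the prefix of length 5 both have mask 00000, which corresponds to the substring "leetc" from Example 2.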

Big(O) Run-Time Analysis

  • Time Complexity: O(n), where n is the length of the string. The algorithm iterates through the string once. Hash map operations (lookup and insertion) take O(1) time on average.

    • The outer loop iterates through each character of the string s, which takes O(n) time.
    • Inside the loop, the operations such as checking if a character is in vowel_indices, XORing the mask, and hash map lookups/insertions take O(1) time on average.
    • Therefore, the overall time complexity is dominated by the loop, resulting in O(n).

Big(O) Space Usage Analysis

  • Space Complexity: O(1). More precisely, the extra space is O(2^k), where k is the number of vowels; with k = 5 that is at most 32 map entries, independent of the input length.
    • The vowel_indices dictionary stores the indices of vowels, which takes O(1) space because the number of vowels is constant.
    • The mask_map hash map stores at most 2^5 = 32 different masks (since each of the 5 vowels can be either even or odd), which is constant space.
    • The mask variable takes constant space.
    • Therefore, the space complexity is O(1).
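
Because there are only 32 possible masks, the hash map could plausibly be swapped for a fixed-size list. The following is a sketch of that variant under stated assumptions, not the sample answer itself: the name longest_even_vowels_array and the UNSEEN sentinel are introduced here for illustration.

```python
def longest_even_vowels_array(s: str) -> int:
    """Hypothetical variant that keeps first-seen indices in a 32-slot list."""
    UNSEEN = -2  # assumed sentinel; -1 is already used for the empty prefix
    first_seen = [UNSEEN] * 32  # one slot per parity mask (2**5 states)
    first_seen[0] = -1  # the empty prefix has mask 0
    bit = {"a": 1, "e": 2, "i": 4, "o": 8, "u": 16}
    mask = 0
    best = 0
    for i, ch in enumerate(s):
        mask ^= bit.get(ch, 0)  # toggle the vowel's bit; consonants toggle nothing
        if first_seen[mask] == UNSEEN:
            first_seen[mask] = i  # record only the earliest index for each mask
        else:
            best = max(best, i - first_seen[mask])
    return best


print(longest_even_vowels_array("eleetminicoworoep"))
```

The list makes the constant bound explicit: its size never changes, whatever the length of s.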

Edge Cases

  1. Empty String: The constraints guarantee at least one character, but if an empty string is passed anyway, the function returns 0 because the loop never runs and there are no non-empty substrings to evaluate.
  2. String with No Vowels: If the string contains no vowels, the entire string is a valid substring, and its length should be returned.
  3. String with Only Even Vowel Counts: If the entire string has an even number of each vowel, its length should be returned.
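  4. No Valid Non-Empty Substring: The function returns 0 only when no non-empty substring has an even count of every vowel, for example "a" or "aeiou". Any consonant is by itself a valid substring of length 1, so this case can arise only when the string consists entirely of vowels.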
  5. Very Long Strings: The algorithm should efficiently handle very long strings (up to 5 x 10^5 characters) without exceeding time or memory limits.