Taro Logo

Count Vowel Strings in Ranges

Medium
PayPal logo
PayPal
0 views
Topics:
ArraysStringsTwo Pointers

You are given a 0-indexed array of strings words and a 2D array of integers queries.

Each query queries[i] = [li, ri] asks us to find the number of strings present at the indices ranging from li to ri (both inclusive) of words that start and end with a vowel.

Return an array ans of size queries.length, where ans[i] is the answer to the ith query.

Note that the vowel letters are 'a', 'e', 'i', 'o', and 'u'.

Example 1:

Input: words = ["aba","bcb","ece","aa","e"], queries = [[0,2],[1,4],[1,1]]
Output: [2,3,0]
Explanation: The strings starting and ending with a vowel are "aba", "ece", "aa" and "e".
The answer to the query [0,2] is 2 (strings "aba" and "ece").
to query [1,4] is 3 (strings "ece", "aa", "e").
to query [1,1] is 0.
We return [2,3,0].

Example 2:

Input: words = ["a","e","i"], queries = [[0,2],[0,1],[2,2]]
Output: [3,2,1]
Explanation: Every string satisfies the conditions, so we return [3,2,1].

Constraints:

  • 1 <= words.length <= 105
  • 1 <= words[i].length <= 40
  • words[i] consists only of lowercase English letters.
  • sum(words[i].length) <= 3 * 105
  • 1 <= queries.length <= 105
  • 0 <= li <= ri < words.length

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the size of `words` and `queries` arrays and the length of the strings within `words`?
  2. Can the `start_i` and `end_i` indices in `queries` be out of bounds (e.g., negative or greater than the length of `words` - 1)? What should I return in that case?
  3. Are the strings in `words` guaranteed to contain only lowercase English letters, or can they contain uppercase letters, numbers, or special characters?
  4. Is the `queries` array guaranteed to be valid, meaning that `start_i` will always be less than or equal to `end_i` for each query?
  5. If a string in `words` is empty, should it be considered to start and end with a vowel?

Brute Force Solution

Approach

The goal is to figure out, for different groups of words in a bigger list of words, how many start and end with vowels. The basic way is to check each group individually to see if it fits the rules.

Here's how the algorithm would work step-by-step:

  1. For each group of words we're interested in, we'll look at the first word in the group.
  2. We'll check if that first word starts with a vowel. If it doesn't, we move on to the next word in that group.
  3. If it does start with a vowel, we'll then look at the very last word in that group.
  4. We'll check if that last word ends with a vowel. If it doesn't, we move on to the next group.
  5. If both the first and last words in the group start and end with vowels, we increase our count.
  6. We do this for every single group of words we're interested in.
  7. Finally, we report the number of groups where both the first and last words start and end with vowels.

Code Implementation

def count_vowel_strings_in_ranges_brute_force(words, queries):
    result = []
    for query in queries:
        start_index = query[0]
        end_index = query[1]
        count = 0

        # Iterate through each range defined by the queries
        for i in range(start_index, end_index + 1):
            first_word = words[i]
            last_word = words[i]
            vowels = "aeiou"

            # Check if the word starts with a vowel
            if first_word[0] in vowels:

                #Check if the word ends with a vowel
                if last_word[-1] in vowels:
                    count += 1

        result.append(count)
    return result

Big(O) Analysis

Time Complexity
O(m * k)The outer loop iterates through 'm' ranges provided in the input. For each range, the algorithm checks the first and last word within that range. Checking if a word starts or ends with a vowel takes constant time, O(1). In the worst case, we iterate through all k words specified by the range. Thus, we have O(1) operation inside a loop that runs at most k times for each of the m ranges. Therefore the overall time complexity is O(m * k).
Space Complexity
O(1)The algorithm iterates through ranges and checks the first and last words in each range. It uses a constant number of variables to store the count of vowel strings and possibly some temporary variables within each iteration. Since the number of variables used does not depend on the number of words or ranges (N), the space complexity is constant. No auxiliary data structures that scale with the input size are used.

Optimal Solution

Approach

The efficient approach involves pre-calculating some helpful information to avoid redundant work. It focuses on quickly answering each question about vowel counts in specified sections, rather than recalculating from scratch each time.

Here's how the algorithm would work step-by-step:

  1. First, create a new list that stores how many words starting with vowels are present up to each position in the original list of words.
  2. For example, the first number in this new list tells you how many vowel-starting words there are up to the first word. The second number tells you how many vowel-starting words there are up to the second word, and so on.
  3. To answer a question about the count of vowel-starting words in a particular range, subtract the count at the beginning of the range from the count at the end of the range. This quickly gives you the answer without having to individually check each word in that range.
  4. Take into account the edge case where the left value of the range is the start, so that you can account for that vowel count

Code Implementation

def count_vowel_strings_in_ranges(words, queries):
    prefix_vowel_counts = [0] * (len(words) + 1)
    vowels = "aeiou"

    # Create prefix sum array for efficient range queries
    for index in range(len(words)):
        prefix_vowel_counts[index + 1] = prefix_vowel_counts[index] 
        if words[index][0] in vowels:
            prefix_vowel_counts[index + 1] += 1

    results = []
    for start, end in queries:
        # Use precomputed counts to answer each query
        count = prefix_vowel_counts[end + 1] - prefix_vowel_counts[start]

        results.append(count)

    return results

Big(O) Analysis

Time Complexity
O(n + m)The algorithm first iterates through the input list of words of size n to create a prefix sum array storing the cumulative count of vowel-starting words. This takes O(n) time. Then, it iterates through the list of m ranges, and for each range, it performs a constant-time subtraction operation to find the count of vowel-starting words in that range. This range processing takes O(m) time. Therefore, the overall time complexity is O(n + m), where n is the number of words and m is the number of ranges.
Space Complexity
O(N)The algorithm creates a new list to store the cumulative counts of vowel-starting words up to each position in the original list. This new list has the same length as the original list of words, which we can denote as N. Therefore, the auxiliary space required is proportional to the number of words in the input list. Hence, the space complexity is O(N).

Edge Cases

CaseHow to Handle
words array is null or emptyReturn an empty list because no counts can be calculated.
queries array is null or emptyReturn an empty list as there are no queries to process.
A word in words is an empty stringTreat an empty string as not starting and ending with a vowel, resulting in zero count.
A word in words is a single character that is a vowelIt should be counted as a valid string.
A word in words contains non-alphabetic charactersThe solution should only consider the first and last alphabetic characters of the word when checking for vowels.
queries array contains out-of-bounds indicesCheck that start_i and end_i are within the bounds of the words array (0 <= start_i <= end_i < words.length) and handle invalid indices appropriately, like clamping them to array boundaries or skipping such queries.
queries array contains start_i > end_iHandle such cases by either swapping the start and end indices or returning 0 count for that query.
Large input sizes (large words array or large queries array)Optimize the solution for time complexity, potentially using prefix sums or memoization to efficiently compute the counts for each query.