Taro Logo

Number of Equal Count Substrings

Medium
Cisco logo
Cisco
2 views
Topics:
StringsSliding Windows

You are given a string s and an integer count. A string is considered good if it contains exactly count occurrences of every character that appears in it.

Return the number of good substrings of s.

Note:

  • A substring is a contiguous sequence of characters within a string.
  • The string s consists of only lowercase English letters.

Example 1:

Input: s = "abcabcabc", count = 3
Output: 3
Explanation: The substrings "abcabcabc", "bcabcabca" and "cabcabcab" are good.

Example 2:

Input: s = "abacaba", count = 1
Output: 4
Explanation: The substrings "abacaba", "bacaba", "acaba" and "caba" are good.

Example 3:

Input: s = "awwaawwaaw", count = 4
Output: 0
Explanation: There are no good substrings.

Constraints:

  • 1 <= s.length <= 105
  • 1 <= count <= s.length
  • s consists of only lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the length of the string `s` and the value of `count`?
  2. If no substring satisfies the condition, what should I return (e.g., 0)?
  3. Can the input string `s` be empty or null?
  4. Is `count` guaranteed to be a positive integer, or could it be zero or negative?
  5. To clarify, a substring must contain only unique characters that each appear *exactly* `count` times, correct?

Brute Force Solution

Approach

The brute force approach for this problem is straightforward: we examine every possible substring within the given string. For each substring, we count how many times each character appears, and then check if the counts are all equal.

Here's how the algorithm would work step-by-step:

  1. Consider every possible substring in the string.
  2. For each substring, count the occurrence of each distinct character.
  3. Compare the counts of each character. If the counts are all equal, then the substring satisfies the condition.
  4. Keep a running tally of the number of substrings that satisfy the condition.
  5. After checking every substring, return the final tally.

Code Implementation

def number_of_equal_count_substrings_brute_force(input_string):
    string_length = len(input_string)
    equal_count_substrings = 0

    for start_index in range(string_length):
        for end_index in range(start_index, string_length):
            substring = input_string[start_index : end_index + 1]

            # Create a dictionary to store letter counts for the substring
            letter_counts = {}
            for char in substring:
                letter_counts[char] = letter_counts.get(char, 0) + 1

            # If the substring is empty, move to the next one
            if not letter_counts:
                continue

            # Get the count of the first letter to compare against
            first_letter_count = list(letter_counts.values())[0]

            is_equal = True

            # Check if all letters have the same count
            for letter_count in letter_counts.values():
                if letter_count != first_letter_count:
                    is_equal = False
                    break

            # Increment total count if equal counts were found
            if is_equal:
                equal_count_substrings += 1

    return equal_count_substrings

Big(O) Analysis

Time Complexity
O(n³)The outer loop iterates through all possible starting positions of substrings, which takes O(n) time. The inner loop iterates through all possible ending positions for each starting position, leading to another O(n) factor. Inside the inner loop, counting the occurrences of each distinct character in the substring takes O(n) time in the worst case where all characters are distinct and the substring is long. Therefore, the overall time complexity is O(n * n * n), which simplifies to O(n³).
Space Complexity
O(1)The brute force approach iterates through all substrings and for each substring, counts the occurrences of each distinct character. These character counts are stored in a temporary data structure (e.g., a hash map or an array). However, the size of this data structure is bounded by the number of distinct characters, which is independent of the input string's length, N. Therefore, the auxiliary space used is constant, resulting in O(1) space complexity.

Optimal Solution

Approach

The most efficient way to find equal count substrings is to use a sliding window technique. We keep track of the counts of each character within the window and adjust the window's size until we find a substring where all characters appear the same number of times. This avoids repeatedly checking the same substrings.

Here's how the algorithm would work step-by-step:

  1. Start with a small window at the beginning of the string.
  2. Keep track of how many times each different character appears inside the window.
  3. Check if all the characters that appear in the window appear the same number of times.
  4. If the counts are equal, increase your count of equal substrings.
  5. Move the window one position to the right.
  6. If the counts aren't equal, increase the window size until you find matching counts, or the window can't grow further.
  7. Keep moving the window and adjusting the size until you reach the end of the string.

Code Implementation

def number_of_equal_count_substrings(input_string):
    string_length = len(input_string)
    equal_count_substrings = 0

    for i in range(string_length):
        character_counts = {}
        for j in range(i, string_length):
            current_character = input_string[j]
            character_counts[current_character] = character_counts.get(current_character, 0) + 1

            # Check if all characters have the same count.
            if len(character_counts) == 0:
                equal_count_substrings += 1
                continue

            first_count = list(character_counts.values())[0]
            is_equal_count = True
            for count in character_counts.values():
                if count != first_count:
                    is_equal_count = False
                    break

            if is_equal_count:
                equal_count_substrings += 1

    return equal_count_substrings

def number_of_equal_count_substrings_optimized(input_string):
    string_length = len(input_string)
    equal_count_substrings = 0
    
    for i in range(string_length):
        character_counts = {}
        for j in range(i, string_length):
            current_character = input_string[j]
            character_counts[current_character] = character_counts.get(current_character, 0) + 1
            
            # Avoid division by zero if the substring is empty
            if not character_counts:
                equal_count_substrings += 1
                continue
            
            # All chars must have the same count to be a equal substring
            character_values = list(character_counts.values())
            if len(set(character_values)) <= 1:
                equal_count_substrings += 1
                
    return equal_count_substrings

def number_of_equal_count_substrings_optimal(input_string):
    string_length = len(input_string)
    equal_count_substrings = 0
    
    # Use a dict to keep track of count differences for fast lookup.
    count_differences = {tuple([0] * 26): 1}
    current_counts = [0] * 26
    
    for i in range(string_length):
        character_index = ord(input_string[i]) - ord('a')
        current_counts[character_index] += 1
        
        # Store the character count differences as a tuple
        differences = tuple(current_counts[j] - current_counts[0] for j in range(26))
        
        # If we've seen this difference before, it's an equal count substring
        if differences in count_differences:
            equal_count_substrings += count_differences[differences]
            count_differences[differences] += 1

        # Adds new count to hashmap if it does not exist
        else:
            count_differences[differences] = 1

    return equal_count_substrings

def number_of_equal_count_substrings_best(input_string):
    string_length = len(input_string)
    equal_count_substrings = 0
    
    # Dictionary to store the count differences.
    count_differences = {tuple([0] * 26): 1}
    current_counts = [0] * 26
    
    for i in range(string_length):
        character_index = ord(input_string[i]) - ord('a')
        current_counts[character_index] += 1
        
        # Key decision point: store the diff between counts as a tuple
        differences = tuple(current_counts[j] - current_counts[0] for j in range(26))
        
        # If we've seen this difference before, add to equal count
        if differences in count_differences:
            equal_count_substrings += count_differences[differences]
            count_differences[differences] += 1
        else:
            count_differences[differences] = 1
            
    return equal_count_substrings

Big(O) Analysis

Time Complexity
O(n^2)The outer loop iterates through each starting position of the substring, resulting in n iterations in the worst case. The inner loop expands the sliding window, and in the worst-case scenario, it may iterate close to n times for each starting position to find a valid substring or reach the end of the string. Inside the inner loop, we have to count characters which takes O(k) time where k is window size; since k can go up to n, the character counting can cost up to O(n). As a result, we are performing close to n calculations for each of the n starting positions so the overall time complexity is O(n^2).
Space Complexity
O(1)The algorithm uses a sliding window and keeps track of the counts of each character within the window. The number of distinct characters is limited by the character set of the input string (e.g., ASCII or Unicode). Therefore, the space to store the character counts is constant, independent of the input string length N. The algorithm doesn't create any auxiliary data structures that scale with the input size N, resulting in O(1) auxiliary space.

Edge Cases

CaseHow to Handle
Null or empty string sReturn 0, as no substrings can be formed from an empty string.
count is zeroIf count is zero, any substring consisting of only unique characters would be valid, so handle this case appropriately, likely returning the number of substrings consisting only of distinct characters, or zero if disallowed.
count is greater than the length of the stringReturn 0, as no character can appear 'count' times within a string shorter than 'count'.
String contains only one unique characterCheck if string length is a multiple of the given count; return 1 if true, otherwise 0.
String contains all unique characters with length > countReturn 0, as no substring will have each unique char appearing exactly count times.
Large string s and small countThe solution must efficiently handle a large number of substrings without exceeding time limits, so consider sliding window or optimized counting strategies.
Maximum count and string length to test for integer overflowEnsure that integer arithmetic used for counting does not overflow by using appropriate data types (e.g., long).
String with highly skewed character distributionThe character counting mechanism must handle skewed distributions efficiently; a hashmap is a suitable choice.