Taro Logo

Count Binary Substrings #1 Most Asked

Easy
12 views
Topics:
ArraysStringsTwo Pointers

Given a binary string s, return the number of non-empty substrings that have the same number of 0's and 1's, and all the 0's and all the 1's in these substrings are grouped consecutively.

Substrings that occur multiple times are counted the number of times they occur.

Example 1:

Input: s = "00110011"
Output: 6
Explanation: There are 6 substrings that have equal number of consecutive 1's and 0's: "0011", "01", "1100", "10", "0011", and "01".
Notice that some of these substrings repeat and are counted the number of times they occur.
Also, "00110011" is not a valid substring because all the 0's (and 1's) are not grouped together.

Example 2:

Input: s = "10101"
Output: 4
Explanation: There are 4 substrings: "10", "01", "10", "01" that have equal number of consecutive 1's and 0's.

Constraints:

  • 1 <= s.length <= 105
  • s[i] is either '0' or '1'.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string `s`?
  2. Can the input string `s` be empty or null?
  3. Are there any characters other than '0' and '1' that can appear in the input string?
  4. If no valid substrings are found, should I return 0?
  5. Could you provide an example of an input string that would result in a specific output, to confirm my understanding?

Brute Force Solution

Approach

To solve this problem the brute force way, we look at every possible chunk of the input string. We will check each chunk to see if it satisfies a certain requirement, incrementing a counter if it does. Finally, the counter will represent the number of valid substrings.

Here's how the algorithm would work step-by-step:

  1. Consider every possible substring of the input string, starting with the shortest ones.
  2. For each substring, check if it is made up of two consecutive groups, one with all zeros and one with all ones, or vice versa (all ones followed by all zeros).
  3. For each valid substring, also verify if the sizes of the groups are equal.
  4. If a substring is both a consecutive group of zeros and ones and has equal-sized groups, we've found a valid binary substring, so we increment our total count.
  5. Repeat this process until we've checked all possible substrings.
  6. The final count represents the total number of binary substrings that meet the requirement.

Code Implementation

def count_binary_substrings_brute_force(input_string):
    number_of_binary_substrings = 0
    string_length = len(input_string)

    for substring_length in range(2, string_length + 1):
        for starting_index in range(string_length - substring_length + 1):
            substring = input_string[starting_index:starting_index + substring_length]
            
            #Check if the substring is valid and increment if so
            if is_valid_binary_substring(substring):
                number_of_binary_substrings += 1

    return number_of_binary_substrings

def is_valid_binary_substring(substring):
    group_size = 0
    
    # Determine the expected length of the groups for equality check
    if len(substring) % 2 != 0:
        return False
    group_size = len(substring) // 2

    first_char = substring[0]
    first_group = ''
    second_group = ''

    # Split the substring into two groups
    for index in range(len(substring)):
        if index < group_size:
            first_group += substring[index]
        else:
            second_group += substring[index]

    # Ensure both groups are not empty
    if not first_group or not second_group:
        return False

    #Ensure both groups are of same character type
    if not all(bit == first_group[0] for bit in first_group):
        return False
    if not all(bit == second_group[0] for bit in second_group):
        return False

    # Check if the groups are composed of alternating bits
    if first_group[0] == second_group[0]:
        return False

    return True

Big(O) Analysis

Time Complexity
O(n³)The algorithm iterates through all possible substrings of the input string. Generating each substring takes O(n) time because we need to copy a portion of the original string of length n. Since there are approximately n²/2 possible substrings (derived from a nested loop-like structure for start and end indices of the substring), the total time spent generating substrings becomes O(n³). For each substring, we validate if it meets the criteria of having equal sized groups of consecutive 0s and 1s, and this validation involves iterating through the substring, which takes O(n) time in the worst case. Therefore the overall time complexity is O(n³).
Space Complexity
O(1)The provided brute-force algorithm iterates through all possible substrings of the input string but does not utilize any auxiliary data structures to store intermediate results, track visited substrings, or build temporary collections. Only a few constant-size variables, such as the substring indices and a counter, are used. Thus, the algorithm's space complexity is independent of the input string's length N. Therefore, the auxiliary space complexity is O(1).

Optimal Solution

Approach

The optimal strategy focuses on identifying consecutive groups of similar digits. Instead of checking every possible substring, it leverages the pattern that valid substrings must have an equal count of adjacent 0s and 1s, and these groups must be adjacent to each other.

Here's how the algorithm would work step-by-step:

  1. First, identify and count the lengths of consecutive groups of 0s and 1s in the given string. For example, '00111001' would become groups of size 2, 3, 2, and 1.
  2. Then, consider each adjacent pair of these group lengths. For each pair, take the minimum of the two group lengths.
  3. Add up all these minimum values. This total is the number of valid binary substrings.

Code Implementation

def countBinarySubstrings(binary_string):
    group_lengths = []
    current_index = 0

    while current_index < len(binary_string):
        current_character = binary_string[current_index]
        current_length = 0

        # Count consecutive characters
        while current_index < len(binary_string) and binary_string[current_index] == current_character:
            current_length += 1
            current_index += 1

        group_lengths.append(current_length)

    number_of_substrings = 0

    # Iterate through adjacent group lengths
    for index in range(1, len(group_lengths)): 
        #Find the minimum length of adjacent groups
        number_of_substrings += min(group_lengths[index - 1], group_lengths[index])

    return number_of_substrings

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the input string of length n once to identify and count consecutive groups of 0s and 1s. This group counting step takes O(n) time. Subsequently, it iterates through the array of group lengths, performing a minimum operation for each adjacent pair. Since there are at most n groups (as the string consists of n characters), this second step also takes O(n) time. Therefore, the overall time complexity is dominated by these linear iterations, resulting in O(n) complexity.
Space Complexity
O(N)The dominant space usage comes from storing the lengths of consecutive groups of 0s and 1s in the input string. In the worst-case scenario, the input string alternates between 0s and 1s (e.g., '010101...'), resulting in N groups, where N is the length of the input string. Therefore, we would need to store N group lengths. Consequently, the auxiliary space is proportional to the input size, leading to O(N) space complexity.

Edge Cases

Empty or null input string
How to Handle:
Return 0, as there are no substrings to evaluate.
String with a single character
How to Handle:
Return 0, as a single character cannot form a valid substring.
String with only one type of character (e.g., '0000' or '1111')
How to Handle:
Return 0, because no valid substring composed of consecutive groups of 0's and 1's can be formed.
String alternates characters rapidly (e.g., '01010101')
How to Handle:
This maximizes the number of valid substrings, and the algorithm should correctly count them.
String with a very long sequence of the same character followed by a very long sequence of the other character (e.g., '00000000001111111111')
How to Handle:
The number of valid substrings is limited by the shorter sequence length, and the algorithm should correctly calculate this.
Maximum input string length (scalability)
How to Handle:
The solution should have a linear time complexity (O(n)) to handle large input strings efficiently.
Input string contains characters other than '0' or '1'
How to Handle:
The solution should either throw an error or gracefully handle the unexpected characters, perhaps by ignoring them or replacing them with a default value and proceeding.
Integer overflow when calculating the number of substrings in a very long alternating string.
How to Handle:
Ensure that the data type used to store the count of substrings is large enough (e.g., long) to prevent integer overflow.
0/0 completed