Taro Logo

Count Substrings Without Repeating Character

Medium
Asked by:
Profile picture
19 views
Topics:
StringsSliding Windows

Given a string s, find the number of substrings that do not contain any repeating characters.

Example 1:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3. Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.

Constraints:

  • 0 <= s.length <= 5 * 104
  • s consists of English letters, digits, symbols and spaces.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string `s`? Are there any specific constraints on the characters that `s` can contain (e.g., ASCII, Unicode, lowercase English letters only)?
  2. Is an empty string a valid input, and if so, what should the return value be?
  3. By 'substring', do you mean a contiguous sequence of characters? (Just confirming it's not a subsequence.)?
  4. If the string contains only repeating characters (e.g., 'aaaa'), what should the returned count be?
  5. Can you provide a small example and the expected output to illustrate the problem clearly?

Brute Force Solution

Approach

To count substrings without repeating characters using brute force, we'll examine every possible substring within the given text. For each substring, we'll check if it contains any repeated characters, and if it doesn't, we'll count it.

Here's how the algorithm would work step-by-step:

  1. Start by considering every single character as a substring.
  2. Then, consider all substrings of length two, then length three, and so on, until you reach the full length of the text.
  3. For each substring you create, check if any character appears more than once within that substring.
  4. If a substring has no repeated characters, mark it as valid and increase your count.
  5. Continue this process until you have checked all possible substrings.
  6. The final count represents the total number of substrings without repeating characters.

Code Implementation

def count_substrings_without_repeating_characters(input_string):
    string_length = len(input_string)
    substring_count = 0

    for starting_index in range(string_length):
        for ending_index in range(starting_index, string_length):
            # Extract substring based on start/end indices
            current_substring = input_string[starting_index:ending_index + 1]

            # Use a set to efficiently check for repeats.
            character_set = set()

            has_repeating_characters = False
            for character in current_substring:
                if character in character_set:
                    has_repeating_characters = True
                    break

                # Add the character to the set.
                character_set.add(character)

            # Only count substrings with no repeating chars
            if not has_repeating_characters:
                substring_count += 1

    return substring_count

Big(O) Analysis

Time Complexity
O(n³)The algorithm iterates through all possible substrings of the given text. The outer loop iterates from substring length 1 to n (text length). The inner loop iterates through all possible starting positions for a substring of that length, which is O(n) in the worst case. For each substring, the algorithm checks for repeating characters, which takes O(n) time in the worst case. Therefore, the overall time complexity is O(n * n * n) = O(n³).
Space Complexity
O(N)The brute force approach described requires checking each substring for repeating characters. To perform this check efficiently, a temporary data structure, such as a set or a hash map, could be used to store the characters in the current substring. In the worst-case scenario, the current substring being examined can be as long as the input string itself, requiring storage for up to N characters. Thus, the auxiliary space used grows linearly with the input size N, where N is the length of the input string. This gives a space complexity of O(N).

Optimal Solution

Approach

To efficiently count substrings without repeating characters, we employ a sliding window technique. This approach avoids redundant checks by expanding and contracting a window across the string, keeping track of characters within the window. We use this window to count valid substrings as we slide it across the entire input.

Here's how the algorithm would work step-by-step:

  1. Imagine a window that can grow and shrink as it moves across the text.
  2. Start with the window at the very beginning of the text.
  3. As the window moves, check to see if the new character is already inside the window.
  4. If the new character is unique, expand the window to include it, and count the number of substrings in the expanded window.
  5. If you find a repeated character, shrink the window from the beginning until that repeating character is no longer inside. Then expand the window again including the repeating character.
  6. Every time you shrink or expand the window, be sure to accurately count the number of unique substrings this change creates or eliminates.
  7. Keep sliding the window, expanding and shrinking as needed, until you've reached the end of the text. The accumulated count is the total of valid substrings.

Code Implementation

def count_substrings_without_repeating_characters(input_string):
    string_length = len(input_string)
    start_window = 0
    end_window = 0
    character_set = set()
    substring_count = 0

    while end_window < string_length:
        current_character = input_string[end_window]

        # If repeat char found, shrink window
        if current_character in character_set:
            while current_character in character_set:
                character_set.remove(input_string[start_window])
                start_window += 1

        character_set.add(current_character)

        # Add to substring count
        substring_count += end_window - start_window + 1
        end_window += 1

    return substring_count

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string of length n using a sliding window. While the outer loop progresses linearly, the inner while loop (for shrinking the window) might appear to introduce nested iteration. However, each character is added to and removed from the window at most once. Therefore, the shrinking process, in aggregate, takes O(n) time, making the overall time complexity linear with respect to the input size.
Space Complexity
O(1)The sliding window approach described uses a fixed-size data structure, typically a hash map or set, to keep track of characters within the current window. The size of this data structure is bounded by the number of unique characters possible in the input string, which is a constant (e.g., 26 for lowercase English letters, 128 for ASCII). Therefore, the auxiliary space used remains constant regardless of the input string length N, resulting in O(1) space complexity.

Edge Cases

Null or empty input string
How to Handle:
Return 0 immediately as there are no substrings.
Input string with a single character
How to Handle:
Return 1, as the single character is a valid substring.
Input string with all identical characters (e.g., 'aaaa')
How to Handle:
The solution should correctly identify that no substring of length greater than 1 satisfies the condition, and handle single characters correctly.
Input string with maximum possible length (consider memory constraints)
How to Handle:
The solution needs to be efficient enough to avoid exceeding time or memory limits for very long strings, possibly using a sliding window approach with constant space.
Input string with long sequence of unique characters followed by a single repeating character
How to Handle:
The sliding window will expand far then shrink significantly, test the edge of this expansion/contraction.
Input string containing unicode or extended ASCII characters
How to Handle:
Ensure character handling logic correctly supports wider character sets without errors.
Input String with large number of valid substrings
How to Handle:
Check that the return count does not overflow, given the maximum string length.
String with two repeating character 'abcb'
How to Handle:
Sliding window should correctly handle the transition from 'abc' to 'bc', ensuring that all substrings are counted only once