Taro Logo

Substrings of Size Three with Distinct Characters

Easy
Quora logo
Quora
0 views
Topics:
StringsSliding WindowsArraysTwo Pointers

A string is good if there are no repeated characters.

Given a string s​​​​​, return the number of good substrings of length three in s​​​​​​.

Note that if there are multiple occurrences of the same substring, every occurrence should be counted.

A substring is a contiguous sequence of characters in a string.

Example 1:

Input: s = "xyzzaz"
Output: 1
Explanation: There are 4 substrings of size 3: "xyz", "yzz", "zza", and "zaz". 
The only good substring of length 3 is "xyz".

Example 2:

Input: s = "aababcabc"
Output: 4
Explanation: There are 7 substrings of size 3: "aab", "aba", "bab", "abc", "bca", "cab", and "abc".
The good substrings are "abc", "bca", "cab", and "abc".

Constraints:

  • 1 <= s.length <= 100
  • s​​​​​​ consists of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string 's'?
  2. Can the input string 's' contain characters other than lowercase English letters?
  3. Is the empty string a valid input, and if so, what should the return value be?
  4. Are we only looking for contiguous substrings of length 3, or can they be non-contiguous?
  5. If the string length is less than 3, what should be the return value?

Brute Force Solution

Approach

The brute force strategy for this problem is quite simple. We will look at every single possible group of three characters in the given string and check if that group meets the requirements.

Here's how the algorithm would work step-by-step:

  1. Start with the very first three characters of the string.
  2. Check if all three characters in this group are different from each other.
  3. If they are all different, then we have found one valid group.
  4. Move to the next group of three characters by shifting one position to the right.
  5. Check again if all the characters in this new group are different.
  6. Keep repeating this process, sliding the group of three characters one position at a time, until we reach the end of the string.
  7. Count how many groups of three had all different characters. This count is the answer.

Code Implementation

def count_good_substrings(input_string):
    string_length = len(input_string)
    good_substring_count = 0

    # Iterate through the string, checking substrings of length 3
    for index in range(string_length - 2):
        substring = input_string[index:index + 3]

        # Check if the substring has distinct characters

        if (substring[0] != substring[1] and \
                substring[0] != substring[2] and \
                substring[1] != substring[2]):
            good_substring_count += 1

    return good_substring_count

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string of length n, examining each substring of length 3. For each substring, it checks if the characters are distinct, which takes constant time O(1). The number of substrings to check is proportional to n. Therefore, the overall time complexity is O(n).
Space Complexity
O(1)The provided algorithm iterates through substrings of length three, checking for distinct characters. It doesn't use any auxiliary data structures like arrays, hash maps, or sets to store intermediate results or track visited substrings. It only uses a constant number of variables to keep track of the current substring being examined and the count of valid substrings. Therefore, the space complexity remains constant regardless of the input string's length (N).

Optimal Solution

Approach

Instead of checking every possible three-letter chunk, we can slide a window of size three across the input. We only need to check each window to see if it contains unique characters; if it does, we count it.

Here's how the algorithm would work step-by-step:

  1. Imagine a sliding window, like a frame, that is exactly three letters wide.
  2. Place this window at the very beginning of the given text.
  3. Check if all the letters currently inside the window are different from each other.
  4. If the letters are all different, count this window as a valid substring.
  5. Now, slide the window one letter to the right.
  6. Repeat the process of checking for distinct characters within the window and counting if necessary.
  7. Keep sliding the window and checking until the end of the text is reached, making sure the window doesn't go beyond the text's boundary.
  8. The total count represents how many three-letter substrings with distinct characters are in the text.

Code Implementation

def count_good_substrings(word):
    substring_count = 0
    string_length = len(word)

    # Iterate through the string using a sliding window of size 3
    for i in range(string_length - 2): 
        substring = word[i:i+3]

        # Check if the characters in the substring are distinct
        if len(set(substring)) == 3:
            # Distinct characters found
            substring_count += 1

    return substring_count

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string using a sliding window of size 3. For each window, it checks if the characters within the window are distinct. The sliding window moves from left to right, advancing one character at a time. The number of windows is directly proportional to the length of the string (n). The check for distinct characters within the window is done on a fixed number of characters (3), which is a constant time operation. Therefore, the overall time complexity is O(n).
Space Complexity
O(1)The provided solution uses a sliding window of size three. It only needs to store a constant number of variables, such as counters or flags, to check for distinct characters within the window. The space required for these variables does not depend on the input string's length, denoted as N. Therefore, the auxiliary space remains constant regardless of the input size, resulting in a space complexity of O(1).

Edge Cases

CaseHow to Handle
Null or empty input stringReturn 0 immediately as there are no substrings to evaluate.
Input string with length less than 3Return 0 as a substring of length 3 cannot be formed.
Input string with all identical characters (e.g., 'AAA')The sliding window will never contain distinct characters; the count will remain 0.
Input string with all distinct characters (e.g., 'ABCDEF')Every substring of length 3 will be distinct; the count will equal len(s)-2.
Input string with a mix of distinct and duplicate characters (e.g., 'AABCA')The sliding window correctly identifies and counts only the distinct substrings.
Long input string (scalability considerations)The sliding window approach provides O(n) time complexity, scaling linearly with input size.
Input string with unicode charactersThe solution should handle unicode characters correctly as long as the comparison and character counting mechanism supports them.
String containing only non-alphanumeric charactersThe character uniqueness check should work the same regardless of whether the characters are alphanumeric or special characters.