Given a string s
consisting only of characters a, b and c.
Return the number of substrings containing at least one occurrence of all these characters a, b and c.
For example:
s = "abcabc"
should return 10
. The substrings are "abc", "abca", "abcab", "abcabc", "bca", "bcab", "bcabc", "cab", "cabc" and "abc".s = "aaacb"
should return 3
. The substrings are "aaacb", "aacb" and "acb".s = "abc"
should return 1
.Can you come up with an efficient solution?
A brute force solution would involve generating all possible substrings of the given string s
and then checking each substring to see if it contains at least one occurrence of each of the characters 'a', 'b', and 'c'.
def count_substrings_naive(s):
n = len(s)
count = 0
for i in range(n):
for j in range(i, n):
sub = s[i:j+1]
if 'a' in sub and 'b' in sub and 'c' in sub:
count += 1
return count
The outer loops iterate n
times each, resulting in O(n^2). The check for 'a', 'b', and 'c' within the inner loop takes O(n) in the worst case, giving a total complexity of O(n^3).
The space complexity is O(1) as we are only using a constant amount of extra space.
A more efficient approach utilizes the sliding window technique.
left
and right
, to 0.right
pointer until the window [left, right]
contains at least one 'a', 'b', and 'c'.n - right
. This is because all substrings starting from left
and ending at or after right
will also be valid.left
pointer to the right until the window is no longer valid (i.e., it doesn't contain at least one 'a', 'b', and 'c').right
pointer reaches the end of the string.def count_substrings(s):
n = len(s)
count = 0
left = 0
freq = {'a': 0, 'b': 0, 'c': 0}
required = 3
formed = 0
for right in range(n):
char = s[right]
freq[char] += 1
if freq[char] == 1:
formed += 1
while formed == required:
count += n - right
char = s[left]
freq[char] -= 1
if freq[char] == 0:
formed -= 1
left += 1
return count
The right
pointer iterates through the string once. The left
pointer may iterate multiple times, but in total, it will not iterate more than n
times. Therefore, the time complexity is O(n).
We use a dictionary freq
to store the frequencies of the characters 'a', 'b', and 'c'. The size of this dictionary is constant regardless of the size of the input string. Thus, the space complexity is O(1).