Given a string s
and an integer k
, return the length of the longest substring of s
such that the frequency of each character in this substring is greater than or equal to k
. If no such substring exists, return 0.
For example:
s = "aaabb"
and k = 3
, the output should be 3 because the longest substring is "aaa", where 'a' is repeated 3 times.s = "ababbc"
and k = 2
, the output should be 5 because the longest substring is "ababb", where 'a' is repeated 2 times and 'b' is repeated 3 times.Write a function to efficiently solve this problem. Consider edge cases and optimize for both time and space complexity. The length of s
will be between 1 and 10,000, and k
will be between 1 and 100,000. The string s
consists of only lowercase English letters.
The most straightforward approach is to generate all possible substrings of the given string s
and then, for each substring, check if the frequency of each character in the substring is greater than or equal to k
. The longest substring that satisfies this condition is the answer.
s
.k
, the substring is invalid.k
, update the maximum length.def longestSubstringNaive(s: str, k: int) -> int:
n = len(s)
max_len = 0
for i in range(n):
for j in range(i, n):
sub = s[i:j+1]
freq = {}
for char in sub:
freq[char] = freq.get(char, 0) + 1
valid = True
for char in freq:
if freq[char] < k:
valid = False
break
if valid:
max_len = max(max_len, len(sub))
return max_len
The time complexity is O(N^3) where N is the length of the string s
. Generating all substrings takes O(N^2) time, and checking the frequency of characters in each substring takes O(N) time in the worst case.
The space complexity is O(N), where N is the length of the string s
, due to the space used by the frequency map.
The optimal solution employs a divide and conquer strategy. The idea is to find a character in the string whose frequency is less than k
. If such a character exists, it means that this character cannot be part of any valid substring. Therefore, we can split the string into substrings based on this character and recursively solve the problem for each substring.
s
.c
whose frequency is less than k
.s
.c
exists, split the string s
into substrings using c
as a delimiter.def longestSubstring(s: str, k: int) -> int:
n = len(s)
if n == 0:
return 0
freq = {}
for char in s:
freq[char] = freq.get(char, 0) + 1
for char in freq:
if freq[char] < k:
max_len = 0
for sub in s.split(char):
max_len = max(max_len, longestSubstring(sub, k))
return max_len
return n
The time complexity is O(N), where N is the length of the string s
. In the worst case, the string may be split into many substrings, but each character is only visited a limited number of times, keeping the overall complexity linear. Splitting the string contributes a factor, but it doesn't alter the overall linear time complexity.
The space complexity is O(N), where N is the length of the string s
, due to the recursion depth. In the worst-case scenario, the recursion depth can be proportional to the length of the string if the string is split into very small substrings recursively. Additionally, the frequency map takes O(1) space since there are only 26 lowercase English letters, so it's a constant factor. The space usage for s.split
creates new strings but are short-lived and don't affect the overall asymptotic space complexity.
s
is empty, the function should return 0.k
is greater than the length of s
, and if any character exists, the result will be 0 as any character will have a length less than k
.k
times, the whole string s
is a valid substring, so the function should return the length of s
.