Taro Logo

Longest Substring with At Least K Repeating Characters

Medium
4 views
2 months ago

Given a string s and an integer k, return the length of the longest substring of s such that the frequency of each character in this substring is greater than or equal to k.

if no such substring exists, return 0.

Example 1:

Input: s = "aaabb", k = 3
Output: 3
Explanation: The longest substring is "aaa", as 'a' is repeated 3 times.

Example 2:

Input: s = "ababbc", k = 2
Output: 5
Explanation: The longest substring is "ababb", as 'a' is repeated 2 times and 'b' is repeated 3 times.

Constraints:

  • 1 <= s.length <= 10^4
  • s consists of only lowercase English letters.
  • 1 <= k <= 10^5
Sample Answer
## Longest Substring with at Least K Repeating Characters

### Problem Description

Given a string `s` and an integer `k`, find the length of the longest substring of `s` such that the frequency of each character in this substring is greater than or equal to `k`. If no such substring exists, return 0.

**Example 1:**

Input: s = "aaabb", k = 3 Output: 3 Explanation: The longest substring is "aaa", as 'a' is repeated 3 times.


**Example 2:**

Input: s = "ababbc", k = 2 Output: 5 Explanation: The longest substring is "ababb", as 'a' is repeated 2 times and 'b' is repeated 3 times.


### Brute Force Solution

The most straightforward approach is to generate all possible substrings of the given string and check if each substring satisfies the condition that each character appears at least `k` times. This approach is inefficient but helps to understand the problem better.

```python
def longest_substring_brute_force(s: str, k: int) -> int:
    n = len(s)
    max_len = 0
    
    for i in range(n):
        for j in range(i, n):
            sub = s[i:j+1]
            counts = {}
            valid = True
            
            for char in sub:
                counts[char] = counts.get(char, 0) + 1
            
            for char in counts:
                if counts[char] < k:
                    valid = False
                    break
            
            if valid:
                max_len = max(max_len, len(sub))
                
    return max_len

Optimal Solution (Divide and Conquer)

The optimal solution uses a divide and conquer approach. The main idea is to find characters in the string that appear less than k times. If such characters exist, they cannot be part of any valid substring. Therefore, we can split the string at these characters and recursively solve the problem for the substrings.

def longest_substring(s: str, k: int) -> int:
    n = len(s)
    if k <= 1:
        return n
    
    counts = {}
    for char in s:
        counts[char] = counts.get(char, 0) + 1
    
    for i in range(n):
        if counts[s[i]] < k:
            left = longest_substring(s[:i], k)
            right = longest_substring(s[i+1:], k)
            return max(left, right)
    
    return n

Big(O) Run-time Analysis

The optimal solution uses a divide and conquer strategy. Let's analyze the time complexity:

  • Frequency Counting: The code first iterates through the string s to count the frequency of each character. This takes O(n) time, where n is the length of the string.
  • Divide and Conquer: In the worst-case scenario, the string is divided into single-character substrings, and the function is called recursively for each substring. However, each character is only checked once to see if it occurs less than k times, so this division process takes O(n) in total.
  • Recursion: The recursion depth is limited by the number of distinct characters in the string. In the worst case, the recursion depth can be O(n). However, on average, it would be smaller since we split on characters that appear less than k times.

Therefore, the overall time complexity is O(n), where n is the length of the string, as each character is visited at most a constant number of times.

Big(O) Space Usage Analysis

Let's analyze the space complexity of the optimal solution:

  • Frequency Counting: The counts dictionary stores the frequency of each character in the string. In the worst case, all characters are distinct, so the space required would be O(1), given there can only be 26 lowercase English letters.
  • Recursion: The recursive calls create a call stack. In the worst-case scenario, where the string is divided into many small substrings, the maximum depth of the call stack can be O(n). However, since the input string s gets smaller at each recursive call, the maximum memory used is at most the input's length.

Therefore, the overall space complexity is O(n) due to the recursion depth in the worst-case scenario.

Edge Cases

  1. Empty String: If the input string s is empty, the function should return 0 because there are no substrings.
  2. k = 1: If k is 1, any character will satisfy the condition, so the function should return the length of the entire string.
  3. String with characters all >= k: If all characters in the string appear at least k times, return length of string
  4. k > length of string: If k is larger than string, then return 0, because there is no way any character will appear k times.

Here is the code that handles the first two edge cases as well:

def longest_substring(s: str, k: int) -> int:
    n = len(s)
    if k <= 1:
        return n
    
    counts = {}
    for char in s:
        counts[char] = counts.get(char, 0) + 1
    
    for i in range(n):
        if counts[s[i]] < k:
            left = longest_substring(s[:i], k)
            right = longest_substring(s[i+1:], k)
            return max(left, right)
    
    return n