Taro Logo

Find Longest Special Substring That Occurs Thrice I

Medium
Google logo
Google
Topics:
Strings

You are given a string s that consists of lowercase English letters.

A string is called special if it is made up of only a single character. For example, the string "abc" is not special, whereas the strings "ddd", "zz", and "f" are special.

Return the length of the longest special substring of s which occurs at least thrice, or -1 if no special substring occurs at least thrice.

A substring is a contiguous non-empty sequence of characters within a string.

Example 1:

Input: s = "aaaa"
Output: 2
Explanation: The longest special substring which occurs thrice is "aa": substrings "**aa**aa", "a**aa**a", and "aa**aa**".
It can be shown that the maximum length achievable is 2.

Example 2:

Input: s = "abcdef"
Output: -1
Explanation: There exists no special substring which occurs at least thrice. Hence return -1.

Example 3:

Input: s = "abcaba"
Output: 1
Explanation: The longest special substring which occurs thrice is "a": substrings "**a**bcaba", "abc**a**ba", and "abcab**a**".
It can be shown that the maximum length achievable is 1.

Constraints:

  • 3 <= s.length <= 50
  • s consists of only lowercase English letters.

Solution


Naive Approach

The most straightforward approach is to iterate through all possible substring lengths and starting positions. For each substring, check if it's special (i.e., made up of a single character) and count its occurrences in the string. If a special substring occurs at least three times, we update our maximum length. If we don't find a substring that satisfies these conditions, we return -1.

Algorithm:

  1. Initialize max_len to -1.
  2. Iterate through all possible lengths length from 1 to len(s).
  3. Iterate through all possible characters char ('a' to 'z').
  4. Create a special substring sub of length length using the character char.
  5. Count the occurrences of sub in s.
  6. If the count is greater than or equal to 3, update max_len with length.
  7. Return max_len.

Python Code:

def longest_special_substring_naive(s):
    max_len = -1
    n = len(s)
    for length in range(1, n + 1):
        for char_code in range(ord('a'), ord('z') + 1):
            char = chr(char_code)
            sub = char * length
            count = 0
            for i in range(n - length + 1):
                if s[i:i+length] == sub:
                    count += 1
            if count >= 3:
                max_len = max(max_len, length)
    return max_len

Time Complexity:

O(n3). The outer loops iterate up to n and 26 respectively. The inner loop s[i:i+length] takes O(n) since we are checking every starting location in s.

Space Complexity:

O(1). We are using a constant amount of extra space.

Optimal Approach

We can optimize by recognizing that once we find a special substring of a certain length occurring thrice, we don't need to check shorter lengths for the same character, as shorter substrings will inherently occur more frequently if the longer one does. We also can avoid creating substrings and instead just checking char by char to see how long of a special string we can build.

Algorithm:

  1. Initialize max_len to -1.
  2. Iterate through the string s.
  3. For each character s[i], find the length of the consecutive sequence of the same character starting from s[i].
  4. Check if a substring of length l made up of s[i] occurs at least thrice in s.
  5. If yes, update max_len with l.
  6. Return max_len

Edge Cases:

  • Empty String (should not occur based on constraints, but handle if needed).
  • String with length less than 3 (return -1 if no special substring occurs thrice).
  • String with no repeating characters (return -1).

Python Code:

def longest_special_substring(s):
    n = len(s)
    max_len = -1

    for i in range(n):
        j = i
        while j < n and s[i] == s[j]:
            j += 1
        length = j - i
        
        count = 0
        for k in range(n - length + 1):
            if s[k:k+length] == s[i] * length:
                count += 1

        if count >= 3:
            max_len = max(max_len, length)
        
        i = j - 1 # Skip already processed characters

    return max_len

Time Complexity:

O(n2) - The nested loops iterate through the string to find consecutive characters and count substring occurrences. The outer loop runs at most n times. The inner loop determining length is O(n) in the worst case (e.g., "aaaaaa"). The count occurrence loop is O(n) so it simplifies down to O(n2).

Space Complexity:

O(1). We are using a constant amount of extra space.