Taro Logo

Largest Substring Between Two Equal Characters

Easy
Asked by:
Profile picture
8 views
Topics:
StringsArrays

Given a string s, return the length of the longest substring between two equal characters, excluding the two characters. If there is no such substring return -1.

A substring is a contiguous sequence of characters within a string.

Example 1:

Input: s = "aa"
Output: 0
Explanation: The optimal substring here is an empty substring between the two 'a's.

Example 2:

Input: s = "abca"
Output: 2
Explanation: The optimal substring here is "bc".

Example 3:

Input: s = "cbzxy"
Output: -1
Explanation: There are no characters that appear twice in s.

Constraints:

  • 1 <= s.length <= 300
  • s contains only lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What should I return if the input string is empty or null?
  2. If the same character appears multiple times, should I consider only the first and last occurrences, or the two occurrences that yield the largest substring?
  3. Is the input string guaranteed to contain at least one character, or could it be an empty string?
  4. What is the expected character set for the input string? Is it limited to ASCII lowercase letters, or are other characters possible?
  5. If there are no matching characters in the string, should I return -1 or 0?

Brute Force Solution

Approach

The brute force approach means we'll examine every possible substring of the given string. For each substring, we will check if it meets the criteria to be the largest substring between two identical characters. We will keep track of the largest one found so far and eventually return that.

Here's how the algorithm would work step-by-step:

  1. Look at every possible substring within the whole string.
  2. For each substring, check if the first character of the string is the same as the last character.
  3. If the first and last character match, find the length of the substring between those identical characters.
  4. If they don't match, move on to the next substring.
  5. Compare the length of the current substring to the length of the longest substring found so far.
  6. If the current substring is longer, remember it as the longest substring.
  7. After checking all possible substrings, return the longest one that you found.

Code Implementation

def longest_substring_between_equal_characters_brute_force(input_string):
    longest_substring_length = -1

    for first_index in range(len(input_string)):
        for second_index in range(first_index + 1, len(input_string)):
            # Check for substrings between equal chars
            if input_string[first_index] == input_string[second_index]:

                current_substring_length = second_index - first_index - 1

                # Store largest substring length between
                if current_substring_length > longest_substring_length:

                    longest_substring_length = current_substring_length

    return longest_substring_length

Big(O) Analysis

Time Complexity
O(n³)The algorithm iterates through all possible substrings of the input string. Generating all substrings requires nested loops, where the outer loop iterates from the start of the string (0 to n-1) and the inner loop iterates to the end of the string (i to n-1) giving us O(n²). For each substring generated, it compares the first and last characters, which takes O(1) time, and then calculates the length of the substring if they match, taking another O(n) in the worst case. Thus, this approach has a time complexity of O(n² * n), which simplifies to O(n³).
Space Complexity
O(1)The provided brute force algorithm primarily iterates through substrings using index manipulation and directly compares character values. It stores only a single variable to track the length of the longest substring found so far. Regardless of the input string's length N, the algorithm uses a constant amount of extra space. Therefore, the space complexity is O(1).

Optimal Solution

Approach

To find the largest substring between equal characters quickly, we'll record where we first see each character. Then, as we go through the string, we check if we've seen the character before and, if so, calculate the length between its first and last appearance.

Here's how the algorithm would work step-by-step:

  1. Create a way to remember the first time we see each character in the string.
  2. Go through the string, character by character.
  3. For each character, check if we've seen it before using our memory.
  4. If it's the first time we're seeing it, remember its location.
  5. If we've seen it before, calculate the distance between its first location and its current location.
  6. Keep track of the biggest distance we've found between matching characters.
  7. After checking all characters, the biggest distance we recorded is the answer.

Code Implementation

def largest_substring_between_equal_characters(input_string):
    first_occurrence = {}
    maximum_length = -1

    for index, character in enumerate(input_string):
        # Store the first index at which char appears
        if character not in first_occurrence:
            first_occurrence[character] = index

        # Update max length if necessary
        else:
            substring_length = index - first_occurrence[character] - 1
            maximum_length = max(maximum_length, substring_length)

    # Return result after iterating through the input
    return maximum_length

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string of length n once to record the first occurrence of each character and then iterates through the string again to calculate the distance between the first and last occurrences. Both of these iterations are independent and proportional to the length of the string. Therefore, the overall time complexity is O(n).
Space Complexity
O(1)The algorithm utilizes a data structure to remember the first occurrence of each character in the string. In the worst case, this structure, typically a hash map or array, will store the index of each unique character. Since the number of unique characters is limited by the character set (e.g., 26 for lowercase English letters), the space used is constant, independent of the input string's length N. Therefore, the auxiliary space complexity is O(1).

Edge Cases

Null or empty input string
How to Handle:
Return -1 immediately as no substring can exist.
String with only one character
How to Handle:
Return -1 because a substring between two equal characters requires at least two characters.
String with two equal characters at the beginning and end (e.g., 'aa')
How to Handle:
The correct length is 0 as the substring between them is empty.
String with no repeating characters
How to Handle:
Return -1 as there's no valid substring.
String with all identical characters (e.g., 'aaaa')
How to Handle:
Return the length of the string minus 2, which would be the length between the first and last character.
String with a large number of characters to consider performance.
How to Handle:
The solution should use a map or array to store first occurrences for O(n) time complexity.
String containing unicode or special characters
How to Handle:
The solution should correctly handle extended ASCII or Unicode characters in the string.
String where the largest substring is at the beginning or end.
How to Handle:
Iterate through the string, keeping track of the max length substring so far, ensuring to catch substrings at both ends.