Taro Logo

Swap For Longest Repeated Character Substring

Medium
Nutanix logo
Nutanix
1 view
Topics:
ArraysStringsTwo PointersSliding Windows

You are given a string text. You can swap two of the characters in the text.

Return the length of the longest substring with repeated characters.

Example 1:

Input: text = "ababa"
Output: 3
Explanation: We can swap the first 'b' with the last 'a', or the last 'b' with the first 'a'. Then, the longest repeated character substring is "aaa" with length 3.

Example 2:

Input: text = "aaabaaa"
Output: 6
Explanation: Swap 'b' with the last 'a' (or the first 'a'), and we get longest repeated character substring "aaaaaa" with length 6.

Example 3:

Input: text = "aaaaa"
Output: 5
Explanation: No need to swap, longest repeated character substring is "aaaaa" with length is 5.

Constraints:

  • 1 <= text.length <= 2 * 104
  • text consist of lowercase English characters only.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string `s`? Is there a defined upper bound?
  2. Can the input string `s` be empty or null?
  3. If there are multiple repeated character substrings of the same maximum length after a swap, should I return the length of any one of them?
  4. Does the swap have to involve characters adjacent to the existing substring, or can I swap any two characters in the string?
  5. Is the input string guaranteed to only contain lowercase/uppercase alphabets, or can it contain other characters as well?

Brute Force Solution

Approach

The brute force strategy aims to find the longest possible sequence of the same character by trying every possible swap. It explores every single position in the string and sees what happens when that character is swapped with another one to find the longest sequence.

Here's how the algorithm would work step-by-step:

  1. Look at the first character in the string.
  2. Consider swapping this character with every other character in the string, one at a time.
  3. After each swap, check how long the longest sequence of identical characters is.
  4. Remember the longest sequence you have found so far.
  5. Now, move to the next character in the string and repeat the swapping and checking process.
  6. Keep doing this for every character in the string, always looking for a potential swap that creates an even longer sequence of the same character.
  7. At the end, the longest sequence that was recorded during all the swapping and checking is the answer.

Code Implementation

def swap_for_longest_repeated_character_substring(input_string):
    longest_length = 1
    string_length = len(input_string)

    for first_index in range(string_length):
        for second_index in range(string_length):
            # Create a copy to avoid modifying the original string
            modified_string = list(input_string)

            #Perform the swap
            modified_string[first_index], modified_string[second_index] = \
                modified_string[second_index], modified_string[first_index]

            # Find the longest repeating character substring
            current_length = 1
            max_length_after_swap = 1

            for index in range(1, string_length):
                if modified_string[index] == modified_string[index - 1]:
                    current_length += 1
                    max_length_after_swap = max(max_length_after_swap, current_length)
                else:
                    current_length = 1

            # Update the overall longest length
            longest_length = max(longest_length, max_length_after_swap)

    return longest_length

Big(O) Analysis

Time Complexity
O(n³)The outer loop iterates through each of the n characters in the string. For each of these n characters, the inner loop swaps it with every other character in the string, resulting in n swaps. After each swap, the algorithm scans the entire string of length n to find the longest sequence of repeated characters. Thus, for each of the n characters, we perform n swaps and for each swap, we perform an O(n) operation, leading to a time complexity of O(n * n * n), which simplifies to O(n³).
Space Complexity
O(1)The described brute force strategy primarily uses a few variables for indexing and tracking the longest sequence length. It doesn't create any auxiliary data structures like arrays, lists, or hash maps that scale with the input string size N. Therefore, the extra space used remains constant regardless of the input string length, resulting in O(1) space complexity.

Optimal Solution

Approach

To find the longest streak of a single character after making just one swap, we need to efficiently check how swapping letters changes things. The key idea is to focus on counting streaks of the same character and seeing how a single swap could potentially lengthen one of those streaks.

Here's how the algorithm would work step-by-step:

  1. First, count how many times each letter appears in the word.
  2. Now, go through the word and look for streaks of the same letter.
  3. For each streak, imagine swapping a different letter into its place. Check if that new letter is somewhere else in the word.
  4. If you find that letter, see how many more of that letter you can add to the streak by swapping in the letter. It might connect to another streak of the same letter that's nearby!
  5. Keep track of the longest streak you find by doing these swaps.
  6. Make sure that you don't count more of a specific letter than actually exists in the original word.
  7. The largest streak you find is the answer.

Code Implementation

def find_longest_repeated_character_substring(word):
    word_length = len(word)
    if word_length <= 1:
        return word_length

    character_counts = {}
    for character in word:
        character_counts[character] = character_counts.get(character, 0) + 1

    longest_streak = 1

    for i in range(word_length):
        current_character = word[i]
        current_streak = 1
        
        # Count consecutive occurrences of the current character
        while i + 1 < word_length and word[i + 1] == current_character:
            current_streak += 1
            i += 1

        for potential_swap_character in character_counts:
            if potential_swap_character == current_character:
                continue

            remaining_potential_swap_characters = character_counts[potential_swap_character]

            # Calculate the maximum possible streak after a swap
            maximum_possible_streak = current_streak

            for j in range(word_length):
                if word[j] == potential_swap_character:
                    maximum_possible_streak += 1
                    break

            #Find the next streak to connect to
            right = i + 1
            while right < word_length and word[right] == current_character:
                maximum_possible_streak += 1
                right += 1
            
            # We must never exceed total count for a specific character
            longest_streak = max(longest_streak, min(maximum_possible_streak, character_counts[current_character] + character_counts[potential_swap_character]))

    return longest_streak

Big(O) Analysis

Time Complexity
O(n²)The algorithm iterates through the input string of length n to identify streaks of characters (outer loop). For each streak, it considers swapping every other possible character into that position. In the worst case, for each position in the string, it may need to scan the string again to find matching characters to potentially extend the streak. This nested iteration leads to a time complexity of approximately n * n. Thus, the overall time complexity is O(n²).
Space Complexity
O(1)The algorithm primarily uses a fixed number of variables to store counts and track the longest streak found so far. Step 1 mentions counting letter occurrences, which can be done using a fixed-size array or hash map (at most 26 elements for lowercase English letters) regardless of the input string length N. There are no data structures whose memory usage scales with the input size N, thus the auxiliary space remains constant. The space complexity is therefore O(1).

Edge Cases

CaseHow to Handle
Null or empty input stringReturn 0 since there's no substring to analyze and swapping is impossible.
String with only one characterReturn 1, as the entire string is already a repeated character substring.
String where all characters are identical (e.g., 'aaaa')Return the length of the string because no swap is needed and the whole string is the longest repeated character substring.
String with no possible swap to increase the substring length (e.g., 'abc')Find the longest substring of a single character.
Very long string (potential for performance issues)Ensure the solution uses an algorithm with reasonable time complexity (e.g., O(n) or O(n log n)) to avoid timeouts.
String with multiple potential swap locations yielding same maximal lengthThe solution should correctly identify at least one of those maximal lengths (it doesn't need to find all such positions).
String where swap creates a longer substring elsewhereThe algorithm must consider swapping a character to extend an existing substring, even if the swapped character wasn't originally adjacent to it.
String with mixed case charactersEnsure that the comparison between characters is case-sensitive or case-insensitive based on the problem constraints (clarify with interviewer if unspecified).