Taro Logo

Longest Repeating Character Replacement

Medium
Microsoft logo
Microsoft
3 views
Topics:
StringsSliding Windows

You are given a string s and an integer k. You can choose any character of the string and change it to any other uppercase English character. You can perform this operation at most k times.

Return the length of the longest substring containing the same letter you can get after performing the above operations.

Example 1:

Input: s = "ABAB", k = 2
Output: 4
Explanation: Replace the two 'A's with two 'B's or vice versa.

Example 2:

Input: s = "AABABBA", k = 1
Output: 4
Explanation: Replace the one 'A' in the middle with 'B' and form "AABBBBA".
The substring "BBBB" has the longest repeating letters, which is 4.
There may exists other ways to achieve this answer too.

Constraints:

  • 1 <= s.length <= 105
  • s consists of only uppercase English letters.
  • 0 <= k <= s.length

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string `s` and the maximum value for `k`?
  2. Is the input string `s` guaranteed to contain only uppercase English letters, or could it contain other characters?
  3. If the string `s` is empty, what should I return?
  4. If `k` is zero, and there are no repeating characters, what is the expected output? For example, in 'ABC', should it be 1?
  5. If there are multiple substrings with the same maximum length after replacements, can I return the length of any of them?

Brute Force Solution

Approach

We want to find the longest section of a string that has at most a certain number of different characters if we're allowed to change some characters. The brute force way is to check every possible section. We try every section length and starting position until we find the longest one that works.

Here's how the algorithm would work step-by-step:

  1. Consider every possible section of the string, starting with sections of length one.
  2. For each section, count how many changes you'd need to make so all the characters are the same.
  3. If the number of changes needed is less than or equal to the allowed number of changes, remember the length of that section.
  4. Keep track of the longest section you've found so far that meets the change requirement.
  5. Continue checking sections of increasing lengths, one by one.
  6. Once you've looked at all possible sections and their lengths, the longest one you remembered is the answer.

Code Implementation

def longest_repeating_character_replacement_brute_force(input_string, allowed_changes):
    longest_section_length = 0
    string_length = len(input_string)

    for section_length in range(1, string_length + 1):
        for start_index in range(string_length - section_length + 1):
            end_index = start_index + section_length
            substring = input_string[start_index:end_index]

            # Determine the character that appears most in the substring
            most_frequent_character_count = 0
            for char in set(substring):
                most_frequent_character_count = max(
                    most_frequent_character_count,
                    substring.count(char)
                )

            # Checks if changes are within the allowed limit
            changes_needed = section_length - most_frequent_character_count

            if changes_needed <= allowed_changes:
                # Update longest_section_length
                longest_section_length = max(longest_section_length, section_length)

    return longest_section_length

Big(O) Analysis

Time Complexity
O(n³)The algorithm iterates through all possible substrings of the input string of length n. There are approximately n² possible substrings (n choices for the start and then up to n choices for the end). For each substring, it counts the frequency of each character which takes O(n) time in the worst case (to iterate through the substring and update character counts). Therefore, the overall time complexity is O(n² * n), which simplifies to O(n³).
Space Complexity
O(1)The provided brute force solution iterates through all possible substrings, but it does not explicitly create any auxiliary data structures that scale with the input string's length (N). It keeps track of the longest valid substring length found so far and the number of changes needed for the current substring, both of which require constant space. Therefore, the auxiliary space complexity is O(1).

Optimal Solution

Approach

The key to solving this efficiently is to avoid redundant calculations. We'll use a sliding window to keep track of a range of characters and dynamically adjust its size based on the allowed replacements, ensuring we find the longest valid substring.

Here's how the algorithm would work step-by-step:

  1. Imagine a window that can slide across the string, looking at different sections.
  2. Expand the window to the right, including more characters, as long as the number of characters that need to be replaced to make the section all the same doesn't exceed our replacement limit.
  3. Keep track of the most frequent character within the current window. This helps us quickly determine how many replacements we need.
  4. If the number of replacements needed exceeds the limit, shrink the window from the left, removing characters until it becomes valid again.
  5. While sliding and adjusting the window, always remember the length of the longest valid window encountered so far.
  6. After going through the whole string, the longest valid window length will be your answer.

Code Implementation

def character_replacement(
    input_string: str, allowed_replacements: int
) -> int:
    window_start = 0
    max_length = 0
    most_frequent_character_count = 0
    character_frequency = {}

    for window_end in range(len(input_string)):
        right_character = input_string[window_end]

        if right_character not in character_frequency:
            character_frequency[right_character] = 0
        character_frequency[right_character] += 1

        most_frequent_character_count = max(
            most_frequent_character_count,
            character_frequency[right_character],
        )

        # Shrink the window if replacements exceed the limit.
        if (
            window_end - window_start + 1
            - most_frequent_character_count
            > allowed_replacements
        ):

            # Reduce the frequency of the outgoing character.
            left_character = input_string[window_start]
            character_frequency[left_character] -= 1

            # Move the window start forward.
            window_start += 1

        # Track the maximum window size.
        max_length = max(max_length, window_end - window_start + 1)

    return max_length

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string using a sliding window approach, where 'n' is the length of the input string. The window expands and contracts, but each character is visited at most a constant number of times (once by the right pointer and potentially once by the left pointer). The character count updates within the window take constant time. Therefore, the overall time complexity is proportional to the length of the string, resulting in O(n).
Space Complexity
O(1)The algorithm described primarily uses a sliding window approach and keeps track of the most frequent character within that window. It requires a fixed-size character frequency count, typically an array or hash map, to determine the most frequent character. Since the character set is limited (e.g., 26 for uppercase English letters), the space used for this frequency count is constant and independent of the input string's length, N. Therefore, the auxiliary space complexity is O(1).

Edge Cases

CaseHow to Handle
Empty string sReturn 0 if the string is empty, as there's no substring to consider.
k = 0 and string s has repeating charactersReturn the length of the longest substring of repeating characters directly without any replacements.
k is greater than or equal to the length of sReturn the length of s, since you can replace all characters to have a uniform string.
String 's' contains only one unique characterReturn the length of s, as no replacements are necessary.
String 's' length is large (approaching maximum string length allowed by the language)Ensure the sliding window approach is efficient in terms of time and space complexity to prevent timeouts or memory errors.
k is a large numberThe algorithm should still work correctly without integer overflow or performance degradation.
String 's' contains a mix of almost equal occurrences of several distinct charactersThe sliding window needs to dynamically adjust size based on 'k' and character frequencies.
String with only two distinct characters and a large value of kThe solution should still be able to find the longest substring consisting of one character after replacements.