Taro Logo

Remove All Adjacent Duplicates in String II

Medium
Grammarly logo
Grammarly
1 view
Topics:
ArraysStacks

You are given a string s and an integer k, a k duplicate removal consists of choosing k adjacent and equal letters from s and removing them, causing the left and the right side of the deleted substring to concatenate together.

We repeatedly make k duplicate removals on s until we no longer can.

Return the final string after all such duplicate removals have been made. It is guaranteed that the answer is unique.

Example 1:

Input: s = "abcd", k = 2
Output: "abcd"
Explanation: There's nothing to delete.

Example 2:

Input: s = "deeedbbcccbdaa", k = 3
Output: "aa"
Explanation: 
First delete "eee" and "ccc", get "ddbbbdaa"
Then delete "bbb", get "dddaa"
Finally delete "ddd", get "aa"

Example 3:

Input: s = "pbbcggttciiippooaais", k = 2
Output: "ps"

Constraints:

  • 1 <= s.length <= 105
  • 2 <= k <= 104
  • s only contains lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of values for `k`, and what is the maximum length of the string `s`?
  2. Is the input string `s` guaranteed to consist of only lowercase English letters, or can it contain other characters?
  3. If the string becomes empty after removing duplicates, should I return an empty string, or is there any other expected output?
  4. If multiple removals are possible at the same time, does the order of removal matter?
  5. Is `k` always guaranteed to be greater than or equal to 1?

Brute Force Solution

Approach

The brute force approach to removing duplicate characters checks every possible substring of adjacent, identical characters. For each substring, we check if it meets the specified removal count, and if so, remove it. This process is repeated until no more removals can be made.

Here's how the algorithm would work step-by-step:

  1. Go through the string character by character from the start.
  2. For each character, check if it's the same as the characters next to it.
  3. If you find a group of adjacent, identical characters, count how many there are.
  4. If the count of these identical characters is equal to the specified removal count, then remove them from the string.
  5. After removing a group of characters, start again from the beginning of the string to account for new adjacent duplicates that might have formed.
  6. Keep repeating these steps until you can't find any more adjacent groups of identical characters that meet the removal count requirement.

Code Implementation

def remove_duplicates_brute_force(input_string, removal_count):
    string_length = len(input_string)

    while True:
        index = 0
        removal_made = False

        while index < string_length:
            adjacent_duplicates_count = 1
            next_index = index + 1

            # Count adjacent duplicates
            while next_index < string_length and input_string[index] == input_string[next_index]:
                adjacent_duplicates_count += 1
                next_index += 1

            # Check if the number of duplicates matches the removal count
            if adjacent_duplicates_count == removal_count:
                input_string = input_string[:index] + input_string[next_index:]
                string_length = len(input_string)
                removal_made = True

                # Restart from beginning, to check for newly formed adjacent duplicates
                index = 0
                break

            index = next_index

        # Stop iterating if no removals were made during a pass.
        if not removal_made:
            break

    return input_string

Big(O) Analysis

Time Complexity
O(n^2)The described brute force approach iterates through the string of size n and, within the outer loop, identifies and potentially removes adjacent duplicates. After each removal, the algorithm restarts from the beginning of the string. In the worst-case scenario, removals could occur repeatedly, forcing the algorithm to traverse the string multiple times. The repeated traversal and potential removals within the string lead to a time complexity of approximately n * n/2, which is simplified to O(n^2).
Space Complexity
O(1)The provided brute force approach operates directly on the input string and does not use any auxiliary data structures that scale with the input size N (the length of the string). Although the algorithm mentions checking and removing substrings, it implies these operations are performed in-place or through direct manipulation of the original string using methods like substring replacement, which don't inherently require additional space proportional to N. Therefore, the auxiliary space required is constant, independent of the input string length.

Optimal Solution

Approach

The goal is to remove consecutive identical characters if they appear 'k' times in a row. The efficient approach uses a memory (like a stack) to keep track of the characters and their counts, allowing us to process the string in a single pass.

Here's how the algorithm would work step-by-step:

  1. Go through the string character by character.
  2. For each character, check if it's the same as the last character we've seen and are keeping track of.
  3. If it's the same, increase the count for that character.
  4. If it's different, add the new character with a count of one to our tracking memory.
  5. If the count of a character reaches 'k', remove it from our memory, because we need to get rid of those duplicates.
  6. Once we've processed the whole string, build a new string from the characters and counts remaining in our memory.

Code Implementation

def remove_duplicates(input_string, k):
    stack = []

    for character in input_string:
        # Check if stack is non-empty and the top character matches
        if stack and stack[-1][0] == character:

            stack[-1][1] += 1
            if stack[-1][1] == k:
                # Remove if the count reaches k

                stack.pop()

        else:
            # Add new character with count 1
            stack.append([character, 1])

    result = ""
    for character, count in stack:
        result += character * count

    return result

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string of length 'n' exactly once. In each iteration, it performs a constant number of operations: comparing characters, incrementing/decrementing counts, and potentially adding/removing elements from the stack. Since the number of operations within the loop does not depend on the size of the input string, the overall time complexity is directly proportional to 'n', making it O(n).
Space Complexity
O(N)The algorithm uses a stack (or similar memory) to store characters and their counts. In the worst-case scenario, where there are no consecutive duplicate characters, the stack will store each character of the input string along with its count. Therefore, the auxiliary space used grows linearly with the input size N, where N is the length of the input string. The maximum size of the stack would be proportional to N.

Edge Cases

CaseHow to Handle
Empty string inputReturn an empty string since there are no characters to process.
Null string inputReturn an empty string or throw an IllegalArgumentException, depending on requirements.
k is 1Return an empty string as every character will be removed.
k is greater than the string lengthReturn the original string as no removals are possible.
String with all identical characters and length is a multiple of kReturn an empty string as all characters will be removed.
String with all identical characters and length is not a multiple of kReturn the remaining characters (string length modulo k) repeated.
String contains characters with high Unicode valuesEnsure character comparison and storage handle extended Unicode characters correctly.
String with alternating characters repeated less than k timesThe stack must not incorrectly remove valid characters when intermediate repeating series are processed.