Taro Logo

Remove Invalid Parentheses

Hard
Amazon logo
Amazon
4 views

Given a string s that contains parentheses and letters, remove the minimum number of invalid parentheses to make the input string valid.

Return a list of unique strings that are valid with the minimum number of removals. You may return the answer in any order.

Example 1:

Input: s = "()())" 
Output: ["()()"]

Example 2:

Input: s = "(a)())"
Output: ["(a)()[

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What characters other than '(' and ')' can appear in the input string?
  2. If the input string is already valid, should I return the original string?
  3. Is the output order of the valid strings important? Can I return them in any order?
  4. If no valid strings can be obtained by removing parentheses, what should I return? An empty list?
  5. Are there any constraints on the length of the input string?

Brute Force Solution

Approach

The brute force approach to removing invalid parentheses is like trying every single possible combination of removing parentheses from the input string. We generate all possible strings by removing different sets of parentheses and then check if each resulting string is valid.

Here's how the algorithm would work step-by-step:

  1. First, generate every single possible string that can be created by removing zero or more parentheses from the original string. Think of it like making every combination of erasures.
  2. For each of the generated strings, check if it's a valid arrangement of parentheses, meaning every opening parenthesis has a matching closing one, and they are in the correct order.
  3. Keep only the valid strings, discarding any that have unmatched or improperly ordered parentheses.
  4. Among the valid strings, find the ones that have the maximum length. This is because we want to remove the fewest number of parentheses possible.
  5. If there are multiple valid strings with the maximum length, include all of them as possible solutions. Removing a different set of parentheses might result in multiple valid strings of the same longest length.
  6. Return the set of longest valid strings that we found.

Code Implementation

def remove_invalid_parentheses_brute_force(input_string):
    all_possible_strings = set()
    def generate_possible_strings(current_string, index, string_so_far):
        if index == len(current_string):
            all_possible_strings.add(string_so_far)
            return

        # Option 1: Include the current character
        generate_possible_strings(current_string, index + 1, string_so_far + current_string[index])
        
        # Option 2: Exclude the current character. Necessary to check all combinations.
        generate_possible_strings(current_string, index + 1, string_so_far)

    generate_possible_strings(input_string, 0, "")

    def is_valid(string_to_check):
        balance = 0
        for char in string_to_check:
            if char == '(': 
                balance += 1
            elif char == ')':
                balance -= 1
            if balance < 0:
                return False
        return balance == 0

    valid_strings = []
    for possible_string in all_possible_strings:
        if is_valid(possible_string):
            valid_strings.append(possible_string)

    if not valid_strings:
        return [""]

    max_length = max(len(valid_string) for valid_string in valid_strings)

    # Only keep the valid strings with maximum length.
    result = [valid_string for valid_string in valid_strings if len(valid_string) == max_length]
    
    return result

Big(O) Analysis

Time Complexity
O(2^n * n)Generating all possible substrings by removing parentheses takes O(2^n) time because each parenthesis can either be present or absent in a substring. For each of these 2^n substrings, we need to validate if it is a valid parentheses string, which takes O(n) time, where n is the length of the input string. Additionally, filtering for the maximum length valid strings and storing the results can be considered negligible compared to the cost of generating and validating substrings. Therefore, the overall time complexity is O(2^n * n).
Space Complexity
O(2^N)The brute force approach generates all possible strings by removing parentheses, which leads to 2^N possible combinations (where N is the length of the input string). These generated strings are temporarily stored, resulting in an auxiliary space of O(2^N). Additionally, storing the valid strings of maximum length can require space proportional to the number of such strings, which in the worst case, can also approach O(2^N). The function call stack during the recursive generation of all possible strings contributes O(N) space, but it is dominated by the storage of generated strings.

Optimal Solution

Approach

The goal is to find all possible strings with the fewest invalid parentheses by removing as few as possible. Instead of trying every single combination, we'll use a focused search to efficiently explore potential valid strings.

Here's how the algorithm would work step-by-step:

  1. First, count how many extra left and right parentheses need to be removed to make the string valid.
  2. Then, use a special search method (like a map or similar) to explore different paths, removing one parenthesis at a time.
  3. When removing parentheses, be smart. Only remove a left parenthesis if there are extra left parentheses to remove, and similarly for right parentheses.
  4. When you find a string where you've removed the exact number of extra left and right parentheses needed, check if it is a valid string.
  5. If the string is valid and its length is the longest we've seen so far among valid strings, add it to the list of results.
  6. If the string is valid and has the same length as the longest we have seen, also add it to the results.
  7. Continue exploring paths until you have found all valid strings that require the fewest removals. Use the search strategy to prevent unnecessary exploration.

Code Implementation

def remove_invalid_parentheses(input_string):
    def is_valid(string_to_check):
        balance = 0
        for char in string_to_check:
            if char == '(': 
                balance += 1
            elif char == ')':
                balance -= 1
            if balance < 0:
                return False
        return balance == 0

    def calculate_removals(input_string):
        left_removals_needed = 0
        right_removals_needed = 0
        for char in input_string:
            if char == '(': 
                left_removals_needed += 1
            elif char == ')':
                if left_removals_needed > 0:
                    left_removals_needed -= 1
                else:
                    right_removals_needed += 1
        return left_removals_needed, right_removals_needed

    left_removals, right_removals = calculate_removals(input_string)
    queue = {input_string}
    results = set()
    longest_length = -1

    while queue:
        current_string = queue.pop()

        if is_valid(current_string):
            #Add valid strings to the result
            if len(current_string) > longest_length:
                results = {current_string}
                longest_length = len(current_string)
            elif len(current_string) == longest_length:
                results.add(current_string)

        if len(current_string) < longest_length:
            continue

        #Only explore if necessary removals remain
        if left_removals == 0 and right_removals == 0:
            continue

        for index, char in enumerate(current_string):
            if char == '(' or char == ')':
                #Avoid redundant removals
                next_string = current_string[:index] + current_string[index+1:]

                if next_string not in queue:
                    queue.add(next_string)

    return list(results)

Big(O) Analysis

Time Complexity
O(n * 2^n)The algorithm explores different possible strings by removing parentheses. In the worst case, for a string of length n, we could potentially remove any combination of parentheses, leading to 2^n possible removals. For each of these potential removals, we might need to spend O(n) time checking if the resulting string is valid and updating our results. Therefore, the overall time complexity is approximately O(n * 2^n), reflecting the exponential nature of exploring combinations and the linear cost of validation.
Space Complexity
O(N)The space complexity is dominated by the queue or set used in the search method (like BFS) described in step 2. In the worst-case scenario, especially if the input string 's' is mostly invalid, the queue/set might hold a significant number of intermediate strings of varying lengths, potentially up to a size proportional to the length of the input string N. This occurs because, at each level of the search, we're generating and storing multiple modified strings derived from the original string by removing parentheses. Therefore, the auxiliary space is proportional to N, resulting in O(N) space complexity. The recursion stack depth is also bounded by N, contributing O(N) space as well, however, the explicit queue/set is typically the larger factor.

Edge Cases

CaseHow to Handle
Empty string inputReturn an empty list as there are no parentheses to remove.
Input string with no parenthesesReturn the input string as it is already valid.
Input string with only opening parenthesesRemove all opening parentheses to get an empty valid string.
Input string with only closing parenthesesRemove all closing parentheses to get an empty valid string.
Input string with deeply nested valid parenthesesThe solution should handle nested structures without stack overflow, often achieved by iterative methods instead of recursion.
Input string with many invalid parentheses clustered togetherThe algorithm should efficiently explore different removal combinations to find the valid strings.
Input string with characters other than '(' and ')'The solution should only consider parentheses characters for validity checks, ignoring other characters.
Large input string that could lead to a large number of valid solutionsConsider using a set to avoid duplicate results and potentially improve performance by pruning the search space.