Taro Logo

Length of the Longest Valid Substring

Hard
Google logo
Google
3 views
Topics:
StringsSliding Windows

You are given a string word and an array of strings forbidden.

A string is called valid if none of its substrings are present in forbidden.

Return the length of the longest valid substring of the string word.

A substring is a contiguous sequence of characters in a string, possibly empty.

Example 1:

Input: word = "cbaaaabc", forbidden = ["aaa","cb"]
Output: 4
Explanation: There are 11 valid substrings in word: "c", "b", "a", "ba", "aa", "bc", "baa", "aab", "ab", "abc" and "aabc". The length of the longest valid substring is 4. 
It can be shown that all other substrings contain either "aaa" or "cb" as a substring.

Example 2:

Input: word = "leetcode", forbidden = ["de","le","e"]
Output: 4
Explanation: There are 11 valid substrings in word: "l", "t", "c", "o", "d", "tc", "co", "od", "tco", "cod", and "tcod". The length of the longest valid substring is 4.
It can be shown that all other substrings contain either "de", "le", or "e" as a substring.

Constraints:

  • 1 <= word.length <= 10^5
  • word consists only of lowercase English letters.
  • 1 <= forbidden.length <= 10^5
  • 1 <= forbidden[i].length <= 10
  • forbidden[i] consists only of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What characters will the input string contain? Will it only be parentheses, or can it contain other characters as well?
  2. What is the maximum length of the input string?
  3. If the input string is empty, what should I return?
  4. By 'valid substring', do you mean a substring where each opening parenthesis has a corresponding closing parenthesis in the correct order?
  5. If there are multiple valid substrings of the same maximum length, should I return any one of them, or is there a specific one I should prefer?

Brute Force Solution

Approach

The brute force method for this problem is like trying every single possible piece of the string to see if it's valid. We check each piece to see if the parentheses are correctly matched, throwing out invalid pieces and remembering the longest valid one we've found so far. It's an exhaustive approach.

Here's how the algorithm would work step-by-step:

  1. First, look at every possible substring, which means every continuous section of the input string.
  2. For each of these substrings, check if it's a 'valid' substring, meaning it has properly nested and matched parentheses.
  3. If a substring is valid, calculate its length.
  4. Compare the length of this valid substring with the length of the longest valid substring found so far.
  5. If the current valid substring is longer, remember it as the new longest valid substring.
  6. After checking all possible substrings, the one you've remembered as the longest is your answer.

Code Implementation

def longest_valid_substring_brute_force(input_string):
    max_length = 0

    for start_index in range(len(input_string)):
        for end_index in range(start_index + 1, len(input_string) + 1):
            substring = input_string[start_index:end_index]

            # Check if substring is valid. Prevents unnecessary length checks.
            if is_valid_parentheses(substring):

                substring_length = len(substring)

                #  Update max_length if current substring is longer
                if substring_length > max_length:
                    max_length = substring_length

    return max_length

def is_valid_parentheses(substring):
    stack = []

    for char in substring:
        if char == '(': 
            stack.append(char)
        elif char == ')':
            # Check for mismatched parenthesis and invalid substrings
            if not stack:
                return False
            stack.pop()

    # Valid only if stack is empty, meaning all parens are matched
    return not stack

Big(O) Analysis

Time Complexity
O(n³)The algorithm iterates through all possible substrings of the input string of length n. Generating all substrings requires nested loops, resulting in approximately n * n/2 substrings. For each substring, it checks if the substring is a valid parenthesis sequence, which takes O(n) time in the worst case as it iterates through the substring. Therefore, the overall time complexity is O(n * n * n), which simplifies to O(n³).
Space Complexity
O(1)The brute force approach, as described, doesn't explicitly create any auxiliary data structures like arrays, hashmaps, or stacks to store intermediate results related to parentheses matching. It only involves iterating through substrings and checking their validity, potentially using a few variables to keep track of the current substring's state or the length of the longest valid substring found so far. Thus, the space used is independent of the input string length N, resulting in constant space complexity.

Optimal Solution

Approach

The most efficient way to find the longest valid substring involves tracking the balance of opening and closing parentheses. We can utilize this balance to detect and correct imbalances, marking potential start and end points of valid substrings. This approach avoids unnecessary checks by focusing on the core property of balanced parentheses.

Here's how the algorithm would work step-by-step:

  1. Keep a running count of how many opening parentheses we've seen but haven't yet closed.
  2. Go through the string from left to right. Each time you see an opening parenthesis, increase the count. Each time you see a closing parenthesis, decrease the count.
  3. If the count ever goes negative, it means there are more closing parentheses than opening parentheses, so reset the count to zero and mark the character after the closing parenthesis as the potential start of a valid substring.
  4. While going from left to right, also keep track of the maximum length valid substring seen so far.
  5. Now, repeat the process, but go through the string from right to left. This time, count the excess of closing parentheses that haven't been opened.
  6. If this new count ever goes negative (meaning more opening than closing), reset the count and remember that the character *before* the current parenthesis might be where a valid substring *ends*.
  7. Again keep track of the maximum length valid substring seen during this right-to-left scan.
  8. The largest length recorded from either the left-to-right or right-to-left scan is the length of the longest valid substring.

Code Implementation

def longest_valid_parentheses(input_string):
    max_length = 0
    open_parentheses_count = 0
    start_index = 0

    for index in range(len(input_string)): 
        if input_string[index] == '(': 
            open_parentheses_count += 1
        else:
            open_parentheses_count -= 1

        if open_parentheses_count < 0: 
            # Reset when closing parens exceed opening ones.
            start_index = index + 1
            open_parentheses_count = 0
        elif open_parentheses_count == 0:
            max_length = max(max_length, index - start_index + 1)
        else:
            max_length = max(max_length, index - start_index + 1)

    close_parentheses_count = 0
    end_index = len(input_string) - 1

    for index in range(len(input_string) - 1, -1, -1): 
        if input_string[index] == ')':
            close_parentheses_count += 1
        else:
            close_parentheses_count -= 1

        if close_parentheses_count < 0:
            # Reset when opening parens exceed closing ones.
            end_index = index - 1
            close_parentheses_count = 0
        elif close_parentheses_count == 0:
            max_length = max(max_length, end_index - index + 1)

        else:
            max_length = max(max_length, end_index - index + 1)

    return max_length

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string of length n twice: once from left to right and once from right to left. In each iteration, it performs a constant amount of work (incrementing/decrementing counters, comparing values, and updating the maximum length). Therefore, the time complexity is directly proportional to the length of the string, resulting in O(n).
Space Complexity
O(1)The algorithm uses a constant number of variables, specifically counters for tracking the balance of parentheses during both the left-to-right and right-to-left traversals. No auxiliary data structures that scale with the input size N are used. Therefore, the space complexity remains constant regardless of the length of the input string.

Edge Cases

CaseHow to Handle
Null or empty string inputReturn 0 as there are no valid substrings.
Input string of length 1Return 0 because a valid substring requires at least two characters.
String with only opening parenthesesReturn 0 as there are no closing parentheses to form a valid substring.
String with only closing parenthesesReturn 0 as there are no opening parentheses to form a valid substring.
Maximum string length exceeding available memoryThe stack-based approach efficiently handles large inputs under memory constraints, assuming reasonable system limits.
Nested valid substrings e.g. '(()())'Stack approach naturally handles nested valid substrings correctly.
Unbalanced parenthesis at the beginning of the stringStack approach tracks the base and corrects the length.
String with characters other than '(' and ')'The algorithm only considers '(' and ')', effectively ignoring other characters.