Taro Logo

Longest Substring Without Repeating Characters

Medium
Visa logo
Visa
1 view
Topics:
StringsSliding Windows

Given a string s, find the length of the longest substring without duplicate characters.

Example 1:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.

Constraints:

  • 0 <= s.length <= 5 * 104
  • s consists of English letters, digits, symbols and spaces.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the character set of the input string `s` (e.g., ASCII, Unicode)?
  2. Can the input string `s` be empty or null?
  3. If the input string `s` is empty, what should the return value be?
  4. Are we looking for any substring, or specifically a contiguous substring?
  5. If there are multiple substrings with the same maximum length, should I return the length of any of them?

Brute Force Solution

Approach

The brute force method for this problem is simple: try every possible substring. We check each one to see if it has any repeated characters and remember the longest one we find that doesn't.

Here's how the algorithm would work step-by-step:

  1. Look at the very first letter of the big string.
  2. Consider it as a substring by itself. Does it repeat any characters (of course not, it's just one letter)? Great, remember its length.
  3. Now, consider the first letter and the second letter together as a substring. Does this substring have repeated characters? If not, remember its length, but only if it's longer than what we remembered before.
  4. Next, take the first, second, and third letters together as a substring. Check for repeated characters. Update the longest length if needed.
  5. Keep doing this, adding one letter at a time to the substring starting at the first letter, until you reach the end of the big string.
  6. Okay, now start again, but this time, start the substring at the second letter of the big string.
  7. Repeat the same process of adding one letter at a time to the substring and checking for repeats, updating the longest length as you go.
  8. Continue this pattern, moving the starting point of the substring one letter further each time, until the starting point is at the last letter of the big string.
  9. When you've tried all possible substrings starting at every possible position, the longest length you remembered is your answer.

Code Implementation

def longest_substring_without_repeating_characters_brute_force(input_string):
    longest_length = 0
    string_length = len(input_string)

    for start_index in range(string_length):
        for end_index in range(start_index, string_length):
            substring = input_string[start_index:end_index + 1]

            # Check if the current substring has repeating characters
            if len(set(substring)) == len(substring):

                # Update the longest length if needed
                longest_length = max(longest_length, len(substring))

    return longest_length

Big(O) Analysis

Time Complexity
O(n³)The described brute force approach involves iterating through all possible substrings of the input string. For a string of length n, there are approximately n choices for the starting position of a substring. For each starting position, the substring can extend up to n characters. Furthermore, for each substring, we need to check for repeating characters, which can take up to n operations in the worst case (checking each character against every other character in the substring). Thus, the total number of operations is proportional to n * n * n, leading to a time complexity of O(n³).
Space Complexity
O(N)The brute force method, as described, checks every possible substring for repeating characters. To perform this check, we would typically need a data structure, such as a set or a hash map, to store the characters present in the current substring. In the worst-case scenario, the longest substring without repeating characters could be the entire input string itself, meaning the set or hash map would store all N characters from the input string. Therefore, the auxiliary space required is proportional to the input size N, leading to O(N) space complexity.

Optimal Solution

Approach

The core idea is to maintain a window that represents a possible substring. We slide this window across the string, expanding it when we find unique characters and shrinking it when we encounter a repeated character. This way, we only consider valid substrings.

Here's how the algorithm would work step-by-step:

  1. Imagine you have a window that can grow or shrink over the string.
  2. Start with a small window at the beginning of the string.
  3. Expand the window to the right, one character at a time, as long as all the characters within the window are unique.
  4. If you find a character that already exists within the window, shrink the window from the left until that repeating character is no longer in the window.
  5. Keep track of the size of the largest window you've seen so far, which represents the longest substring without repeating characters.
  6. Continue sliding the window until you reach the end of the string.
  7. The largest window size you recorded is the length of the longest substring without repeating characters.

Code Implementation

def length_of_longest_substring(input_string):
    start_index = 0
    end_index = 0
    maximum_length = 0
    character_index_map = {}

    while end_index < len(input_string):
        current_character = input_string[end_index]

        # If char exists and within window
        if current_character in character_index_map and character_index_map[current_character] >= start_index:

            start_index = character_index_map[current_character] + 1

        character_index_map[current_character] = end_index

        # Update max length
        maximum_length = max(maximum_length, end_index - start_index + 1)

        end_index += 1

    return maximum_length

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string of length n using a sliding window. While there's a potential inner loop to shrink the window, each character is added to and removed from the window at most once. Therefore, the number of operations is proportional to the length of the string, n, making the time complexity O(n).
Space Complexity
O(min(N, M))The described algorithm uses a window to find the longest substring without repeating characters. To efficiently check for repeating characters within the window, a hash map (or set) is used to store the characters currently present in the window. In the worst-case scenario, the hash map stores all the unique characters of the input string, N, where N is the length of the input string. However, the size of the hash map is also limited by the size of the character set, M, where M represents the number of possible characters. Therefore, the space complexity is O(min(N, M)).

Edge Cases

CaseHow to Handle
Empty stringReturn 0 if the input string is empty as there are no characters to form a substring.
String with only one characterReturn 1 if the string has only one character because that single character forms a substring of length 1.
String with all identical characters (e.g., 'aaaa')The algorithm should correctly identify the longest substring without repeating characters, which will be of length 1 in this case.
String with consecutive repeating characters (e.g., 'aab')Sliding window should move past the initial 'aa' and find 'ab' with length 2.
String with non-ASCII characters (e.g., Unicode)Ensure the character tracking data structure (e.g., hash map) can handle a wider range of characters and their indices correctly.
Long string approaching memory limitsSliding window approach uses constant space, so it should handle long strings until memory limits are actually hit.
String with maximum possible lengthCheck for integer overflow in length calculations and that the algorithm scales linearly O(n).
String with alternating repeating characters (e.g., 'abababa')The sliding window algorithm should efficiently find the longest non-repeating substring within the given pattern.