Taro Logo

Longest Substring Without Repeating Characters #6 Most Asked

Medium
Topics:
StringsTwo PointersSliding Windows

Given a string s, find the length of the longest substring without duplicate characters.

Example 1:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.

Constraints:

  • 0 <= s.length <= 5 * 104
  • s consists of English letters, digits, symbols and spaces.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the character set of the input string? For example, is it limited to ASCII, lowercase English letters, or can it include Unicode characters?
  2. What should I return for an empty input string?
  3. Could the input string `s` be null or undefined, and if so, how should that be handled?
  4. Are there any constraints on the length of the input string `s` that I should be aware of?
  5. Just to confirm, are we considering case-sensitivity? For example, would 'a' and 'A' be treated as the same character or different characters?

Brute Force Solution

Approach

The brute force strategy is to look at every single possible continuous piece of text, one by one. For each piece, we'll check if it has any repeated characters. We'll keep track of the longest one we've found so far that has no repeats.

Here's how the algorithm would work step-by-step:

  1. First, consider all the possible text segments we can make.
  2. Start by looking at every segment that begins with the first character.
  3. Then, look at every segment that begins with the second character, and so on, until you've examined all possible starting points.
  4. For each of these segments, check if all the characters within it are unique.
  5. To do this, just go through the segment and see if any character appears more than once.
  6. If a segment contains only unique characters, measure its length.
  7. Remember the length of the longest segment with no repeating characters you've seen so far.
  8. After checking every single possible segment, the length you remembered will be the answer.

Code Implementation

def length_of_longest_substring_brute_force(input_string):
    max_length_so_far = 0
    number_of_characters = len(input_string)

    # This outer loop defines the starting point for every possible substring.
    for start_index in range(number_of_characters):
        
        # This inner loop defines the end point, thus generating all substrings from start_index.
        for end_index in range(start_index, number_of_characters):
            substring = input_string[start_index:end_index + 1]
            seen_characters = set()
            has_duplicate = False
            
            for character in substring:
                
                # We use a set to efficiently check for duplicates within the current substring.
                if character in seen_characters:
                    has_duplicate = True
                    break
                seen_characters.add(character)

            if not has_duplicate:
                max_length_so_far = max(max_length_so_far, len(substring))
    
    return max_length_so_far

Big(O) Analysis

Time Complexity
O(n³)The time complexity is driven by three nested operations for an input string of size n. The outer loop iterates through all possible starting characters (n possibilities), and a nested loop iterates through all possible ending characters for each start, which generates all O(n²) substrings. For each of these substrings, a third operation is required to scan its characters and check for duplicates, which can take up to O(n) time. Therefore, the total number of operations is proportional to n * n * n, which simplifies to a time complexity of O(n³).
Space Complexity
O(K)The dominant factor in space complexity comes from checking if a segment has unique characters. For each segment, we need a way to track the characters we have seen within that specific segment. This is often done using an auxiliary data structure like a hash set or a frequency map. In the worst case, the longest possible substring without repeating characters could consist of all unique characters in the input alphabet. Let K be the size of the character set (e.g., 256 for extended ASCII); the space required for this tracking data structure is proportional to K, making the space complexity constant relative to the input string length N.

Optimal Solution

Approach

Imagine you have an expanding window that slides over the text. The goal is to stretch this window as wide as possible to cover a piece of text with no duplicate characters. We keep track of the largest window size we've seen so far.

Here's how the algorithm would work step-by-step:

  1. Start by placing the left and right sides of your window at the very beginning of the text.
  2. Now, start expanding the window by moving its right side one character at a time.
  3. As you expand, keep a record of every unique character currently inside your window.
  4. If the next character you encounter is already in your window, you have a repeat. This means the window can't get any bigger from this starting point.
  5. When you find a repeat, you must shrink the window from the left side. Keep moving the left side inward until the original repeated character is no longer inside the window.
  6. Once the repeat is gone, you can start expanding the right side of the window again.
  7. Every time you successfully expand the window, check if its current size is the biggest you've seen yet. If it is, remember this new record length.
  8. Continue this process of expanding and shrinking the window until you've moved all the way through the text. The record length you have at the end is your answer.

Code Implementation

def length_of_longest_substring(input_string):
    characters_in_window = set()
    left_pointer = 0
    max_length = 0

    for right_pointer in range(len(input_string)):
        current_character = input_string[right_pointer]

        # If a character repeats, we must shrink the window from the left until the duplicate is removed.
        while current_character in characters_in_window:
            character_to_remove = input_string[left_pointer]
            characters_in_window.remove(character_to_remove)
            left_pointer += 1

        # Add the new character to the window, effectively expanding it.
        characters_in_window.add(current_character)

        # After each valid expansion, check if we have a new longest substring.
        current_window_size = right_pointer - left_pointer + 1
        max_length = max(max_length, current_window_size)

    return max_length

Big(O) Analysis

Time Complexity
Space Complexity

Edge Cases

Input string is empty
How to Handle:
The algorithm should return 0 as there are no substrings.
Input string contains only one character
How to Handle:
The longest substring is the string itself, so the function should return 1.
Input string contains all unique characters
How to Handle:
The entire string is the longest substring without repeating characters, so its length should be returned.
Input string contains all identical characters
How to Handle:
The longest substring without repeating characters will always have a length of 1.
Input string contains special characters, spaces, or numbers
How to Handle:
The solution should handle any valid character from the character set correctly by tracking its last seen position.
The longest substring appears at the very beginning of the string
How to Handle:
The sliding window correctly finds the maximum length early and maintains it throughout the remaining scan.
The longest substring appears at the very end of the string
How to Handle:
The algorithm's final check after the loop completes ensures the last potential substring is considered for the maximum length.
Maximum-sized input string (e.g., 5 * 10^4 characters)
How to Handle:
An efficient O(N) sliding window approach ensures the solution runs within time limits without performance degradation.
0/0 completed