Given a string s
and an integer k
, find the length of the longest substring of s
that contains at most k
distinct characters.
For example:
s = "eceba"
and k = 2
, the longest substring is "ece"
with length 3.s = "aa"
and k = 1
, the longest substring is "aa"
with length 2.s = "abaccc"
and k = 2
, the longest substring is "abaccc"
with length 6.Explain your solution, including its time and space complexity. Consider edge cases such as an empty string or when k
is zero. Provide example code implementation in Python.
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force approach to finding the longest substring with at most K distinct characters involves checking every possible substring within the given string. We examine each substring to see if it meets the distinct character criteria, and keep track of the longest valid substring found. It's like trying every possible window size and position to see which one fits the rules and is the biggest.
Here's how the algorithm would work step-by-step:
def longest_substring_with_at_most_k_distinct_characters_brute_force(input_string, max_distinct_characters):
longest_substring_length = 0
for starting_index in range(len(input_string)):
for ending_index in range(starting_index, len(input_string)):
substring = input_string[starting_index:ending_index + 1]
# Count distinct chars
distinct_characters = set(substring)
# Check if distinct characters are less or equal to max
if len(distinct_characters) <= max_distinct_characters:
# Update if it's the longest substring
longest_substring_length = max(longest_substring_length, len(substring))
return longest_substring_length
To efficiently find the longest section of text with at most K unique characters, we use a 'window' that expands and contracts. This window represents a possible section of the text, and we adjust its size based on the number of unique characters inside it.
Here's how the algorithm would work step-by-step:
def longest_substring_with_at_most_k_distinct_characters(text_string, max_distinct_characters):
window_start = 0
max_length = 0
character_frequency = {}
for window_end in range(len(text_string)):
end_character = text_string[window_end]
if end_character not in character_frequency:
character_frequency[end_character] = 0
character_frequency[end_character] += 1
# Shrink the window if num distinct chars > k
while len(character_frequency) > max_distinct_characters:
start_character = text_string[window_start]
character_frequency[start_character] -= 1
# Remove the character if its frequency becomes zero
if character_frequency[start_character] == 0:
del character_frequency[start_character]
window_start += 1
# Update the max length
max_length = max(max_length, window_end - window_start + 1)
return max_length
Case | How to Handle |
---|---|
Null or empty input string | Return 0 immediately as there are no substrings to consider. |
k is 0 and input string is not empty | Return 0 immediately since no substring can have 0 distinct characters. |
String with length 1 | Return 1 if k >= 1, otherwise return 0. |
String with all identical characters | Return the length of the string if k >= 1, otherwise 0. |
k is larger than the number of distinct characters in the string | Return the length of the string as the entire string is a valid substring. |
Long input string to test for time complexity | Sliding window approach should scale linearly with input string length. |
String contains only distinct characters | The longest substring will be of length k or length of string which ever is smaller. |
Large k value that might cause integer overflow | K is defined as an integer; however, we don't use k in multiplication, so overflow not an issue. But validation for negative numbers should be included. |