Given an array of characters chars
, compress it using the following algorithm:
Begin with an empty string s
. For each group of consecutive repeating characters in chars
:
1
, append the character to s
.The compressed string s
should not be returned separately, but instead, be stored in the input character array chars
. Note that group lengths that are 10
or longer will be split into multiple characters in chars
.
After you are done modifying the input array, return the new length of the array.
You must write an algorithm that uses only constant extra space.
Example 1:
Input: chars = ["a","a","b","b","c","c","c"] Output: Return 6, and the first 6 characters of the input array should be: ["a","2","b","2","c","3"] Explanation: The groups are "aa", "bb", and "ccc". This compresses to "a2b2c3".
Example 2:
Input: chars = ["a"] Output: Return 1, and the first character of the input array should be: ["a"] Explanation: The only group is "a", which remains uncompressed since it's a single character.
Example 3:
Input: chars = ["a","b","b","b","b","b","b","b","b","b","b","b","b"] Output: Return 4, and the first 4 characters of the input array should be: ["a","b","1","2"]. Explanation: The groups are "a" and "bbbbbbbbbbbb". This compresses to "ab12".
Constraints:
1 <= chars.length <= 2000
chars[i]
is a lowercase English letter, uppercase English letter, digit, or symbol.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The basic idea is to go through the input string character by character and count consecutive repeating characters. We then build a new string by replacing each repeating sequence with the character and its count.
Here's how the algorithm would work step-by-step:
def string_compression(input_string):
compressed_string = ""
string_length = len(input_string)
index = 0
while index < string_length:
current_character = input_string[index]
count = 1
# Count repeating characters
while (index + 1 < string_length and
input_string[index] == input_string[index + 1]):
count += 1
index += 1
# Append the character and its count to the new string
compressed_string += current_character + str(count)
index += 1
# Return compressed string if shorter
if len(compressed_string) < string_length:
return compressed_string
else:
# Otherwise, return the original string
return input_string
The goal is to shorten a string by replacing consecutive repeating characters with the character and its count. We can do this efficiently by keeping track of the current character and its frequency as we go through the string. If the frequency is more than one we will append it to the string.
Here's how the algorithm would work step-by-step:
def string_compression(input_string):
if not input_string:
return input_string
compressed_string = ''
current_char = input_string[0]
char_count = 1
for i in range(1, len(input_string)):
if input_string[i] == current_char:
char_count += 1
else:
# Append the previous character and its count.
compressed_string += current_char
if char_count > 1:
compressed_string += str(char_count)
current_char = input_string[i]
char_count = 1
# Append the last character and its count.
compressed_string += current_char
# We append count if it's greater than 1
if char_count > 1:
compressed_string += str(char_count)
# Compare lengths and return shorter string
if len(compressed_string) < len(input_string):
return compressed_string
else:
return input_string
Case | How to Handle |
---|---|
Empty input array | Return 0, as there are no characters to compress. |
Input array with a single character | Return 1, as the character is simply copied to the beginning of the array without compression. |
Input array with two different characters | Return 2, because no compression is possible and both are placed in the beginning of the array. |
Input array with all identical characters (e.g., ['a', 'a', 'a', 'a', 'a']) | Correctly compresses the array to ['a', '5'] and returns 2. |
Input array with repeating characters followed by non-repeating characters (e.g., ['a', 'a', 'b', 'c']) | The algorithm should handle the first run of repeating characters and correctly place the next characters after the compressed portion. |
Input array with counts exceeding single digits (e.g., 10 or more) | The count needs to be split into individual characters, placed after the repeated character in the array. |
Input contains mixed case alphabet characters only | Treat each character independently, compressing consecutive repeats. |
Large input array with many repeated characters, potential for performance issues | Ensure the solution uses an efficient in-place algorithm to avoid unnecessary memory allocation or copying, optimizing for time complexity. |