Given a string s
, remove duplicate letters so that every letter appears once and only once. You must make sure your result is the smallest in lexicographical order among all possible results.
Example 1:
Input: s = "bcabc" Output: "abc"
Example 2:
Input: s = "cbacdcbc" Output: "acdb"
Constraints:
1 <= s.length <= 10^4
s
consists of lowercase English letters.Note: This question is the same as 1081: https://leetcode.com/problems/smallest-subsequence-of-distinct-characters/
## Remove Duplicate Letters
This problem asks us to remove duplicate letters from a string such that each letter appears only once, and the resulting string is the smallest in lexicographical order among all possible results.
### 1. Brute Force Solution (and why it's not ideal)
A brute-force approach would involve generating all possible subsequences of the given string, filtering out those that contain duplicate letters, and then finding the smallest subsequence in lexicographical order. However, this approach is highly inefficient because the number of subsequences grows exponentially with the length of the input string. This makes it impractical for even moderately sized strings.
### 2. Optimal Solution: Greedy Approach with Stack
A more efficient solution involves using a greedy approach combined with a stack. The idea is to maintain a stack of characters that represent a candidate for the smallest lexicographical subsequence. As we iterate through the input string, we make the following decisions:
1. If the current character is not in the stack, we check if it can replace any of the characters already in the stack.
2. A character in the stack can be replaced if it is greater than the current character and it appears later in the string.
3. We use a `last_occurrence` map to efficiently determine if a character appears later in the string.
Here's the Python code:
```python
def remove_duplicate_letters(s: str) -> str:
last_occurrence = {}
for i, char in enumerate(s):
last_occurrence[char] = i
stack = []
seen = set()
for i, char in enumerate(s):
if char not in seen:
while stack and char < stack[-1] and i < last_occurrence[stack[-1]]:
seen.remove(stack.pop())
stack.append(char)
seen.add(char)
return ''.join(stack)
# Example usage:
s = "bcabc"
result = remove_duplicate_letters(s)
print(f"Input: {s}, Output: {result}") # Output: abc
s = "cbacdcbc"
result = remove_duplicate_letters(s)
print(f"Input: {s}, Output: {result}") # Output: acdb
The run-time complexity of this solution is O(n), where n is the length of the input string s
. This is because we iterate through the string once to populate the last_occurrence
map and then iterate through it again to process each character. The stack operations (push and pop) take constant time. The while
loop inside the main loop might seem like it would increase the complexity, but each character can only be added and removed from the stack once. Thus, the while
loop contributes at most O(n) operations over the entire execution of the algorithm.
The space complexity of this solution is O(1), because the last_occurrence
map, the stack
, and the seen
set can contain at most 26 characters (the number of lowercase English letters). Therefore, the space used by these data structures is constant and does not depend on the size of the input string.
seen
set.