Given a string s
, return the last substring of s
in lexicographical order.
For example:
Example 1:
Input: s = "abab"
Output: "bab"
Explanation: The substrings are ["a", "ab", "aba", "abab", "b", "ba", "bab"]. The lexicographically maximum substring is "bab".
Example 2:
Input: s = "leetcode"
Output: "tcode"
Constraints:
1 <= s.length <= 4 * 10^5
s
contains only lowercase English letters.Can you provide an algorithm with optimal time and space complexity to solve this problem? Explain your solution and the time/space complexity analysis.
A naive approach involves generating all possible substrings of the given string s
and then comparing them lexicographically to find the last (largest) substring.
Algorithm:
s
.max_substring
to an empty string.max_substring
, update max_substring
.max_substring
.Code (Python):
def last_substring_brute_force(s):
max_substring = ""
for i in range(len(s)):
for j in range(i + 1, len(s) + 1):
substring = s[i:j]
if substring > max_substring:
max_substring = substring
return max_substring
Time Complexity: O(n^3), where n is the length of the string s
. Generating all substrings takes O(n^2) time, and comparing each substring takes O(n) in the worst case.
Space Complexity: O(n^2), because, in the worst-case scenario, storing all the substrings would require O(n^2) space.
Instead of generating all possible substrings, we can compare suffixes of the string directly. This is more efficient.
Algorithm:
max_index
to 0.s
from index 1 to n-1
:
max_index
with the suffix starting at the current index i
.i
is greater, update max_index
to i
.s
starting at max_index
.Code (Python):
def last_substring_optimal(s):
max_index = 0
for i in range(1, len(s)):
if s[i:] > s[max_index:]:
max_index = i
return s[max_index:]
Time Complexity: O(n^2) in the worst case. The loop iterates n
times, and the string comparison s[i:] > s[max_index:]
can take up to O(n) time in the worst case (e.g., when the suffixes are very similar). However, on average, this is more efficient than the brute force because not all substrings are generated and compared. There is a linear solution as well, which is described in the next approach.
Space Complexity: O(1). We only use a constant amount of extra space.
This method employs a two-pointer technique to determine the lexicographically largest suffix. It's inspired by the construction of suffix arrays but doesn't build the entire array.
Algorithm:
i
and j
, to 0 and 1 respectively.j
is within the bounds of the string:
s[i + offset]
with s[j + offset]
where offset
starts from 0.i + offset
or j + offset
reaches the end of the string, or s[i + offset] != s[j + offset]
:
s[i + offset] > s[j + offset]
or j + offset
reached end of string, advance j
to j + offset + 1
.i
to j
, advance j
to i + 1
.offset
to 0.offset
.i
.Code (Python):
def last_substring_linear(s):
i, j, offset = 0, 1, 0
n = len(s)
while j + offset < n:
if s[i + offset] == s[j + offset]:
offset += 1
else:
if s[i + offset] > s[j + offset]:
j += offset + 1
else:
i = j
j = i + 1
offset = 0
if j >= n:
break
return s[i:]
Time Complexity: O(n). Although there is a nested while
structure implicitly through the offset
, in effect, each character is visited at most a constant number of times.
Space Complexity: O(1). We use only constant extra space.