Taro Logo

Last Substring in Lexicographical Order

Hard
MathWorks logo
MathWorks
2 views
Topics:
StringsTwo Pointers

Given a string s, return the last substring of s in lexicographical order.

Example 1:

Input: s = "abab"
Output: "bab"
Explanation: The substrings are ["a", "ab", "aba", "abab", "b", "ba", "bab"]. The lexicographically maximum substring is "bab".

Example 2:

Input: s = "leetcode"
Output: "tcode"

Constraints:

  • 1 <= s.length <= 4 * 105
  • s contains only lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string s?
  2. Can the input string s contain only lowercase English letters?
  3. If the input string is empty, what should I return?
  4. By 'substring', do you mean a contiguous sequence of characters within the string s?
  5. If there are multiple substrings that are lexicographically the last, should I return the first occurrence of such substring or any of them?

Brute Force Solution

Approach

We want to find the very last substring if we were to alphabetize all the substrings of a given string. The brute force way is to simply generate every possible substring and then compare them all to find the largest.

Here's how the algorithm would work step-by-step:

  1. First, consider every single letter of the original string. Each of these is a substring.
  2. Next, consider every pair of consecutive letters. These are also substrings.
  3. Then, consider every group of three consecutive letters. These are substrings too.
  4. Keep doing this, increasing the length of the group of letters you're considering, until you consider the entire original string itself.
  5. Now you have a list of every possible substring of the original string.
  6. Go through this list, comparing each substring to the one you currently think is the largest (alphabetically).
  7. If you find a substring that comes later in the alphabet than the current largest, replace the current largest with this new substring.
  8. After you've checked every substring in the list, the one you're left with is the very last substring in alphabetical order.

Code Implementation

def last_substring_brute_force(input_string):
    largest_substring = ""

    # Iterate through all possible substring lengths
    for substring_length in range(1, len(input_string) + 1):
        # Iterate through all possible starting positions for each length
        for start_index in range(len(input_string) - substring_length + 1):
            current_substring = input_string[start_index:start_index + substring_length]

            # Compare the current substring with the largest found so far
            if current_substring > largest_substring:

                # Update if lexicographically larger
                largest_substring = current_substring

    return largest_substring

Big(O) Analysis

Time Complexity
O(n³)The algorithm generates all possible substrings of the input string of length n. Generating each substring takes O(1) time on average, but there are n(n+1)/2 such substrings, which is O(n²). Then, comparing each substring to the current largest substring can take up to O(n) time in the worst case. Since we iterate through all O(n²) substrings and perform an O(n) comparison for each, the overall time complexity is O(n² * n) which simplifies to O(n³).
Space Complexity
O(N^2)The brute force approach generates all possible substrings of the input string. In the worst-case scenario, where the input string has length N, there can be N(N+1)/2 substrings. Storing all these substrings, even if just their references or start/end indices, would require space proportional to the number of substrings. Therefore, the auxiliary space used is proportional to N squared, giving a space complexity of O(N^2).

Optimal Solution

Approach

The goal is to find the alphabetically largest chunk of text within the input text. The most efficient approach avoids comparing every possible chunk by cleverly tracking potential candidates and comparing only when necessary, focusing on the defining characteristics of alphabetical order.

Here's how the algorithm would work step-by-step:

  1. Start by assuming the last character in the text is the beginning of the largest chunk.
  2. Go through the rest of the text from the beginning.
  3. For each character, compare the chunk of text that starts at that character with the current largest chunk.
  4. If the new chunk is larger alphabetically, update the largest chunk to start at this new character.
  5. If the chunks are the same, keep the one that appears earlier in the text.
  6. If the new chunk is smaller alphabetically, continue to the next character.
  7. After checking all characters, the remaining 'largest chunk' is the alphabetically last substring.

Code Implementation

def last_substring(input_string):
    largest_substring_index = len(input_string) - 1

    for current_index in range(len(input_string) - 1):
        # Compare current substring with the largest substring.
        if input_string[current_index:] > input_string[largest_substring_index:]:
            largest_substring_index = current_index

        # Prioritize earlier occurence if substrings are equal
        elif input_string[current_index:] == input_string[largest_substring_index:]:
            continue

    # Returning the actual substring is required
    return input_string[largest_substring_index:]

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the input string of length n once. In each iteration, it compares the substring starting at the current index with the current largest substring. The comparison itself could take up to n steps in the worst case (when the initial parts of the substrings being compared are identical). However, crucially, the comparison doesn't restart from the beginning of the strings in each iteration. The key is that the algorithm intelligently updates the starting index of the lexicographically largest substring. This single pass through the string with comparisons that, in aggregate, don't exceed a linear amount of work, results in O(n) time complexity, not O(n^2). Although a single comparison COULD be O(n), the comparisons are structured so that the TOTAL work across the entire loop is O(n).
Space Complexity
O(1)The algorithm primarily uses a single variable (or a few variables) to store the index of the current largest substring. There are no auxiliary data structures, such as arrays, lists, or hash maps, used to store intermediate results that scale with the input size, N (the length of the input string). Therefore, the space required remains constant regardless of the input string's length. The space complexity is O(1), indicating constant auxiliary space.

Edge Cases

CaseHow to Handle
Empty string sReturn an empty string since there are no substrings.
String s with length 1Return the string itself as it is the only substring and hence the largest.
String s with all identical characters (e.g., 'aaaa')Return the entire string as it is the lexicographically largest substring.
Maximum length string (performance implications)Ensure the chosen algorithm (e.g., optimized two-pointer approach) has acceptable time complexity, preferably O(n) or close to it, to avoid timeouts.
String with characters near the end of the ASCII table (e.g., 'xyz')The algorithm should correctly handle characters with high ASCII values.
String with repeating patterns (e.g., 'ababab')The algorithm should correctly identify the last substring even with repeating patterns.
String with a single extremely large character at the end (e.g., 'abcxyz')The algorithm correctly identifies the 'xyz' as the greatest substring
String containing non-ASCII characters (extended ASCII or Unicode)Ensure the comparison logic correctly handles non-ASCII characters based on their Unicode code points.