Partition Labels

Medium
7 days ago

You are given a string s. We want to partition the string into as many parts as possible so that each letter appears in at most one part. For example, the string "ababcc" can be partitioned into ["abab", "cc"], but partitions such as ["aba", "bcc"] or ["ab", "ab", "cc"] are invalid.

Note that the partition is done so that after concatenating all the parts in order, the resultant string should be s.

Return a list of integers representing the size of these parts.

Example 1:

Input: s = "ababcbacadefegdehijhklij"
Output: [9,7,8]
Explanation:
The partition is "ababcbaca", "defegde", "hijhklij".
This is a partition so that each letter appears in at most one part.
A partition like "ababcbacadefegde", "hijhklij" is incorrect, because it splits s into less parts.

Example 2:

Input: s = "eccbbbbdec"
Output: [10]

Constraints:

  • 1 <= s.length <= 500
  • s consists of lowercase English letters.
Sample Answer
def partitionLabels(s: str) -> list[int]:
    last_occurrence = {}
    for i, char in enumerate(s):
        last_occurrence[char] = i

    result = []
    start = 0
    end = 0
    for i, char in enumerate(s):
        end = max(end, last_occurrence[char])
        if i == end:
            result.append(end - start + 1)
            start = i + 1

    return result

Explanation:

The problem requires partitioning a string s into as many parts as possible such that each character appears in at most one part. The goal is to return a list of integers representing the size of these parts.

The provided Python code implements a greedy algorithm to solve this problem efficiently. Here's a breakdown:

  1. last_occurrence Dictionary:

    • The code first creates a dictionary last_occurrence to store the last index of each character in the string s. This is done to quickly look up the farthest occurrence of each character.
  2. Greedy Partitioning:

    • The algorithm then iterates through the string s again, maintaining two pointers: start and end. start represents the beginning of the current partition, and end represents the farthest reach of the current partition based on the characters seen so far.
    • For each character, the end pointer is updated to the maximum of its current value and the last occurrence of the current character.
    • If the current index i is equal to end, it means we have reached the end of the current partition. We append the length of the partition (end - start + 1) to the result list and update start to the beginning of the next partition.

Example:

For the input string s = "ababcbacadefegdehijhklij":

  1. The last_occurrence dictionary will be:

    {
        'a': 8, 'b': 5, 'c': 7, 'd': 14, 'e': 15,
        'f': 11, 'g': 13, 'h': 16, 'i': 18, 'j': 19,
        'k': 17, 'l': 20
    }
    
  2. The algorithm proceeds as follows:

    • start = 0, end = 0
    • i = 0, char = 'a', end = max(0, 8) = 8
    • i = 1, char = 'b', end = max(8, 5) = 8
    • ... (continues until i = 8)
    • i = 8, char = 'a', end = max(8, 8) = 8. Since i == end, result.append(8 - 0 + 1) = result.append(9), start = 9
    • i = 9, char = 'd', end = max(9, 14) = 14
    • ... (continues until i = 14)
    • i = 14, char = 'd', end = max(14, 14) = 14. Since i == end, result.append(14 - 9 + 1) = result.append(6), start = 15
    • i = 15, char = 'e', end = max(15, 15) = 15. Since i == end, result.append(15 - 15 + 1) = result.append(1), start = 16
    • ... (continues until i = 22)
    • i = 22, char = 'j', end = max(20, 20) = 20. Since i == end, result.append(22 - 16 + 1) = result.append(8), start = 23
  3. The final result will be [9, 7, 8]

Brute Force Solution (Less Efficient):

A brute-force solution would involve trying all possible partitions and checking if each partition is valid (i.e., no character appears in more than one part). This would be highly inefficient, especially for larger strings, due to the exponential number of possible partitions.

Time Complexity: O(n)

  • The code iterates through the string s twice: once to build the last_occurrence dictionary and once to find the partitions.
  • Both iterations take O(n) time, where n is the length of the string s.
  • The max operation inside the loop takes O(1) time.
  • Therefore, the overall time complexity is O(n) + O(n) = O(n).

Space Complexity: O(1)

  • The last_occurrence dictionary stores at most 26 characters (lowercase English letters). Thus, the space used by this dictionary is O(1) since it's constant regardless of the input string size.
  • The result list stores the lengths of the partitions. In the worst case, each character could be in its own partition, resulting in a list of size n. However, since there are only 26 distinct lowercase characters, the maximum number of partitions cannot exceed 26, making it also O(1).
  • The auxiliary variables start and end take O(1) space.
  • Therefore, the overall space complexity is O(1).