Taro Logo

Partition Labels

Medium
Asked by:
Profile picture
Profile picture
Profile picture
Profile picture
+2
More companies
Profile picture
Profile picture
17 views
Topics:
ArraysStringsGreedy Algorithms

You are given a string s. We want to partition the string into as many parts as possible so that each letter appears in at most one part. For example, the string "ababcc" can be partitioned into ["abab", "cc"], but partitions such as ["aba", "bcc"] or ["ab", "ab", "cc"] are invalid.

Note that the partition is done so that after concatenating all the parts in order, the resultant string should be s.

Return a list of integers representing the size of these parts.

Example 1:

Input: s = "ababcbacadefegdehijhklij"
Output: [9,7,8]
Explanation:
The partition is "ababcbaca", "defegde", "hijhklij".
This is a partition so that each letter appears in at most one part.
A partition like "ababcbacadefegde", "hijhklij" is incorrect, because it splits s into less parts.

Example 2:

Input: s = "eccbbbbdec"
Output: [10]

Constraints:

  • 1 <= s.length <= 500
  • s consists of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the expected range of characters in the input string? Is it limited to lowercase English letters, or could it include uppercase letters, numbers, or other characters?
  2. Can the input string be empty or null? If so, what should the expected output be in those cases?
  3. If a character appears multiple times in the string, should the partition include *all* occurrences of that character, even if they are not contiguous?
  4. Is there a limit on the number of partitions I should aim to create, or any preference between creating more or fewer partitions if multiple valid solutions exist?
  5. Could you provide a small example input and its corresponding expected output to ensure I fully understand the problem requirements?

Brute Force Solution

Approach

The brute force strategy for partitioning labels involves exploring every single possible way to divide the given string. We will check each possible partition to see if it meets the problem's requirement that each letter appears in only one partition. Ultimately, we want to find all valid partitions and select the shortest ones.

Here's how the algorithm would work step-by-step:

  1. Start by considering a partition with only the first character.
  2. Check if any other occurrences of that first character exist later in the string.
  3. If there are, extend the partition to include those later occurrences and all characters in between.
  4. Now, consider the next character after this initial partition as the start of a new potential partition.
  5. Repeat the process of checking for later occurrences of this new character and extending the partition if needed.
  6. Keep doing this until you have gone through the entire string, finding every possible valid way to divide it into partitions where each character appears in only one partition.
  7. From all the valid ways, the answer is the sizes of each partition in the valid partitioning.

Code Implementation

def partition_labels_brute_force(input_string):
    partition_sizes = []
    start_index = 0

    while start_index < len(input_string):
        end_index = start_index

        # Expand the partition to include all occurrences of the character
        last_occurrence = input_string.rfind(input_string[start_index])
        end_index = max(end_index, last_occurrence)

        current_index = start_index + 1
        while current_index < end_index:
            # Find if any character in the current partition extends it
            last_occurrence = input_string.rfind(input_string[current_index])
            end_index = max(end_index, last_occurrence)
            current_index += 1

            #This is the end of loop for extending partition

        #Record partition size after expansion
        partition_sizes.append(end_index - start_index + 1)
        start_index = end_index + 1

    return partition_sizes

Big(O) Analysis

Time Complexity
O(n³)The brute force approach iterates through the string of length n, potentially starting a new partition at each index. For each such potential partition, it searches the rest of the string to extend the partition to include all occurrences of characters within that partition. This nested searching contributes a factor of n * n. Finally, checking the validity of a partition also involves iterating through the characters within that partition which also takes n time. Therefore the time complexity is O(n*n*n) which simplifies to O(n³).
Space Complexity
O(1)The described brute force approach implicitly explores potential partitions without persistently storing intermediate results for all possibilities. While it mentions checking for later occurrences and extending partitions, this appears to be done using a few index variables to track the current partition's boundaries, not by allocating memory to hold multiple partition configurations. The space used depends on the number of variables needed to hold character positions, which is a fixed number independent of the input string's length (N). Therefore, the space complexity is O(1).

Optimal Solution

Approach

The key to this puzzle is figuring out how far each letter stretches in the string. We efficiently group letters together by making sure all occurrences of each letter are within the same group, creating the largest possible groups from left to right.

Here's how the algorithm would work step-by-step:

  1. First, for each unique letter, find the very last spot it appears in the whole string.
  2. Start from the beginning of the string, and build a group of letters, one by one.
  3. As you add a letter to the group, check its 'last spot'. If that 'last spot' is further than the current end of your group, extend the group to include that spot.
  4. Keep extending the group's end until you've processed all the letters within that initial group.
  5. Once you reach the end of your group (meaning the 'last spot' of every letter in the group is within the group itself), you've found a complete segment.
  6. Record the size of this segment and then start building the next segment from where you left off.
  7. Repeat until you have reached the end of the whole string. Each segment represents one label.

Code Implementation

def partition_labels(input_string):
    last_occurrence = {}
    for index, char in enumerate(input_string):
        last_occurrence[char] = index

    partition_sizes = []
    start_index = 0
    end_index = 0

    for index, char in enumerate(input_string):
        # Extend the end of the current partition
        end_index = max(end_index, last_occurrence[char])

        if index == end_index:
            # Found a complete partition
            partition_size = end_index - start_index + 1
            partition_sizes.append(partition_size)
            start_index = index + 1

    return partition_sizes

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the string once to determine the last occurrence of each character, which takes O(n) time. Then, it iterates through the string again to partition the labels. Within this second loop, it checks the last occurrences of characters which effectively involves constant-time lookups using the precomputed last occurrence map. Therefore, the dominant cost is the single pass through the string. Thus, the overall time complexity is O(n).
Space Complexity
O(1)The algorithm utilizes a hash map (or array) to store the last occurrence of each character in the input string. Since there are only 26 possible English alphabet characters, the hash map will have at most 26 entries, taking constant space. The remaining variables used (start, end, and result list) do not scale with the input string size N; rather, the result list size depends on the number of partitions, which in the worst case could be N, but the auxiliary space remains constant due to the fixed size of the character map (or array). Thus, the auxiliary space is O(1).

Edge Cases

CaseHow to Handle
Null or empty input stringReturn an empty list immediately, as there are no partitions to create.
String with only one characterReturn a list containing only the length of the string (which is 1) because the single character is its own partition.
String where all characters are identicalReturn a list containing only the length of the entire string, since the whole string is the only possible partition.
String with maximum possible length (e.g., close to memory limits)Ensure the chosen data structures (e.g., hash map for last occurrence) can accommodate the size, and the algorithm's time complexity remains efficient (ideally O(n)).
String where one character appears only at the beginningThe partition will extend to include the last occurrence of every character within the initial single character's range, correctly encompassing all necessary characters.
String where one character appears only at the endThe partition will expand backward to include all characters relevant to the partition which will work correctly, given how partitions are found.
Overlapping partition ranges requiring mergingThe greedy approach will correctly merge the overlapping ranges by always extending current partition to the furthest right position of any character within the current partition.
String with a very large number of unique charactersThe hash map may consume a considerable amount of memory, but the algorithm should still function correctly if sufficient memory is available.