Taro Logo

Sort Colors

Medium
Google logo
Google
1 view
Topics:
ArraysTwo Pointers

You are given an array nums containing n integers where each integer represents a color: 0 (red), 1 (white), or 2 (blue). Your task is to sort the array in-place such that all elements of the same color are adjacent, with the colors appearing in the order red, white, and blue. You are not allowed to use any library sorting functions.

For example:

  • nums = [2, 0, 2, 1, 1, 0] should be sorted to [0, 0, 1, 1, 2, 2].
  • nums = [2, 0, 1] should be sorted to [0, 1, 2].

Consider the following constraints:

  • n (the length of nums) is between 1 and 300 (inclusive).
  • Each element nums[i] is either 0, 1, or 2.

Can you implement an efficient algorithm to solve this sorting problem?

Follow-up question: Can you solve it using a one-pass algorithm with only constant extra space?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum size of the input array `nums`? Is it safe to assume it will fit in memory?
  2. Is the input array guaranteed to only contain the integers 0, 1, and 2, or could there be other integer values present?
  3. Should I handle the case where the input array `nums` is null or empty? If so, what is the expected behavior?
  4. Although the problem states to do it in-place, what is the expected return value? Is it void, or should I return the modified array?
  5. Is there a specific expected order for elements that are the same color (e.g. should all the 0's appear in the same order relative to each other as they were in the original array)?

Brute Force Solution

Approach

Imagine you have a bag of red, white, and blue marbles and want to sort them by color. The brute force way is to simply count how many of each color you have, without actually rearranging the marbles.

Here's how the algorithm would work step-by-step:

  1. First, go through the entire bag and count the number of red marbles.
  2. Next, do the same for the white marbles.
  3. Finally, count the blue marbles.
  4. Now, knowing how many of each color you have, you can rebuild the bag by first putting in all the red marbles, then all the white marbles, and lastly all the blue marbles.

Code Implementation

def sort_colors_brute_force(colors):
    red_count = 0
    white_count = 0
    blue_count = 0

    # Count the occurrences of each color
    for color in colors:
        if color == 0:
            red_count += 1
        elif color == 1:
            white_count += 1
        else:
            blue_count += 1

    # Overwrite the original array
    # Rebuild the array with sorted colors
    index = 0
    for _ in range(red_count):
        colors[index] = 0
        index += 1

    # Populate with white colors
    for _ in range(white_count):
        colors[index] = 1
        index += 1

    # Finally populate with blue
    for _ in range(blue_count):
        colors[index] = 2
        index += 1

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the input array nums three times to count the occurrences of red, white, and blue marbles. Each iteration visits every element in the array once, which contributes a linear time complexity O(n). After counting, the algorithm rewrites the array based on the counts, which is another linear operation O(n). Since the constant factor is dropped in Big O notation, 3 * O(n) + O(n) simplifies to O(n).
Space Complexity
O(1)The provided solution counts the number of red, white, and blue marbles. This requires storing three integer variables (red_count, white_count, blue_count). The space used by these counters remains constant irrespective of the number of marbles N. Therefore, the auxiliary space complexity is O(1).

Optimal Solution

Approach

The most efficient way to sort these colors is to think of them as needing to be moved into their correct sections. We use three 'pointers' to keep track of where each color section should start.

Here's how the algorithm would work step-by-step:

  1. Imagine having three sections: the beginning for the first color, the middle for the second color, and the end for the third color. Initially, the middle section covers the whole range.
  2. Go through the list of colors, one at a time.
  3. If the color is the first color, swap it with the color at the beginning of the middle section, and shift both the first color section and middle section boundaries forward.
  4. If the color is the second color, it's already where it belongs relative to the beginning and end, so just move on to the next color, effectively shrinking the middle section.
  5. If the color is the third color, swap it with the color at the end of the middle section, and shrink the middle section from the end. Keep checking the swapped color until its in the right place.
  6. Repeat until the middle section disappears – meaning all colors are now in their correct sections.

Code Implementation

def sort_colors(colors):
    start_of_middle_section = 0
    end_of_middle_section = len(colors) - 1
    current_index = 0

    while current_index <= end_of_middle_section:
        if colors[current_index] == 0:
            # Move 0 to the beginning
            colors[current_index], colors[start_of_middle_section] = colors[start_of_middle_section], colors[current_index]

            start_of_middle_section += 1
            current_index += 1

        elif colors[current_index] == 1:
            # 1 is in the correct section
            current_index += 1

        else:
            # Move 2 to the end
            colors[current_index], colors[end_of_middle_section] = colors[end_of_middle_section], colors[current_index]
            end_of_middle_section -= 1

            # Re-evaluate swapped element
            continue

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the array of n elements using a single pass. While there is a swap operation inside the while loop when encountering the third color, each element is swapped at most once. Therefore, even with the swaps, the number of operations is directly proportional to the number of elements in the input array, making the time complexity O(n).
Space Complexity
O(1)The provided algorithm uses a fixed number of integer variables (pointers) to track the boundaries of the color sections. The number of these pointers does not depend on the size of the input array, N. No additional data structures like arrays, lists, or hash maps are used. Therefore, the auxiliary space required is constant, independent of the input size N.

Edge Cases

CaseHow to Handle
Null or empty input arrayReturn immediately without modification since there's nothing to sort.
Array with one elementNo sorting is needed; the array is already sorted.
Array with only 0sAlgorithm should still execute correctly, leaving the array unchanged.
Array with only 1sAlgorithm should still execute correctly, leaving the array unchanged.
Array with only 2sAlgorithm should still execute correctly, leaving the array unchanged.
Array already sorted (0s, 1s, 2s)Algorithm should still execute correctly, without making unnecessary swaps.
Array in reverse sorted order (2s, 1s, 0s)Algorithm should handle this case efficiently by performing the necessary swaps.
Array containing a large number of elements with a skewed distribution (e.g., mostly 0s)The chosen algorithm (e.g., Dutch National Flag) should maintain good performance even with skewed distributions.