Remove Duplicates from Sorted Array II

Medium
Topics:
Arrays, Two Pointers

Given an integer array nums sorted in non-decreasing order, remove some duplicates in-place such that each unique element appears at most twice. The relative order of the elements should be kept the same.

Since it is impossible to change the length of the array in some languages, you must instead have the result be placed in the first part of the array nums. More formally, if there are k elements after removing the duplicates, then the first k elements of nums should hold the final result. It does not matter what you leave beyond the first k elements.

Return k after placing the final result in the first k slots of nums.

Do not allocate extra space for another array. You must do this by modifying the input array in-place with O(1) extra memory.

Custom Judge:

The judge will test your solution with the following code:

int[] nums = [...]; // Input array
int[] expectedNums = [...]; // The expected answer with correct length

int k = removeDuplicates(nums); // Calls your implementation

assert k == expectedNums.length;
for (int i = 0; i < k; i++) {
    assert nums[i] == expectedNums[i];
}

If all assertions pass, then your solution will be accepted.
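
For local testing, the judge's checks can be mirrored in Python. This is a minimal sketch, not the judge's actual code: `run_judge` is a hypothetical helper name, and it assumes the solution is a function that takes a list, edits it in place, and returns k.

```python
def run_judge(solution, nums, expected_nums):
    """Return True if `solution` passes the judge's two checks.

    Assumption: `solution` modifies `nums` in place and returns k.
    """
    k = solution(nums)

    # Check 1: the returned length matches the expected length.
    if k != len(expected_nums):
        return False

    # Check 2: the first k slots hold the expected values.
    # Anything beyond index k - 1 is ignored, as the problem states.
    return nums[:k] == expected_nums
```

Any function with that shape can be passed in, for example `run_judge(my_solution, [1, 1, 1, 2, 2, 3], [1, 1, 2, 2, 3])`, where `my_solution` stands for your own implementation.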

Example 1:

Input: nums = [1,1,1,2,2,3]
Output: 5, nums = [1,1,2,2,3,_]
Explanation: Your function should return k = 5, with the first five elements of nums being 1, 1, 2, 2 and 3 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Example 2:

Input: nums = [0,0,1,1,1,1,2,3,3]
Output: 7, nums = [0,0,1,1,2,3,3,_,_]
Explanation: Your function should return k = 7, with the first seven elements of nums being 0, 0, 1, 1, 2, 3 and 3 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Constraints:

  • 1 <= nums.length <= 3 * 10^4
  • -10^4 <= nums[i] <= 10^4
  • nums is sorted in non-decreasing order.

Solution


Clarifying Questions

In a real interview, this question is often posed ambiguously (especially at FAANG). If that happens, make sure to ask these questions:

  1. What is the range of integer values within the input array nums?
  2. What should I return if the input array is null or empty?
  3. Is the input array guaranteed to be sorted in non-decreasing order, or do I need to handle unsorted inputs?
  4. Can I modify the original input array directly, or do I need to create a copy?
  5. Besides modifying the array in-place, what is the expected return type (e.g., integer, array)? Specifically, if there are `k` elements after processing, should I return the integer `k`?

Brute Force Solution

Approach

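Imagine you have a list of numbers sorted from smallest to largest, and you want to remove duplicates but allow each number to appear at most twice. The brute force strategy builds a second list: it walks through the original list and copies a number over only if the new list holds fewer than two copies of it. Because it allocates a second list, it does not meet the problem's O(1) extra memory requirement; it serves only as a baseline.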

Here's how the algorithm would work step-by-step:

  1. Create a new, empty list to store the results.
  2. Start by considering each number from the original list one at a time.
  3. For each number, check how many times it already appears in the new list.
  4. If the number appears fewer than two times in the new list, add it.
  5. If the number already appears two times, skip it and move to the next number in the original list.
  6. Continue this process for every number in the original list until you've considered them all.
  7. The new list now contains the numbers from the original list, with no number appearing more than twice. Its length is the value k that the problem asks for.

Code Implementation

def remove_duplicates_brute_force(numbers):
    new_list = []

    for number in numbers:
        # Check frequency of number in the new list
        frequency = new_list.count(number)

        # Add the number if it appears less than twice
        if frequency < 2:
            new_list.append(number)

    return new_list
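
The function above returns a new list, whereas the problem expects the result in the first k slots of the input and k as the return value. One way to bridge that gap is sketched below. The wrapper name `remove_duplicates_via_copy` is hypothetical, and the approach still uses O(n) extra space, so it remains a baseline rather than an accepted solution.

```python
def remove_duplicates_brute_force(numbers):
    # Same logic as the brute-force function above, repeated so the sketch runs alone.
    new_list = []
    for number in numbers:
        if new_list.count(number) < 2:
            new_list.append(number)
    return new_list


def remove_duplicates_via_copy(numbers):
    """Hypothetical wrapper: copy the brute-force result back and return k."""
    kept = remove_duplicates_brute_force(numbers)

    # Slice assignment with an equal-length list overwrites the first k slots
    # without changing the length of `numbers`.
    numbers[:len(kept)] = kept
    return len(kept)


nums = [1, 1, 1, 2, 2, 3]
k = remove_duplicates_via_copy(nums)
print(k, nums[:k])  # prints: 5 [1, 1, 2, 2, 3]
```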

Big(O) Analysis

Time Complexity
O(n²): The algorithm iterates through each of the n elements in the original array. For each element, it checks its frequency in the new array. Checking the frequency in the new array, which can grow up to size n, involves another iteration. Thus, for each of the n elements in the original array, we potentially iterate through n elements in the new array to count occurrences. This results in approximately n * n operations, which simplifies to O(n²).
Space Complexity
O(n): The provided solution creates a new list to store the results. In the worst-case scenario, where the input array contains no duplicate values, the new list will store all n elements of the original list. Therefore, the auxiliary space used by this algorithm is proportional to the size of the input array, resulting in O(n) space complexity.

Optimal Solution

Approach

The goal is to modify the existing sequence so that no number appears more than twice, keeping the sequence sorted. The clever shortcut is to overwrite the older part of the sequence with the new, corrected one as you go, using only a small, fixed amount of extra memory.

Here's how the algorithm would work step-by-step:

  1. Start by looking at the very beginning of the sequence.
  2. Keep a 'marker' that points to where the next number we keep should be placed.
  3. Go through each number in the sequence, one by one.
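  4. For each number, check if it's a duplicate, meaning that keeping it would create a third copy. Because the kept portion is sorted, this happens exactly when the number equals the one two positions before our 'marker'. The first two numbers are always kept.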
  5. If the number is not a duplicate, put it at the position indicated by our 'marker' and then move the 'marker' forward one position.
  6. After going through all the numbers in the sequence, the portion before the 'marker' will contain the modified sequence with no more than two of the same number.
  7. The length of the new, modified sequence is simply the position of our 'marker'.

Code Implementation

def remove_duplicates(numbers):
    if not numbers:
        return 0

    # insert_position is the index where the next kept element is written.
    insert_position = 0

    for number in numbers:
        # Keep the number if fewer than two elements have been kept so far,
        # or if it differs from the element two slots behind insert_position
        # (equality would mean a third copy, since the kept prefix is sorted).
        if insert_position < 2 or number != numbers[insert_position - 2]:
            # Overwrite the array at insert_position with the current number.
            numbers[insert_position] = number

            # Advance only after a number has been kept.
            insert_position += 1

    return insert_position
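
To see the marker at work, the function can be run on the two examples from the problem statement. The function body is repeated only so the snippet runs on its own; the variable names `example_1` and `example_2` are illustrative.

```python
def remove_duplicates(numbers):
    if not numbers:
        return 0
    insert_position = 0
    for number in numbers:
        if insert_position < 2 or number != numbers[insert_position - 2]:
            numbers[insert_position] = number
            insert_position += 1
    return insert_position


example_1 = [1, 1, 1, 2, 2, 3]
k1 = remove_duplicates(example_1)
print(k1, example_1[:k1])  # prints: 5 [1, 1, 2, 2, 3]

example_2 = [0, 0, 1, 1, 1, 1, 2, 3, 3]
k2 = remove_duplicates(example_2)
print(k2, example_2[:k2])  # prints: 7 [0, 0, 1, 1, 2, 3, 3]

# Slots at index k and beyond still hold leftover values, which the judge ignores.
print(example_1)  # prints: [1, 1, 2, 2, 3, 3]
```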

Big(O) Analysis

Time Complexity
O(n): The algorithm iterates through the input array of size n exactly once. Inside the loop, it performs a constant-time comparison to determine if the current element is a duplicate based on the two elements preceding the 'marker'. The 'marker' increment is also a constant-time operation. Therefore, the overall time complexity is directly proportional to the number of elements in the array, resulting in O(n) time complexity.
Space Complexity
O(1): The algorithm uses a 'marker' (index) to track the position for the next non-duplicate number. No additional data structures like lists, hash maps, or recursion are employed. The space required remains constant irrespective of the input array's size (n). Thus, the auxiliary space complexity is O(1).

Edge Cases

Case | How to Handle
Empty input array | Return 0 immediately, as there are no elements to process.
Null input array | Throw an IllegalArgumentException or return 0 after validating the argument to prevent NullPointerException.
Array with one element | Return 1, as a single element always satisfies the condition.
Array with two elements that are different | Return 2, as two different elements also satisfy the condition.
Array with two elements that are the same | Return 2, as the same elements can appear at most twice.
Array with all elements identical | The solution should correctly condense the array to have only two occurrences of the repeated number and return 2.
Array with negative numbers, positive numbers, and zeros | The solution should correctly handle the negative numbers, positive numbers, and zeros while maintaining the sorted order and limiting duplicates.
Array with a mix of elements where some are repeated more than twice, and some appear once or twice | The algorithm must correctly filter those appearing more than twice, maintaining correct ordering and limiting the total count.
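
The cases above can be turned into a quick self-check. The sketch below repeats the optimal solution so it runs on its own. The concrete inputs in `edge_cases` are illustrative values chosen here, not taken from the original. The null case is left out of the list because its handling is a design choice; with the `if not numbers` guard, passing `None` simply returns 0.

```python
def remove_duplicates(numbers):
    if not numbers:
        return 0
    insert_position = 0
    for number in numbers:
        if insert_position < 2 or number != numbers[insert_position - 2]:
            numbers[insert_position] = number
            insert_position += 1
    return insert_position


# Each pair is (input, expected first k elements); inputs are illustrative.
edge_cases = [
    ([], []),                                            # empty array
    ([5], [5]),                                          # one element
    ([1, 2], [1, 2]),                                    # two different elements
    ([4, 4], [4, 4]),                                    # two equal elements
    ([7, 7, 7, 7], [7, 7]),                              # all elements identical
    ([-3, -3, -3, 0, 0, 1], [-3, -3, 0, 0, 1]),          # negatives, zeros, positives
    ([1, 1, 1, 2, 3, 3, 3, 3, 4], [1, 1, 2, 3, 3, 4]),   # mixed repeat counts
]

for nums, expected in edge_cases:
    k = remove_duplicates(nums)
    assert k == len(expected)
    assert nums[:k] == expected
```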