Remove Duplicates from Sorted Array

Easy
Bloomberg LP
Topics:
Arrays, Two Pointers

Given an integer array nums sorted in non-decreasing order, remove the duplicates in-place such that each unique element appears only once. The relative order of the elements should be kept the same. Then return the number of unique elements in nums.

Let k be the number of unique elements in nums. To get accepted, you need to do the following things:

  • Change the array nums such that the first k elements of nums contain the unique elements in the order they were present in nums initially. The remaining elements of nums, as well as the size of nums, are not important.
  • Return k.

Custom Judge:

The judge will test your solution with the following code:

int[] nums = [...]; // Input array
int[] expectedNums = [...]; // The expected answer with correct length

int k = removeDuplicates(nums); // Calls your implementation

assert k == expectedNums.length;
for (int i = 0; i < k; i++) {
    assert nums[i] == expectedNums[i];
}

If all assertions pass, then your solution will be accepted.
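
The judge above is written in Java-style pseudocode. A rough Python equivalent might look like the sketch below. Both `run_judge` and `stand_in_solution` are hypothetical names introduced here for illustration, and the stand-in deliberately uses extra memory; it exists only so the harness has something to call and is not the intended O(1)-space answer.

```python
def run_judge(solution, nums, expected_nums):
    """Mirror the judge: check k, then the first k elements of nums."""
    k = solution(nums)  # calls the implementation under test
    assert k == len(expected_nums), f"expected k={len(expected_nums)}, got {k}"
    for i in range(k):
        assert nums[i] == expected_nums[i], f"mismatch at index {i}"
    return True


def stand_in_solution(nums):
    """Hypothetical placeholder, NOT the in-place O(1)-space solution."""
    unique = sorted(set(nums))     # input is sorted, so original order is kept
    nums[:len(unique)] = unique    # write the unique values into the prefix
    return len(unique)


print(run_judge(stand_in_solution, [1, 1, 2], [1, 2]))
```

Anything left in `nums` beyond index k - 1 is ignored, which matches the statement that the remaining elements are not important.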

Example 1:

Input: nums = [1,1,2]
Output: 2, nums = [1,2,_]
Explanation: Your function should return k = 2, with the first two elements of nums being 1 and 2 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Example 2:

Input: nums = [0,0,1,1,1,2,2,3,3,4]
Output: 5, nums = [0,1,2,3,4,_,_,_,_,_]
Explanation: Your function should return k = 5, with the first five elements of nums being 0, 1, 2, 3, and 4 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Constraints:

  • 1 <= nums.length <= 3 * 10^4
  • -100 <= nums[i] <= 100
  • nums is sorted in non-decreasing order.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of integer values within the `nums` array?
  2. Can the input array `nums` be empty or null?
  3. Is the input array guaranteed to be sorted in ascending order?
  4. Does the problem require that the original array `nums` be modified in-place, or can I use auxiliary space?
  5. If all elements in `nums` are duplicates, what value should I return?

Brute Force Solution

Approach

The simplest way to remove duplicates from a sorted list is to check each value against the unique values collected so far. We keep track of those unique values and build a new list containing only them.

Here's how the algorithm would work step-by-step:

  1. Take the first value from the original list.
  2. Check if this value is already in the list of unique values. Initially, this list is empty.
  3. If the value is not in the unique values list, add it.
  4. Now, take the next value from the original list.
  5. Again, check if this value is already in the unique values list.
  6. If it's not there, add it.
  7. Repeat this process for every value in the original list, comparing each one to the unique values list and adding it only if it's new.
  8. At the end, the unique values list will contain all the unique values from the original list, with duplicates removed.

Code Implementation

def remove_duplicates_brute_force(original_list):
    unique_values_list = []

    for current_value in original_list:
        # Check if the current value is already present
        # in the list of unique values.
        if current_value not in unique_values_list:

            # Only add the current value if it's not
            # already in the unique values list.
            unique_values_list.append(current_value)

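    # Copy the unique values back into the front of the original list
    # so its first k positions hold them, as the judge expects.
    for index, unique_value in enumerate(unique_values_list):
        original_list[index] = unique_value

    # Return k, the number of unique values.
    return len(unique_values_list)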

Big(O) Analysis

Time Complexity
O(n²)
The algorithm iterates through the input array of size n. For each element, it checks if that element is already present in the list of unique values. Checking for the presence of an element in the unique values list, which can grow up to size n, requires iterating through that list. Therefore, for each of the n elements in the input, we potentially perform a search operation of up to n steps. This results in approximately n * n operations, which simplifies to O(n²).
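
To make the quadratic bound concrete, the sketch below re-implements the membership check with an explicit inner loop and counts element comparisons. The name `count_comparisons` is hypothetical and exists only for this illustration. For an input of n distinct values, element i is compared against the i values already kept, giving 0 + 1 + ... + (n - 1) = n(n - 1) / 2 comparisons.

```python
def count_comparisons(values):
    """Run the brute-force scan and return (unique_values, comparisons)."""
    unique_values = []
    comparisons = 0
    for current in values:
        found = False
        for kept in unique_values:   # linear search, like `in` on a list
            comparisons += 1
            if kept == current:
                found = True
                break
        if not found:
            unique_values.append(current)
    return unique_values, comparisons


n = 100
_, comparisons = count_comparisons(list(range(n)))
print(comparisons)  # 4950, which equals n * (n - 1) // 2
```

Under this problem's constraints (-100 <= nums[i] <= 100) the unique list can never hold more than 201 entries, so the quadratic worst case describes the general version of the problem rather than these specific inputs.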
Space Complexity
O(n)
The provided algorithm constructs a new list, called the 'unique values list', to store the unique elements encountered in the original list. In the worst-case scenario, where all elements in the original list are unique, this new list will contain all n elements from the original list. Thus, the auxiliary space used by the algorithm scales linearly with the input size n. Therefore, the space complexity is O(n).

Optimal Solution

Approach

When you have items in order and want to remove the repeating ones, you can do it efficiently by keeping track of the last unique item you found. This helps you overwrite duplicates with new unique items as you go, using the existing space instead of creating more.

Here's how the algorithm would work step-by-step:

  1. Start by assuming the very first item is unique and worth keeping.
  2. Move through the rest of the items one by one.
  3. For each item, check if it's different from the last unique item you kept.
  4. If it's different, then this is a new unique item. Place it right after the last unique item you kept, essentially overwriting a duplicate.
  5. Keep repeating this until you have checked all the items.
  6. The number of unique items you kept is the value to return (k); the first k positions of the list now hold those unique items.

Code Implementation

def remove_duplicates_from_sorted_array(nums):
    if not nums:
        return 0

    # Initialize the index of the last unique element.
    last_unique_index = 0

    for i in range(1, len(nums)):
        # Check if the current element is different
        # from the last unique element.
        if nums[i] != nums[last_unique_index]:

            # If it's different, move it to the next
            # position after the last unique element.
            last_unique_index += 1
            nums[last_unique_index] = nums[i]

    # The length of the array with duplicates removed.
    return last_unique_index + 1
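
To see the two indices move, here is a small tracing sketch of the same two-pointer idea. `trace_two_pointers` is a hypothetical helper written for this walkthrough; it records the read index, the last unique index, and a snapshot of the array after each step.

```python
def trace_two_pointers(nums):
    """Apply the two-pointer pass in place and return (k, steps)."""
    steps = []
    if not nums:
        return 0, steps
    last_unique_index = 0
    for i in range(1, len(nums)):
        if nums[i] != nums[last_unique_index]:
            # New value found: write it right after the last kept value.
            last_unique_index += 1
            nums[last_unique_index] = nums[i]
        steps.append((i, last_unique_index, list(nums)))
    return last_unique_index + 1, steps


k, steps = trace_two_pointers([1, 1, 2])
for i, last, snapshot in steps:
    print(f"i={i} last_unique_index={last} nums={snapshot}")
# i=1 last_unique_index=0 nums=[1, 1, 2]
# i=2 last_unique_index=1 nums=[1, 2, 2]
```

In the final snapshot the trailing 2 is leftover data beyond k = 2, which is the part the problem statement shows as an underscore.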

Big(O) Analysis

Time Complexity
O(n)
The algorithm iterates through the input array of size n exactly once. For each element, it performs a constant-time comparison to determine if it is a duplicate of the last unique element encountered. The operations inside the loop are all constant time, such as the comparison and potential assignment. Therefore, the time complexity is directly proportional to the number of elements in the array, resulting in O(n) complexity.
Space Complexity
O(1)
The provided solution operates in place, modifying the original array directly. It only uses a few integer variables to keep track of the index of the last unique element found and the current element being examined. These variables consume a constant amount of extra memory, irrespective of the input array's size, n. Therefore, the auxiliary space complexity is O(1).

Edge Cases

Case: How to Handle
  • Empty input array: Return 0, indicating no unique elements.
  • Input array with only one element: Return 1, as the single element is unique.
  • Input array with all elements being the same: The algorithm should still correctly identify only one unique element and return 1.
  • Input array with a mix of duplicates and unique elements: The algorithm should correctly identify and retain the unique elements in their original order.
  • Input array containing negative numbers, positive numbers, and zero: The algorithm should work correctly regardless of the sign or value of the numbers.
  • Large input array with many duplicates to test scalability: The solution should aim for O(n) time complexity to handle large arrays efficiently.
  • Input array with extreme boundary values (e.g., Integer.MAX_VALUE, Integer.MIN_VALUE): The two-pointer solution only compares and copies values and performs no arithmetic on them, so boundary values cannot cause integer overflow.
  • Input array that is already completely unique: The algorithm should correctly identify that all elements are unique and return the original length.
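
The cases above can be turned into quick checks. The sketch below restates the two-pointer approach under the hypothetical name `dedupe_sorted` so the block runs on its own, then asserts the expected k and prefix for each case. The sample inputs are illustrative and do not come from the original problem.

```python
def dedupe_sorted(nums):
    """Two-pointer pass; returns k and leaves the unique values in nums[:k]."""
    if not nums:
        return 0
    last_unique_index = 0
    for i in range(1, len(nums)):
        if nums[i] != nums[last_unique_index]:
            last_unique_index += 1
            nums[last_unique_index] = nums[i]
    return last_unique_index + 1


# (input, expected unique prefix), one entry per edge case listed above
EDGE_CASES = [
    ([], []),                                            # empty input
    ([5], [5]),                                          # single element
    ([7, 7, 7, 7], [7]),                                 # all the same
    ([1, 1, 2, 3, 3], [1, 2, 3]),                        # mixed
    ([-3, -3, 0, 0, 4], [-3, 0, 4]),                     # negative, zero, positive
    ([0] * 10_000 + [1] * 10_000, [0, 1]),               # large, many duplicates
    ([-2**31, -2**31, 2**31 - 1], [-2**31, 2**31 - 1]),  # 32-bit boundary values
    ([1, 2, 3, 4], [1, 2, 3, 4]),                        # already unique
]

for nums, expected in EDGE_CASES:
    k = dedupe_sorted(nums)
    assert k == len(expected)
    assert nums[:k] == expected
```

The empty-array case falls outside the stated constraint 1 <= nums.length, but the guard costs nothing and covers the "empty or null" clarifying question.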