Remove Duplicates from Sorted Array

Easy

Asked by:

+13

More companies

142 views

Topics:

ArraysTwo Pointers

Given an integer array nums sorted in non-decreasing order, remove the duplicates in-place such that each unique element appears only once. The relative order of the elements should be kept the same. Then return the number of unique elements in nums.

Consider the number of unique elements of nums to be k, to get accepted, you need to do the following things:

Change the array nums such that the first k elements of nums contain the unique elements in the order they were present in nums initially. The remaining elements of nums are not important as well as the size of nums.
Return k.

Custom Judge:

The judge will test your solution with the following code:

int[] nums = [...]; // Input array
int[] expectedNums = [...]; // The expected answer with correct length

int k = removeDuplicates(nums); // Calls your implementation

assert k == expectedNums.length;
for (int i = 0; i < k; i++) {
    assert nums[i] == expectedNums[i];
}

If all assertions pass, then your solution will be accepted.

Example 1:

Input: nums = [1,1,2]
Output: 2, nums = [1,2,_]
Explanation: Your function should return k = 2, with the first two elements of nums being 1 and 2 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Example 2:

Input: nums = [0,0,1,1,1,2,2,3,3,4]
Output: 5, nums = [0,1,2,3,4,_,_,_,_,_]
Explanation: Your function should return k = 5, with the first five elements of nums being 0, 1, 2, 3, and 4 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Constraints:

1 <= nums.length <= 3 * 10⁴
-100 <= nums[i] <= 100
nums is sorted in non-decreasing order.

Try coding it in LeetCode

Remove Duplicates from Sorted Array

Easy

Asked by:

+7

More companies

142 views

Topics:

ArraysTwo Pointers

Given an integer array nums sorted in non-decreasing order, remove the duplicates in-place such that each unique element appears only once. The relative order of the elements should be kept the same. Then return the number of unique elements in nums.

Consider the number of unique elements of nums to be k, to get accepted, you need to do the following things:

Change the array nums such that the first k elements of nums contain the unique elements in the order they were present in nums initially. The remaining elements of nums are not important as well as the size of nums.
Return k.

Custom Judge:

The judge will test your solution with the following code:

int[] nums = [...]; // Input array
int[] expectedNums = [...]; // The expected answer with correct length

int k = removeDuplicates(nums); // Calls your implementation

assert k == expectedNums.length;
for (int i = 0; i < k; i++) {
    assert nums[i] == expectedNums[i];
}

If all assertions pass, then your solution will be accepted.

Example 1:

Input: nums = [1,1,2]
Output: 2, nums = [1,2,_]
Explanation: Your function should return k = 2, with the first two elements of nums being 1 and 2 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Example 2:

Input: nums = [0,0,1,1,1,2,2,3,3,4]
Output: 5, nums = [0,1,2,3,4,_,_,_,_,_]
Explanation: Your function should return k = 5, with the first five elements of nums being 0, 1, 2, 3, and 4 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Constraints:

1 <= nums.length <= 3 * 10⁴
-100 <= nums[i] <= 100
nums is sorted in non-decreasing order.

Try coding it in LeetCode

Solution

Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

What should be the result for an empty input array? Should I return 0?
The problem mentions modifying the array 'in-place'. What should the values be in the part of the array beyond the returned length of unique elements?
Does the sorted array contain only integers, or can there be other numeric types? Can the integers be negative?
Is the array guaranteed to be sorted, or should I handle cases where it might not be?
How large can the array `nums` be? Are there any memory constraints I should be aware of, given the in-place modification requirement?

Brute Force Solution

Approach

To get rid of the duplicate numbers in a sorted list, the brute force strategy is to create a brand new list. We'll look at each number from the original list one by one and decide if it's unique enough to be added to our new list.

Here's how the algorithm would work step-by-step:

Start by creating a new, empty list to hold our unique numbers.
Look at the very first number in the original list and add it to the new list, since there are no other numbers to compare it to yet.
Move to the next number in the original list.
Compare this number to the most recent number we just added to our new list.
If the current number is different from the last one we added, it's a new unique number, so add it to the end of our new list.
If the current number is the same as the last one we added, it's a duplicate, so we just ignore it and do nothing.
Continue this process, moving through every number in the original list and comparing it to the last one we placed in the new list, until you've checked them all.
The new list now contains all the original numbers but with no duplicates.

Code Implementation

def remove_duplicates_brute_force(sorted_numbers):
    # Handle the edge case of an empty list to prevent index errors.

    if not sorted_numbers:
        return []

    # Initialize the new list with the first element, as it's always unique initially.

    unique_numbers = [sorted_numbers[0]]

    # Iterate through the original list, starting from the second element.

    for current_index in range(1, len(sorted_numbers)):
        # Check if the current number is a new unique value compared to the last one added.

        if sorted_numbers[current_index] != unique_numbers[-1]:
            unique_numbers.append(sorted_numbers[current_index])

    return unique_numbers

Big(O) Analysis

Time Complexity

O(n) – The time complexity is determined by iterating through the original list once. Let n be the number of elements in the list. We perform a single pass from the first to the last element. During this pass, we do a constant amount of work for each element: a comparison and potentially an append operation to the new list. Because we visit each of the n elements exactly one time, the total number of operations grows linearly with the size of the input. This results in a time complexity of O(n).

Space Complexity

O(N) – The algorithm's space complexity is determined by the new list created to hold the unique numbers. In the worst-case scenario, where the input array contains no duplicates, this new list will grow to the same size as the original input array. Therefore, the auxiliary space required is directly proportional to the number of elements, N, in the input list.

Optimal Solution

Approach

The key insight is to take advantage of the sorted nature of the numbers. Since all identical numbers are grouped together, we can process the collection in a single pass, keeping a separate section at the beginning for the unique numbers we've found so far.

Here's how the algorithm would work step-by-step:

Imagine you have two pointers, or fingers, both starting at the beginning of the collection of numbers.
One finger, let's call it the 'placement' finger, marks the end of the section containing only unique numbers found so far. This section starts with just the first number.
The second finger, the 'seeker', will travel through the entire collection from start to finish to find the next unique number.
As the seeker moves along, it compares the number it's currently on with the last unique number identified by the placement finger.
If the seeker finds a number that is different from the last unique one, it means we've discovered a new unique number.
This new unique number is then placed right after the last one in our special unique section, and the placement finger moves forward to include it.
The seeker continues its journey, ignoring all subsequent identical numbers until it finds the next new one.
By the time the seeker has looked at every number in the original collection, the special section at the beginning will contain all the unique numbers in order, and the count of those numbers is our answer.

Code Implementation

def remove_duplicates(sorted_numbers):
    if not sorted_numbers:
        return 0

    # The placement finger starts after the first element, which is always unique.

    placement_index = 1

    # The seeker finger iterates through the list to find the next unique number.

    for seeker_index in range(1, len(sorted_numbers)):
        # A new unique number is found if it's different from the last one recorded.

        if sorted_numbers[seeker_index] != sorted_numbers[placement_index - 1]:
            # This new unique number is placed at the next available unique slot.

            sorted_numbers[placement_index] = sorted_numbers[seeker_index]
            placement_index += 1

    return placement_index

Big(O) Analysis

Time Complexity

O(n) – The time complexity is determined by the 'seeker' pointer, which must traverse the entire collection of numbers once to identify all unique elements. Let n be the number of elements in the input array. Since the seeker examines each of the n elements exactly one time, the total number of operations is directly proportional to n. The 'placement' pointer only moves when a unique element is found but does not add a separate loop or nested operations. Therefore, the complexity simplifies to a linear O(n).

Space Complexity

O(1) – The algorithm operates in-place on the input collection and does not create any new data structures. The only extra memory used is for the two pointers, the 'placement' finger and the 'seeker' finger. The amount of memory required for these pointers is constant and does not scale with the size of the input collection, N. Therefore, the auxiliary space complexity is constant.

Edge Cases

Case	How to Handle
Input array is empty	The solution should handle this gracefully by returning 0, as there are no elements to process.
Input array contains only one element	The algorithm correctly returns 1, as a single-element array has no duplicates.
All elements in the array are identical	The two-pointer approach correctly processes the entire array and returns 1, leaving just the first element.
No duplicate elements exist in the array	The solution will correctly iterate through the array, find no duplicates, and return the original array length.
Array contains negative numbers and zeros	The comparison logic works identically for negative numbers, zeros, and positive numbers, so no special handling is needed.
Duplicates appear at the beginning or end of the array	The two-pointer method is robust enough to handle duplicate clusters regardless of their position in the sorted array.
Very large input array near memory or time limits	The in-place O(N) time and O(1) space solution is highly efficient and scales well for large inputs without issues.
Array contains integer boundary values like INT_MIN or INT_MAX	Standard integer comparisons work correctly with boundary values, so the algorithm handles them without modification.

Input array is empty

How to Handle:

The solution should handle this gracefully by returning 0, as there are no elements to process.

Input array contains only one element

How to Handle:

The algorithm correctly returns 1, as a single-element array has no duplicates.

All elements in the array are identical

How to Handle:

The two-pointer approach correctly processes the entire array and returns 1, leaving just the first element.

No duplicate elements exist in the array

How to Handle:

The solution will correctly iterate through the array, find no duplicates, and return the original array length.

Array contains negative numbers and zeros

How to Handle:

The comparison logic works identically for negative numbers, zeros, and positive numbers, so no special handling is needed.

Duplicates appear at the beginning or end of the array

How to Handle:

The two-pointer method is robust enough to handle duplicate clusters regardless of their position in the sorted array.

Very large input array near memory or time limits

How to Handle:

The in-place O(N) time and O(1) space solution is highly efficient and scales well for large inputs without issues.

Array contains integer boundary values like INT_MIN or INT_MAX

How to Handle:

Standard integer comparisons work correctly with boundary values, so the algorithm handles them without modification.