Taro Logo

First Missing Positive

Hard
Meta logo
Meta
1 view
Topics:
Arrays

Given an unsorted integer array nums. Return the smallest positive integer that is not present in nums.

You must implement an algorithm that runs in O(n) time and uses O(1) auxiliary space.

For example:

  1. nums = [1,2,0] The numbers in the range [1,2] are all in the array. Expected output is 3.

  2. nums = [3,4,-1,1] 1 is in the array but 2 is missing. Expected output is 2.

  3. nums = [7,8,9,11,12] The smallest positive integer 1 is missing. Expected output is 1.

Can you provide an implementation and explain the complexities?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the input array `nums` contain negative numbers, zeros, or duplicates?
  2. What is the expected range of values in the input array, and what is the maximum size of the array?
  3. If the array is empty or contains no positive integers, what value should I return?
  4. Is the missing positive integer guaranteed to be within a reasonable range (e.g., no need to handle extremely large values)?
  5. Could you provide a few example inputs and their corresponding expected outputs to ensure my understanding of the problem?

Brute Force Solution

Approach

Let's say we're looking for the smallest missing positive number in a group. The brute force way means we check every possibility, starting from the smallest positive number.

Here's how the algorithm would work step-by-step:

  1. Begin by checking if the number 1 is present in the group.
  2. If 1 is missing, then 1 is our answer.
  3. If 1 is present, move on and check if the number 2 is present.
  4. If 2 is missing, then 2 is our answer.
  5. Continue this process, checking 3, then 4, then 5, and so on.
  6. Stop when you find the first positive number that is not in the group. That number is the smallest missing positive.

Code Implementation

def first_missing_positive_brute_force(numbers):
    smallest_positive = 1

    while True:
        # Iterate until a missing positive is found.

        if smallest_positive not in numbers:
            # If smallest_positive is not in the list, return it.

            return smallest_positive

        # Increment to check the next positive integer.
        smallest_positive += 1

Big(O) Analysis

Time Complexity
O(n²)The algorithm iterates, potentially up to n+1 times, to find the smallest missing positive. In each iteration, it checks if a particular positive integer 'i' exists in the input array of size n. Checking for the presence of 'i' involves iterating through the entire array, which takes O(n) time. Since we repeat this search up to n+1 times, the overall time complexity becomes (n+1) * O(n). This simplifies to O(n²), as we drop the constant term and coefficient.
Space Complexity
O(1)The provided algorithm checks for the existence of positive integers 1, 2, 3, and so on, directly within the input group. No auxiliary data structures like sets, maps, or lists are created to store intermediate results or track visited numbers. Only a constant amount of space is required to store the current number being checked (e.g., the '1', '2', '3', etc.) as well as potentially loop counters, irrespective of the input size N (the number of elements in the group). Therefore, the auxiliary space complexity is constant.

Optimal Solution

Approach

The key idea is to rearrange the numbers within the original list such that the number at each position matches the position itself (plus one). This way, the first missing positive integer is simply the first position where the number doesn't match. After rearranging, we do a quick sweep to find the missing number.

Here's how the algorithm would work step-by-step:

  1. First, ignore any numbers that are negative or greater than the size of the list, because they can't be the first missing positive.
  2. Now, go through the list. For each number you see, try to put it in its correct spot. For example, the number '3' should go in the third position.
  3. If the spot is already taken by the correct number or a number outside the valid range, then you can move on to the next number in the list.
  4. After you've tried to put every number in its place, go through the list one more time. The first position where the number isn't what it should be (i.e., the number isn't equal to its position plus one) is where you've found the first missing positive number.
  5. If you get to the end of the list and all numbers are in the right places, the first missing positive number is simply the size of the list plus one.

Code Implementation

def first_missing_positive(numbers):
    list_size = len(numbers)

    for i in range(list_size):
        while 0 < numbers[i] <= list_size and numbers[i] != numbers[numbers[i] - 1]:
            # Correct position calculation using index and value
            correct_position = numbers[i] - 1
            numbers[i], numbers[correct_position] = numbers[correct_position], numbers[i]

    for i in range(list_size):
        # Find the first index where the number is not in its correct place
        if numbers[i] != i + 1:
            return i + 1

    # If all numbers are in place, the missing positive is the next integer.
    return list_size + 1

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the input array of size n. Inside the loop, a while loop is used to place elements in their correct positions via swaps. Crucially, each number is swapped into its correct position at most once, because subsequent swaps in the while loop involve placing numbers in positions where they belong. Since each element is moved at most once, the total number of swaps and therefore while loop iterations across all iterations of the outer loop is at most O(n). Finally, a single loop is performed through the array again, in O(n) time. Thus the overall time complexity is O(n).
Space Complexity
O(1)The algorithm operates directly on the input array, modifying it in place. No additional data structures like temporary lists or hash maps are used. The algorithm only utilizes a few integer variables for indexing and swapping elements, the memory required for which remains constant irrespective of the input array's size (N). Therefore, the auxiliary space complexity is O(1).

Edge Cases

CaseHow to Handle
Empty input arrayReturn 1, as it's the smallest missing positive when there are no positive integers in the input.
Input array contains only negative numbers and/or zerosReturn 1, as no positive numbers are present and 1 is the smallest missing positive.
Input array contains only the number 1Return 2, as 1 is present and 2 is the next smallest positive.
Input array contains the sequence 1, 2, 3, ..., nReturn n+1, as all numbers from 1 to n are present, so n+1 is missing.
Input array contains Integer.MAX_VALUEThe algorithm should handle large positive integers without overflow, ensuring correct processing.
Input array with duplicates, including multiple occurrences of 1Duplicates are handled correctly as we only need to check for the presence of numbers in the range [1, n], not their counts.
Input array with both positive and negative numbers interspersedThe algorithm should correctly ignore negative numbers and focus on identifying the smallest missing positive within the range [1, n].
Input array containing large gaps between positive numbers (e.g., [1, 100000])The algorithm should work correctly even with large gaps, focusing on the presence of numbers in the range [1, n], where n is the array length.