Find Missing and Repeated Values

Easy

Asked by:

62 views

Topics:

ArraysBit Manipulation

You are given a 0-indexed 2D integer matrix grid of size n * n with values in the range [1, n²]. Each integer appears exactly once except a which appears twice and b which is missing. The task is to find the repeating and missing numbers a and b.

Return a 0-indexed integer array ans of size 2 where ans[0] equals to a and ans[1] equals to b.

Example 1:

Input: grid = [[1,3],[2,2]]
Output: [2,4]
Explanation: Number 2 is repeated and number 4 is missing so the answer is [2,4].

Example 2:

Input: grid = [[9,1,7],[8,9,2],[3,4,6]]
Output: [9,5]
Explanation: Number 9 is repeated and number 5 is missing so the answer is [9,5].

Constraints:

2 <= n == grid.length == grid[i].length <= 50
1 <= grid[i][j] <= n * n
For all x that 1 <= x <= n * n there is exactly one x that is not equal to any of the grid members.
For all x that 1 <= x <= n * n there is exactly one x that is equal to exactly two of the grid members.
For all x that 1 <= x <= n * n except two of them there is exactly one pair of i, j that 0 <= i, j <= n - 1 and grid[i][j] == x.

Try coding it in LeetCode

Solution

Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

What is the range of numbers within the array? Can I assume they are all positive integers?
Is the input array guaranteed to contain only positive integers, and is there guaranteed to be exactly one missing and one repeated number?
What should I return if the input is empty or null? Is that a possible input?
Should I return the missing and repeated values as a list, tuple, or some other specific data structure?
Is the array sorted, or can the elements be in any order?

Brute Force Solution

Approach

The brute force approach to this problem is like manually checking every possible number to see if it's the missing or repeated one. We'll compare each number with every other number until we find the pair we're looking for.

Here's how the algorithm would work step-by-step:

First, let's assume that the number 1 is missing and check if it appears more than once in the list. If it does, we've found our repeated number.
Next, assume that the number 2 is missing and again check if it appears more than once in the list. If it does, we've found our repeated number.
We continue this process, assuming each number (3, 4, 5, and so on) up to the maximum possible number could be the missing one and look for a repeated number each time.
If, after checking every number, we don't find a repeated number for any of the 'missing' numbers we tried, then our assumption of what is missing must be incorrect.
Now, let’s try a different approach. For each number, we count how many times it appears. If a number appears more than once, we've found the repeated number.
Once we've identified the repeated number, we know which number is missing because the sum of the numbers should match our expectation.
By finding the repeated number and using the expected sum, we can calculate the missing number.

Code Implementation

def find_missing_and_repeated_values(number_list):
    list_length = len(number_list)

    for possible_missing_number in range(1, list_length + 1):
        counts = {}
        repeated_number_found = False

        for number in number_list:
            if number in counts:
                counts[number] += 1
                repeated_number = number
                repeated_number_found = True

                break
            else:
                counts[number] = 1

        # If a repeated number is found after assuming a missing number, return the pair.

        if repeated_number_found == True:
            missing_number = possible_missing_number
            return [repeated_number, missing_number]

    # Need to check for repeats, as first loop didn't produce result.
    counts = {}
    repeated_number = -1

    for number in number_list:
        if number in counts:
            repeated_number = number
            break
        else:
            counts[number] = 1

    # Find the actual sum and compare it with expected.
    actual_sum = sum(number_list)
    expected_sum = (list_length * (list_length + 1)) // 2
    missing_number = repeated_number + (expected_sum - actual_sum)

    # Find the missing number with the expected sum versus the actual sum
    return [repeated_number, missing_number]

Big(O) Analysis

Time Complexity

O(n²) – The algorithm iterates through each number from 1 to n, assuming it's missing. For each of these n potential missing numbers, it scans the entire array of size n to check for a repeated number. This results in nested loops, where the outer loop iterates up to n times and the inner loop also iterates up to n times. Therefore, the number of operations approximates to n * n, simplifying to a time complexity of O(n²).

Space Complexity

O(1) – The provided plain English explanation doesn't utilize any auxiliary data structures like arrays, hash maps, or trees. It primarily involves iterative comparisons and calculations. Therefore, it only requires a fixed number of variables to store intermediate values like counters or temporary results, irrespective of the input size N. This constant auxiliary space usage translates to O(1) space complexity.

Optimal Solution

Approach

The trick here is to use the number itself to help us find the missing and repeated values. Instead of brute force searching, we'll cleverly use math to solve the problem directly, avoiding extra checks and comparisons.

Here's how the algorithm would work step-by-step:

First, calculate the sum of all numbers from 1 to N, where N is the size of one side of the square.
Then, calculate the sum of all the numbers actually present in the square.
The difference between these two sums will give you a relationship between the missing and repeated numbers.
Next, calculate the sum of the squares of all numbers from 1 to N.
Also, calculate the sum of the squares of all the numbers actually present in the square.
The difference between these two sums will give you another relationship between the missing and repeated numbers. Now you have two equations!
Solve these two equations to find the values of the missing and repeated numbers.

Code Implementation

def find_missing_and_repeated_values(grid):
    grid_size = len(grid)
    total_numbers = grid_size * grid_size

    # Calculate the sum of numbers from 1 to N.
    expected_sum = (total_numbers * (total_numbers + 1)) // 2

    actual_sum = 0
    actual_sum_of_squares = 0

    for row in grid:
        for number in row:
            actual_sum += number
            actual_sum_of_squares += number * number

    # Calculate the sum of squares from 1 to N.
    expected_sum_of_squares = (total_numbers * (total_numbers + 1) * (2 * total_numbers + 1)) // 6

    # This difference gives a relation between missing and repeated numbers.
    difference_sum = actual_sum - expected_sum
    difference_sum_squares = actual_sum_of_squares - expected_sum_of_squares

    # Solve two equations:
    # repeatedNumber - missingNumber = difference_sum
    # repeatedNumber^2 - missingNumber^2 = difference_sum_squares

    sum_values = difference_sum_squares // difference_sum

    repeated_number = (difference_sum + sum_values) // 2
    missing_number = sum_values - repeated_number

    return [repeated_number, missing_number]

Big(O) Analysis

Time Complexity

O(n) – The algorithm calculates the sum and sum of squares of numbers from 1 to N, which are constant time operations. It then iterates through the given square (matrix) to calculate the actual sum and sum of squares of the elements in the square. Since we iterate through each element of the NxN square once, where N is the size of a side, we are effectively processing n*n elements. However, N = n, so that is n*n elements. Each step within the loops, like addition or squaring, take constant time. Therefore, the overall time complexity is O(n).

Space Complexity

O(1) – The algorithm primarily uses a fixed number of variables to store the sums and intermediate calculations. Specifically, it calculates sums based on the input square's side length (N) and the values within the square, but these sums are stored in a constant number of variables. No auxiliary data structures like arrays or hash maps are created that scale with the input size N. Therefore, the space used is independent of the input size N, resulting in constant auxiliary space complexity.

Edge Cases

Case	How to Handle
Null input array	Throw IllegalArgumentException or return null after checking for null input to avoid NullPointerException.
Array with single element	Return empty list as no repeated or missing numbers can exist when the size is 1.
Array with all identical values	The repeated value will be that identical value, and the missing value can be determined by comparing with the expected sequence.
Array with numbers outside the range [1, n]	Return an error or adjust the expected range to match the actual numbers in the array if out of range numbers are allowed.
Maximum sized input array	Ensure algorithm's time and space complexity remains acceptable; an in-place solution minimizes space usage.
Large n values causing potential integer overflow when summing	Use long data types for sums or consider alternative approaches like using a bit vector if applicable.
Array contains only the missing number (repeated number is 1)	The algorithm should correctly identify 1 as the repeated number and the missing number present in the array.
Array with negative numbers	Adjust the algorithm to handle negative numbers, potentially by offsetting the input array to a positive range, or return an error if negatives are not allowed.

Null input array

How to Handle:

Throw IllegalArgumentException or return null after checking for null input to avoid NullPointerException.

Array with single element

How to Handle:

Return empty list as no repeated or missing numbers can exist when the size is 1.

Array with all identical values

How to Handle:

The repeated value will be that identical value, and the missing value can be determined by comparing with the expected sequence.

Array with numbers outside the range [1, n]

How to Handle:

Return an error or adjust the expected range to match the actual numbers in the array if out of range numbers are allowed.

Maximum sized input array

How to Handle:

Ensure algorithm's time and space complexity remains acceptable; an in-place solution minimizes space usage.

Large n values causing potential integer overflow when summing

How to Handle:

Use long data types for sums or consider alternative approaches like using a bit vector if applicable.

Array contains only the missing number (repeated number is 1)

How to Handle:

The algorithm should correctly identify 1 as the repeated number and the missing number present in the array.

Array with negative numbers

How to Handle:

Adjust the algorithm to handle negative numbers, potentially by offsetting the input array to a positive range, or return an error if negatives are not allowed.