Taro Logo

Find the Difference of Two Arrays

Easy
Asked by:
Profile picture
Profile picture
Profile picture
Profile picture
+2
More companies
Profile picture
Profile picture
57 views
Topics:
Arrays

Given two 0-indexed integer arrays nums1 and nums2, return a list answer of size 2 where:

  • answer[0] is a list of all distinct integers in nums1 which are not present in nums2.
  • answer[1] is a list of all distinct integers in nums2 which are not present in nums1.

Note that the integers in the lists may be returned in any order.

Example 1:

Input: nums1 = [1,2,3], nums2 = [2,4,6]
Output: [[1,3],[4,6]]
Explanation:
For nums1, nums1[1] = 2 is present at index 0 of nums2, whereas nums1[0] = 1 and nums1[2] = 3 are not present in nums2. Therefore, answer[0] = [1,3].
For nums2, nums2[0] = 2 is present at index 1 of nums1, whereas nums2[1] = 4 and nums2[2] = 6 are not present in nums1. Therefore, answer[1] = [4,6].

Example 2:

Input: nums1 = [1,2,3,3], nums2 = [1,1,2,2]
Output: [[3],[]]
Explanation:
For nums1, nums1[2] and nums1[3] are not present in nums2. Since nums1[2] == nums1[3], their value is only included once and answer[0] = [3].
Every integer in nums2 is present in nums1. Therefore, answer[1] = [].

Constraints:

  • 1 <= nums1.length, nums2.length <= 1000
  • -1000 <= nums1[i], nums2[i] <= 1000

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the expected range of values for the integers in the arrays? Can they be negative, zero, or very large numbers?
  2. What are the constraints on the length of the input arrays? Can they be empty?
  3. The problem asks for 'distinct' integers in the output. Does this mean if an integer appears multiple times in `nums1` but not in `nums2`, it should only appear once in the final result list?
  4. Is there any specific order required for the integers within the two output lists?
  5. Just to confirm the output format, should I return a list containing exactly two lists, even if one or both of them are empty?

Brute Force Solution

Approach

The goal is to find which numbers are unique to each of two lists. The brute force method involves taking each number from the first list and checking it against every single number in the second list, and then doing the same thing in reverse for the second list.

Here's how the algorithm would work step-by-step:

  1. First, let's find the numbers that are in the first list but not in the second. To do this, pick the first number from the first list.
  2. Compare this chosen number to every single number in the second list, one by one.
  3. If you go through the entire second list and never find a match for your chosen number, then that number is unique to the first list. Keep it aside.
  4. Repeat this process for every number in the first list: pick a number, compare it against all numbers in the second list, and save it if it's never found.
  5. Next, we need to find the numbers that are in the second list but not in the first. So, pick the first number from the second list.
  6. Compare this number to every single number in the first list.
  7. If you search through the whole first list and don't find a match, that means the number is unique to the second list. Keep it aside in a separate collection.
  8. Continue this for every number in the second list until you have checked them all.
  9. Finally, present the two collections of unique numbers you've gathered as the answer.

Code Implementation

def find_difference_brute_force(first_list, second_list):
    unique_to_first_list = []
    unique_to_second_list = []

    # Iterate through the first list to find numbers not present in the second list.

    for first_list_number in first_list:
        is_number_found_in_second_list = False
        for second_list_number in second_list:
            if first_list_number == second_list_number:
                is_number_found_in_second_list = True
                break

        # If the number was not found after checking all elements, it's unique to the first list.

        if not is_number_found_in_second_list:
            if first_list_number not in unique_to_first_list:
                unique_to_first_list.append(first_list_number)

    # Now, iterate through the second list to find numbers not present in the first list.

    for second_list_number in second_list:
        is_number_found_in_first_list = False
        for first_list_number in first_list:
            if second_list_number == first_list_number:
                is_number_found_in_first_list = True
                break
        
        # Similarly, if the number was not found, it is unique to the second list.

        if not is_number_found_in_first_list:
            if second_list_number not in unique_to_second_list:
                unique_to_second_list.append(second_list_number)

    return [unique_to_first_list, unique_to_second_list]

Big(O) Analysis

Time Complexity
O(n * m)The time complexity is determined by the nested comparisons between the two lists. Let's say the first list has 'n' elements and the second has 'm' elements. To find numbers unique to the first list, the algorithm iterates through each of its 'n' elements, and for each one, it performs a full scan of the 'm' elements in the second list, resulting in n * m operations. This process is then repeated in reverse, comparing each of the 'm' elements from the second list against all 'n' elements of the first, adding another m * n operations. The total operations are approximately (n * m) + (m * n), which simplifies to O(n * m).
Space Complexity
O(N + M)The algorithm creates two separate collections to store the unique numbers. The first collection holds numbers unique to the first list, and the second holds numbers unique to the second list. In the worst-case scenario, where the lists have no common elements, the first collection will store all N elements from the first list, and the second will store all M elements from the second list. Therefore, the auxiliary space required is directly proportional to the sum of the lengths of the two input lists, N and M.

Optimal Solution

Approach

The most efficient way to solve this is to quickly check which numbers are present in each group. By first making a unique collection of numbers for each of the two starting groups, we can avoid repetitive checks and instantly see if a number from one group exists in the other.

Here's how the algorithm would work step-by-step:

  1. First, create a unique collection of numbers from the first group. This makes it very fast to check if a number is present in it later.
  2. Do the same for the second group: create another unique collection of its numbers.
  3. Now, look at each unique number from the first group. For each one, see if it is missing from the second group's unique collection. If it is, add it to our first answer list.
  4. Finally, do the reverse. Look at each unique number from the second group. For each one, see if it is missing from the first group's unique collection. If it is, add it to our second answer list.
  5. The result is two lists: the first contains numbers only in the first original group, and the second contains numbers only in the second original group.

Code Implementation

class Solution:
    def findDifference(self, first_list_of_numbers: list[int], second_list_of_numbers: list[int]) -> list[list[int]]:
        # Use sets for efficient O(1) average time complexity lookups.

        unique_numbers_from_first_list = set(first_list_of_numbers)
        unique_numbers_from_second_list = set(second_list_of_numbers)

        distinct_in_first_list = []
        distinct_in_second_list = []

        # Iterate through the first set to find numbers not present in the second.

        for number in unique_numbers_from_first_list:
            if number not in unique_numbers_from_second_list:
                distinct_in_first_list.append(number)

        # Iterate through the second set to find numbers not present in the first.

        for number in unique_numbers_from_second_list:
            if number not in unique_numbers_from_first_list:
                distinct_in_second_list.append(number)

        return [distinct_in_first_list, distinct_in_second_list]

Big(O) Analysis

Time Complexity
O(n + m)Let n be the number of elements in the first array and m be the number of elements in the second. Creating the unique set for the first array takes O(n) time, and creating the set for the second takes O(m) time. Then, we iterate through the first set (at most n elements) and perform O(1) lookups in the second set, which costs O(n). Similarly, iterating through the second set (at most m elements) and looking up in the first set costs O(m). The total operations are O(n) + O(m) + O(n) + O(m), which simplifies to O(n + m).
Space Complexity
O(n + m)The space complexity is determined by the auxiliary data structures used to store unique elements from the input arrays. Step 1 creates a unique collection (a hash set) for the first group, which can store up to n elements, where n is the number of elements in the first array. Similarly, Step 2 creates a second hash set for the second group, storing up to m elements, where m is the number of elements in the second array. The two lists used to store the final answer also contribute to the space, holding at most n and m elements respectively. Therefore, the total auxiliary space scales linearly with the sum of the sizes of the two input arrays, resulting in O(n + m).

Edge Cases

One or both input arrays are empty.
How to Handle:
The solution should correctly return the distinct elements of the non-empty array in one list and an empty list for the other.
Both input arrays are identical, containing the same set of elements.
How to Handle:
The algorithm should return two empty lists as there is no difference between the arrays.
Input arrays contain duplicate values.
How to Handle:
Using sets naturally handles duplicates, ensuring the output lists contain only distinct integers.
Input arrays have no common elements.
How to Handle:
The result should be two lists containing all the distinct elements of the original arrays respectively.
Input arrays contain negative numbers and zeros.
How to Handle:
A hash set based solution correctly handles negative numbers and zeros just like any other integer.
One array is a complete subset of the other.
How to Handle:
One of the resulting lists will be empty, while the other contains elements from the larger set not present in the smaller subset.
Arrays are very large, approaching maximum constraints (e.g., 1000 elements).
How to Handle:
A solution using hash sets is efficient with an average time complexity of O(N+M), which scales well for large inputs.
Input values are at the extreme ends of the allowed integer range.
How to Handle:
Standard hash set implementations in most languages can manage the full range of integer values without issue.