Taro Logo

Intersection of Two Arrays

Easy
Google logo
Google
3 views
Topics:
ArraysTwo Pointers

Given two integer arrays nums1 and nums2, return an array of their intersection. Each element in the result must be unique and you may return the result in any order.

For example:

nums1 = [1,2,2,1], nums2 = [2,2] Output: [2]

nums1 = [4,9,5], nums2 = [9,4,9,8,4] Output: [9,4] (or [4,9])

Write a function to find the intersection of two integer arrays, ensuring that the elements in the result are unique and can be returned in any order. Consider the cases where the input arrays might be empty or contain duplicate elements. Discuss the time and space complexity of your solution. Can you think of multiple approaches and what the trade offs are?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the possible ranges for the integers within the arrays?
  2. Can either of the input arrays be empty or null?
  3. If there is no intersection between the two arrays, what should I return?
  4. If an element appears multiple times in both arrays, how many times should it appear in the output?
  5. What is the expected maximum size of the input arrays?

Brute Force Solution

Approach

To find common numbers between two lists, the brute force way is like comparing each number from the first list with every number in the second list. We continue this process until we've checked all possible pairs of numbers from both lists.

Here's how the algorithm would work step-by-step:

  1. Take the first number from the first list.
  2. Compare this number with every single number in the second list.
  3. If you find a match, write down that number.
  4. Move to the next number in the first list.
  5. Again, compare this new number with every number in the second list.
  6. If you find a match, and you haven't already written it down, write it down.
  7. Keep repeating this process, going through each number in the first list and comparing it to all numbers in the second list.
  8. At the end, you'll have a list of all the numbers that appear in both the first and second lists.

Code Implementation

def intersection_of_two_arrays_brute_force(first_array, second_array):
    intersection_result = []

    for first_array_element in first_array:
        # Iterate through the second array to find matches for current element.

        for second_array_element in second_array:
            if first_array_element == second_array_element:
                # Ensure we only add distinct elements to the intersection.

                if first_array_element not in intersection_result:
                    intersection_result.append(first_array_element)

    return intersection_result

Big(O) Analysis

Time Complexity
O(n²)The provided algorithm iterates through each element of the first array (let's say of size n) and, for each of these elements, iterates through every element of the second array (also assumed to be size n in the worst-case scenario) to check for equality. This results in a nested loop structure where each of the n elements in the first array is potentially compared with all n elements in the second array. Therefore, the number of operations grows proportionally to n * n. This means that the time complexity is O(n²).
Space Complexity
O(N)The described brute force approach requires storing the common numbers found in an auxiliary data structure to avoid duplicates. In the worst-case scenario, all the elements of the first array are present in the second array, leading to an auxiliary array (the 'written down' list) that can grow up to the size of the first array. If we assume that the size of the first array is N, then the auxiliary space used is proportional to N. This leads to a space complexity of O(N).

Optimal Solution

Approach

The most efficient way to find common elements between two collections is to use a way of organizing one collection that allows for very quick lookups. We can then iterate through the second collection and quickly check if each element exists in the first collection's special organization. This avoids comparing each element in one collection to every element in the other collection.

Here's how the algorithm would work step-by-step:

  1. Take the first collection and organize its elements into a special lookup structure that provides very fast existence checks.
  2. Go through each element in the second collection, one at a time.
  3. For each element in the second collection, check if it exists in the organized lookup structure of the first collection.
  4. If an element from the second collection is found in the lookup structure, add it to a result collection of common elements.
  5. Once all elements from the second collection have been checked, the result collection will contain all the elements that are common to both collections.

Code Implementation

def intersection_of_two_arrays(first_array, second_array):
    first_array_set = set()
    for element in first_array:
        first_array_set.add(element)

    intersection_result = []

    # Use the set for quick lookups.
    for element in second_array:
        if element in first_array_set:
            intersection_result.append(element)
            # Remove the element to avoid duplicates
            first_array_set.remove(element)

    return intersection_result

def intersection_of_two_arrays_alt(first_array, second_array):
    # Use a set for quick lookups of elements in first_array
    first_array_set = set(first_array)

    intersection_result = []

    # Iterate through the second array and check for presence in the set.
    for element in second_array:
        if element in first_array_set:
            intersection_result.append(element)
            first_array_set.discard(element)

    return intersection_result

def intersection_of_two_arrays_optimized(first_array, second_array):
    # Convert the first array to a set for O(1) lookups
    first_array_set = set(first_array)

    intersection_result = []

    # Iterate through the second array
    for element in second_array:
        # Only add to result if element is in the first array set.
        if element in first_array_set:

            intersection_result.append(element)
            # To avoid duplicates, remove the element from the set
            first_array_set.remove(element)

    return intersection_result

Big(O) Analysis

Time Complexity
O(n + m)Let n be the size of the first array and m be the size of the second array. Step 1 involves creating a lookup structure (e.g., a hash set) from the first array, which takes O(n) time. Steps 2 and 3 iterate through the second array (size m) and perform a constant-time lookup in the hash set for each element. This takes O(m) time. Step 4 is conditional and doesn't change the overall complexity. Therefore, the total time complexity is O(n + m).
Space Complexity
O(N)The algorithm uses a lookup structure (likely a hash set or dictionary) to store elements from the first collection. In the worst-case scenario, all elements of the first collection are unique, requiring storage for N elements, where N is the number of elements in the first collection. Additionally, a result collection is used to store the common elements. In the worst case, this result collection could also hold up to N elements if all elements of the second collection are present in the first. Therefore, the auxiliary space is proportional to N, resulting in O(N) space complexity.

Edge Cases

CaseHow to Handle
One or both input arrays are null or emptyReturn an empty array if either nums1 or nums2 is null or empty.
One array is significantly larger than the otherUse the smaller array to build the frequency map to optimize memory usage.
Both arrays contain only one elementCompare the single elements and return the array containing the element if they are equal or an empty array otherwise.
Arrays contain large positive or negative integers near the integer limitsThe chosen solution (hash map/dictionary approach) should handle large integer values without causing overflow unless integer limits are exceeded during element counting; consider using 64 bit integers if necessary.
Arrays contain duplicate elements with differing frequenciesUse a hash map (or similar) to count element occurrences, and decrement the counts during intersection to ensure the correct frequency of intersection elements.
No intersection exists between the two arraysThe algorithm should return an empty array when no common elements are found.
All elements in both arrays are the sameThe result should contain the element repeated the number of times it appears in the shorter array.
Input arrays are extremely large and do not fit into memoryConsider using an external sort approach where the arrays are sorted on disk, and then a merge-like algorithm is used to find the intersection using chunks of data.