Taro Logo

Intersection of Two Arrays II

Easy
Google logo
Google
4 views
Topics:
Arrays

Given two integer arrays nums1 and nums2, return an array of their intersection. Each element in the result must appear as many times as it shows in both arrays, and you may return the result in any order.

For example:

nums1 = [1, 2, 2, 1], nums2 = [2, 2] Output: [2, 2]

As another example:

nums1 = [4, 9, 5], nums2 = [9, 4, 9, 8, 4] Output: [4, 9] or [9, 4]

Can you implement a function to find the intersection of two arrays, handling duplicate elements correctly, and optimizing for time and space complexity?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the possible ranges and data types of the integers within the input arrays?
  2. Are the input arrays guaranteed to be sorted, and if not, can I assume they are unsorted?
  3. If there are multiple intersections, can the output array contain duplicates with the same frequency as they appear in both input arrays?
  4. What should be returned if the arrays have no intersection?
  5. Is the order of the elements in the output array significant?

Brute Force Solution

Approach

The core idea is to compare each number from the first list with every number in the second list to find matches. We keep track of the matches we find, which represent numbers present in both lists.

Here's how the algorithm would work step-by-step:

  1. Take the first number from the first list.
  2. Look at every number in the second list and check if any of them are the same as the number you picked from the first list.
  3. If you find a match, write down that number as one of the numbers that appears in both lists. Then, remove that matched number from the second list so you don't count it again if it appears multiple times in the first list.
  4. If you don't find a match, move on.
  5. Repeat these steps for every number in the first list.
  6. At the end, you'll have a list of all the numbers that appear in both lists.

Code Implementation

def intersection_of_two_arrays_ii_brute_force(first_array, second_array):
    intersection_result = []

    for first_array_index in range(len(first_array)):
        first_array_element = first_array[first_array_index]

        second_array_index = 0
        while second_array_index < len(second_array):
            second_array_element = second_array[second_array_index]

            # Check for common element across arrays
            if first_array_element == second_array_element:

                intersection_result.append(first_array_element)
                del second_array[second_array_index]

                # Exit current search after match
                break
            else:
                second_array_index += 1

    return intersection_result

Big(O) Analysis

Time Complexity
O(n*m)The provided approach iterates through each of the 'n' elements in the first array (nums1). For each of these 'n' elements, it iterates through the second array (nums2), which has 'm' elements, to find a match. In the worst case, we might have to check every element in nums2 for every element in nums1. Therefore, the total number of operations approximates to n * m. Hence, the time complexity is O(n*m).
Space Complexity
O(min(M, N))The provided plain English solution modifies the second list (nums2) by removing elements. In the worst-case scenario, the intersection contains all elements from the smaller list (either nums1 or nums2). Therefore, the space used by modifying nums2 in place (specifically, the potential shift operations after removing elements) depends on the number of elements in that smaller list. Let N be the size of nums1 and M be the size of nums2. In the worst case, we might effectively remove a number of elements proportional to the size of the smaller list, so the space complexity is O(min(M, N)) as an upper bound on the changes needed to the list, assuming list modifications have a linear space component. Note, in many standard library implementations removing an element at an arbitrary index can require a linear number of operations to shift subsequent elements, potentially contributing to space usage during these shifts.

Optimal Solution

Approach

The efficient approach leverages sorting and comparison to find common elements. We first arrange the elements within each collection, and then use a strategy of advancing through each collection based on element comparison to identify and count common elements effectively.

Here's how the algorithm would work step-by-step:

  1. First, arrange the elements within each collection from smallest to largest.
  2. Start by comparing the smallest element of each collection.
  3. If the elements are equal, it means we've found a common element, so we add it to our result.
  4. If the element in the first collection is smaller, move to the next element in the first collection to find one that is larger.
  5. If the element in the second collection is smaller, move to the next element in the second collection to find one that is larger.
  6. Repeat the comparison and moving process until we reach the end of either collection.
  7. The list of elements added to our result represents the common elements and their counts from both collections.

Code Implementation

def intersection_of_two_arrays_ii(first_array, second_array):
    first_array.sort()
    second_array.sort()
    index_first_array = 0
    index_second_array = 0
    result = []

    while index_first_array < len(first_array) and index_second_array < len(second_array):

        # If elements are equal, add to result and advance both
        if first_array[index_first_array] == second_array[index_second_array]:
            result.append(first_array[index_first_array])
            index_first_array += 1
            index_second_array += 1

        # Increment the first array if its value is smaller
        elif first_array[index_first_array] < second_array[index_second_array]:
            index_first_array += 1

        # Increment the second array if its value is smaller
        else:
            index_second_array += 1

    return result

Big(O) Analysis

Time Complexity
O(n log n)The dominant factor in the time complexity is the sorting operation performed on both input arrays. Sorting algorithms like merge sort or quicksort typically have a time complexity of O(n log n), where n is the size of the array. The subsequent comparison of the sorted arrays using two pointers takes O(n) time in the worst case. Since sorting dominates the comparison, the overall time complexity is O(n log n).
Space Complexity
O(min(m, n))The algorithm sorts the input arrays nums1 and nums2. While sorting can often be done in-place or nearly in-place, some sorting algorithms might require additional space depending on the implementation used by the language. Let's assume the sorting algorithm might use O(m) and O(n) space for nums1 and nums2, respectively, where m and n are the lengths of nums1 and nums2. Additionally, the result array to store the intersection elements will take O(min(m, n)) space in the worst case where all elements are common. Therefore, the dominant auxiliary space is either from the sorting or the result array, which can be approximated by O(min(m,n)).

Edge Cases

CaseHow to Handle
One or both input arrays are null or empty.Return an empty list if either input array is null or has zero length.
One array is significantly larger than the other.Use a hash map approach with the smaller array to improve efficiency in lookup.
Both arrays contain all the same numbers.The algorithm should correctly count and intersect the occurrences of each number.
Arrays contain very large numbers (close to integer limits).The solution should handle large numbers without integer overflow issues.
Arrays contain duplicate numbers.The hash map approach naturally handles duplicates by counting their occurrences.
No common elements exist between the two arrays.The algorithm should return an empty list in this case.
Arrays contain negative numbers and zero.The hash map approach works correctly with negative numbers and zero as keys.
Arrays are extremely large, potentially exceeding available memory.Consider a divide-and-conquer approach or external memory algorithms for extremely large datasets.