Taro Logo

Relative Sort Array

Easy
DE Shaw logo
DE Shaw
1 view
Topics:
Arrays

Given two arrays arr1 and arr2, the elements of arr2 are distinct, and all elements in arr2 are also in arr1.

Sort the elements of arr1 such that the relative ordering of items in arr1 are the same as in arr2. Elements that do not appear in arr2 should be placed at the end of arr1 in ascending order.

Example 1:

Input: arr1 = [2,3,1,3,2,4,6,7,9,2,19], arr2 = [2,1,4,3,9,6]
Output: [2,2,2,1,4,3,3,9,6,7,19]

Example 2:

Input: arr1 = [28,6,22,8,44,17], arr2 = [22,28,8,6]
Output: [22,28,8,6,17,44]

Constraints:

  • 1 <= arr1.length, arr2.length <= 1000
  • 0 <= arr1[i], arr2[i] <= 1000
  • All the elements of arr2 are distinct.
  • Each arr2[i] is in arr1.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the expected range of integer values within `arr1` and `arr2`? Can I assume they fit within typical integer bounds?
  2. Can `arr1` or `arr2` be empty or null?
  3. Are all elements in `arr2` guaranteed to be present in `arr1`? What should I return if `arr2` contains elements not in `arr1`?
  4. Can `arr1` contain duplicate values, and how should these duplicates be handled with respect to their relative ordering after elements from `arr2` are placed?
  5. Is the order of elements in `arr1` that are *not* present in `arr2` important? Should they be sorted in ascending order?

Brute Force Solution

Approach

The brute force approach to sorting one set of numbers according to the order of another set is quite simple. We essentially go through the first set, checking for each number if it exists in the desired order and if so, collect it. Anything left at the end we just put in the output unsorted.

Here's how the algorithm would work step-by-step:

  1. Go through the list of numbers that needs to be sorted.
  2. For each number, check if it appears in the list that defines the desired order.
  3. If it does, put it into a new sorted list, as many times as it appeared in the original list.
  4. After going through all the numbers, collect any numbers that were in the original list but not present in the list that defines the desired order.
  5. Add these remaining numbers to the end of the sorted list.
  6. Return the resulting list which has the numbers sorted according to the defined order, with any remaining numbers appended at the end.

Code Implementation

def relative_sort_array_brute_force(array_to_sort, desired_order):
    sorted_array = []
    remaining_elements = []

    # Iterate through the input array.
    for element in array_to_sort:

        found_in_desired_order = False
        for ordered_element in desired_order:
            if element == ordered_element:
                sorted_array.append(element)
                found_in_desired_order = True
                break

        # Collect elements not in the desired order for later.
        if not found_in_desired_order:
            remaining_elements.append(element)

    # Combine sorted elements and remaining elements.
    sorted_array.extend(remaining_elements)

    return sorted_array

Big(O) Analysis

Time Complexity
O(n*m)Let n be the size of the array to be sorted (arr1) and m be the size of the array defining the order (arr2). The algorithm iterates through arr1, which takes O(n) time. For each element in arr1, it checks if that element exists in arr2, which takes O(m) in the worst case (e.g., using a linear search). Therefore, the overall time complexity is O(n*m) since we perform a potentially O(m) operation for each of the n elements.
Space Complexity
O(N)The algorithm creates a new sorted list to store the sorted elements. In the worst-case scenario, where all elements from the original list (arr1) are added to this new list, the size of this list will be proportional to N, where N is the number of elements in arr1. The algorithm might also use space to count occurrences of numbers which could, in the worst case, also be proportional to N if all elements are unique. Therefore, the auxiliary space complexity is O(N).

Optimal Solution

Approach

We want to sort one list of numbers based on the order they appear in another list. We'll efficiently accomplish this by first noting the frequency of each number, then building the result based on that information.

Here's how the algorithm would work step-by-step:

  1. First, count how many times each number appears in the list we want to sort.
  2. Then, go through the list that dictates the desired order. For each number in this list, add it to our sorted result as many times as it appeared in the original list, according to our counts.
  3. After processing the numbers from the ordering list, handle any remaining numbers that were not present in the ordering list. Sort these remaining numbers and add them to the end of our sorted result.

Code Implementation

def relative_sort_array(array_to_sort, ordering_array):
    number_frequencies = {}
    for number in array_to_sort:
        number_frequencies[number] = number_frequencies.get(number, 0) + 1

    sorted_array = []

    # Prioritize sorting based on the order in ordering_array
    for number in ordering_array:
        if number in number_frequencies:
            sorted_array.extend([number] * number_frequencies[number])
            del number_frequencies[number]

    # Collect remaining numbers not in ordering_array
    remaining_numbers = []
    for number, frequency in number_frequencies.items():
        remaining_numbers.extend([number] * frequency)

    # Sort any numbers not found in ordering_array
    remaining_numbers.sort()

    # Combine sorted and remaining for the final result.
    sorted_array.extend(remaining_numbers)

    return sorted_array

Big(O) Analysis

Time Complexity
O(n + m log m)Let n be the length of arr1 (the array to be sorted) and m be the length of arr2 (the array that dictates the sorting order). The first step, counting frequencies in arr1, takes O(n) time. The second step, iterating through arr2 and adding elements to the result, takes O(m) time, where, in the worst case, we iterate through arr2 and add elements to the result based on their counts. The final step involves sorting the remaining elements from arr1, which, in the worst case, could have almost n elements. Sorting these remaining elements using an efficient sorting algorithm (like merge sort or quicksort) would take O(k log k) time where k is the number of elements which were not found in arr2 and thus remain at the end of arr1. In the worst case, k can be proportional to n. Therefore, this is O(n log n). Thus, the overall complexity is O(n + m + n log n). Since m can be at most n, we can simplify to O(n + n log n), which is further simplified to O(n log n). However, we must consider the case where m > n. In that case, the overall complexity is O(n + m). However, if we let 'k' denote the number of remaining elements in arr1 after accounting for elements in arr2 and these remaining elements are sorted. Then the overall complexity is O(n + k log k + m). Since k can, at most be n and k log k will then become n log n. Assuming that all elements in arr2 exist within arr1 (the worst-case scenario to drive arr2 operations), we can therefore say that the overall time complexity is O(n + m log m).
Space Complexity
O(N)The algorithm uses a frequency count, effectively creating a hash map or an array (if the numbers are within a small range) to store the counts of each number in the input array arr1. In the worst case, all the numbers in arr1 are distinct, requiring space proportional to the number of elements in arr1, which we denote as N. Therefore, the auxiliary space used for the frequency count scales linearly with the size of the input array, arr1. While the final sorted array might be created in place conceptually, the count array definitely uses O(N) auxiliary space where N is the size of arr1.

Edge Cases

CaseHow to Handle
arr1 or arr2 is null or emptyReturn an empty array if arr1 is null or empty, and handle a null arr2 gracefully, perhaps treating it as if empty while sorting arr1's remaining elements.
arr1 is much larger than arr2The solution should efficiently handle large differences in array sizes, possibly using a counting sort for arr1's remaining elements after processing based on arr2.
arr2 contains duplicate elementsEnsure the algorithm handles duplicate elements in arr2 gracefully, potentially only processing the first occurrence of each element and ignoring subsequent duplicates.
arr1 contains negative or zero valuesHandle negative or zero values present in arr1 by adjusting the counting sort implementation or using a general-purpose sorting algorithm for the remaining elements.
arr1 contains extremely large numbersConsider using a HashMap or TreeMap instead of an array-based counting sort to handle potentially large values efficiently without excessive memory allocation.
arr1 contains duplicates that are not present in arr2Ensure the solution correctly sorts the remaining elements from arr1 (not present in arr2) along with their respective duplicates.
Elements in arr1 are outside the range of typical integer sizes (overflow potential during counting sort).Use a data structure with sufficient capacity to store the counts, such as a long or a hash map to avoid integer overflow during counting sort.
All elements in arr1 are already in arr2The solution should iterate through arr2 and add all corresponding elements from arr1 into the result array without any remaining elements to sort.