Taro Logo

Create Sorted Array through Instructions #6 Most Asked

Hard
Topics:
ArraysBinary Search

Given an integer array instructions, you are asked to create a sorted array from the elements in instructions. You start with an empty container nums. For each element from left to right in instructions, insert it into nums. The cost of each insertion is the minimum of the following:

  • The number of elements currently in nums that are strictly less than instructions[i].
  • The number of elements currently in nums that are strictly greater than instructions[i].

For example, if inserting element 3 into nums = [1,2,3,5], the cost of insertion is min(2, 1) (elements 1 and 2 are less than 3, element 5 is greater than 3) and nums will become [1,2,3,3,5].

Return the total cost to insert all elements from instructions into nums. Since the answer may be large, return it modulo 109 + 7

Example 1:

Input: instructions = [1,5,6,2]
Output: 1
Explanation: Begin with nums = [].
Insert 1 with cost min(0, 0) = 0, now nums = [1].
Insert 5 with cost min(1, 0) = 0, now nums = [1,5].
Insert 6 with cost min(2, 0) = 0, now nums = [1,5,6].
Insert 2 with cost min(1, 2) = 1, now nums = [1,2,5,6].
The total cost is 0 + 0 + 0 + 1 = 1.

Example 2:

Input: instructions = [1,2,3,6,5,4]
Output: 3
Explanation: Begin with nums = [].
Insert 1 with cost min(0, 0) = 0, now nums = [1].
Insert 2 with cost min(1, 0) = 0, now nums = [1,2].
Insert 3 with cost min(2, 0) = 0, now nums = [1,2,3].
Insert 6 with cost min(3, 0) = 0, now nums = [1,2,3,6].
Insert 5 with cost min(3, 1) = 1, now nums = [1,2,3,5,6].
Insert 4 with cost min(3, 2) = 2, now nums = [1,2,3,4,5,6].
The total cost is 0 + 0 + 0 + 0 + 1 + 2 = 3.

Example 3:

Input: instructions = [1,3,3,3,2,4,2,1,2]
Output: 4
Explanation: Begin with nums = [].
Insert 1 with cost min(0, 0) = 0, now nums = [1].
Insert 3 with cost min(1, 0) = 0, now nums = [1,3].
Insert 3 with cost min(1, 0) = 0, now nums = [1,3,3].
Insert 3 with cost min(1, 0) = 0, now nums = [1,3,3,3].
Insert 2 with cost min(1, 3) = 1, now nums = [1,2,3,3,3].
Insert 4 with cost min(5, 0) = 0, now nums = [1,2,3,3,3,4].
​​​​​​​Insert 2 with cost min(1, 4) = 1, now nums = [1,2,2,3,3,3,4].
​​​​​​​Insert 1 with cost min(0, 6) = 0, now nums = [1,1,2,2,3,3,3,4].
​​​​​​​Insert 2 with cost min(2, 4) = 2, now nums = [1,1,2,2,2,3,3,3,4].
The total cost is 0 + 0 + 0 + 0 + 1 + 0 + 1 + 0 + 2 = 4.

Constraints:

  • 1 <= instructions.length <= 105
  • 1 <= instructions[i] <= 105

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum possible value of an element within the `instructions` array, and what is the maximum length of the `instructions` array itself?
  2. Can the `instructions` array contain duplicate values, and if so, how should the cost be calculated when multiple identical elements need to be inserted?
  3. If an element in `instructions` already exists in the sorted array, does its insertion still incur a cost, and how is that cost calculated with respect to the existing duplicates?
  4. The problem statement mentions the cost can be very large. Should I be concerned about integer overflow when calculating the total cost, and should I return the cost modulo some value? If so, what is that value?
  5. Is there a lower bound for any of the numbers in the instructions array? Are negative numbers, zero, or fractional values allowed?

Brute Force Solution

Approach

We want to build a sorted list by inserting numbers one at a time. The brute force way to find the cost is to look at every possible place to insert each number and see how much it costs to do so.

Here's how the algorithm would work step-by-step:

  1. Take the first number we want to insert.
  2. Try inserting it at the beginning of our (initially empty) sorted list.
  3. Count how many numbers are smaller than it that are now to its right (the cost).
  4. Try inserting it in the next possible spot.
  5. Count how many numbers are smaller than it that are now to its right (the cost).
  6. Keep doing this for every possible spot in the list.
  7. Pick the spot that has the smallest cost.
  8. Add that cost to our running total cost.
  9. Repeat this whole process for every number we want to insert, adding the cost of each insertion to the total cost as we go.
  10. The final total cost is our answer.

Code Implementation

def create_sorted_array_brute_force(instructions):
    sorted_array = []
    total_cost = 0

    for instruction_index in range(len(instructions)):
        instruction = instructions[instruction_index]
        minimum_cost = float('inf')
        best_insertion_index = -1

        for insertion_index in range(len(sorted_array) + 1):
            # Try every possible index for insertion
            temp_array = sorted_array[:]
            temp_array.insert(insertion_index, instruction)
            current_cost = 0

            # Calculate insertion cost at current index
            for check_index in range(len(temp_array)):
                if check_index > insertion_index and temp_array[check_index] < instruction:
                    current_cost += 1

            if current_cost < minimum_cost:
                # Keep track of the minimum cost index
                minimum_cost = current_cost
                best_insertion_index = insertion_index

        total_cost += minimum_cost

        # Insert the new number into our array
        if best_insertion_index == len(sorted_array):
            sorted_array.append(instruction)

        else:
            sorted_array.insert(best_insertion_index, instruction)

    return total_cost

Big(O) Analysis

Time Complexity
O(n²)For each of the n instructions, we iterate through the sorted list to find the optimal insertion point. Finding the cost of inserting at each position involves comparing the value to be inserted with existing elements in the list, which takes up to n comparisons in the worst case. Thus, for each of the n instructions, we perform an operation that takes O(n) time. Consequently, the overall time complexity is O(n * n), or O(n²).
Space Complexity
O(1)The plain English explanation outlines a brute-force approach that iteratively inserts elements into a sorted list. While the explanation describes trying different insertion points and calculating costs, it doesn't explicitly mention the creation of auxiliary data structures like temporary arrays or hash maps that scale with the input size. The cost calculation itself is performed using the existing list, without allocating significant extra memory. Therefore, the auxiliary space complexity is O(1), representing constant extra space beyond the input instructions.

Optimal Solution

Approach

Building the sorted array efficiently involves tracking how many numbers are smaller or larger than each new number as it's added. By using a special data structure, we can quickly find these counts and calculate the cost, ultimately minimizing the total expense of sorting.

Here's how the algorithm would work step-by-step:

  1. Imagine an empty container. This container will hold the sorted array we're building.
  2. For each new number we want to add, we need to know how many numbers in our container are smaller and how many are bigger.
  3. Use a clever data structure, like a special type of tree, to help us quickly find those smaller and bigger counts without checking every single number each time.
  4. The cost of adding each new number is the smaller of the two counts: either the number of smaller numbers or the number of bigger numbers.
  5. Add that minimum cost to our running total cost.
  6. Put the new number into our special tree so it stays updated.
  7. Repeat this process for all the numbers we want to add.
  8. The final running total cost is the minimum cost to build the sorted array.

Code Implementation

import bisect

def create_sorted_array(instructions):
    sorted_array = []
    total_cost = 0
    
    for instruction in instructions:
        # Find how many elements are smaller
        smaller_count = bisect.bisect_left(sorted_array, instruction)
        # Find how many elements are larger
        larger_count = len(sorted_array) - bisect.bisect_right(sorted_array, instruction)
        
        # Choose the minimum cost
        cost = min(smaller_count, larger_count)
        total_cost += cost
        
        # Insert into the sorted array to maintain order
        bisect.insort(sorted_array, instruction)
        
    return total_cost % (10**9 + 7)

Big(O) Analysis

Time Complexity
O(n log n)The algorithm iterates through each of the n instructions. For each instruction, it performs two operations: finding the number of smaller and larger elements and inserting the new element into the sorted array. Both of these operations, when using an efficient data structure like a balanced binary search tree or a Fenwick tree (Binary Indexed Tree), can be done in O(log n) time. Therefore, the overall time complexity is n iterations each taking O(log n) time, resulting in O(n log n).
Space Complexity
O(N)The described solution uses a special tree data structure to efficiently find the number of smaller and larger elements. This tree stores the elements encountered so far, and in the worst case, it might need to store all N elements from the input instructions array. Therefore, the auxiliary space used by the tree scales linearly with the input size N. Hence, the space complexity is O(N).

Edge Cases

Empty instructions array
How to Handle:
Return 0, as no instructions mean no cost.
Instructions array with a single element
How to Handle:
The cost will be 0, so return 0.
Instructions array with all identical elements
How to Handle:
The cost increases linearly since insertion position is predictable.
Instructions array with monotonically increasing or decreasing elements
How to Handle:
The cost increases linearly, close to optimal for sorted input.
Instructions array containing large numbers (potential for integer overflow in cost calculation)
How to Handle:
Use modular arithmetic (modulo 10^9 + 7) during cost accumulation to prevent integer overflow.
Maximum-sized instructions array (performance considerations)
How to Handle:
Use an efficient data structure like a Fenwick Tree or Segment Tree for insert position counting, ensuring logarithmic time complexity per instruction.
Instructions array with extreme boundary values (e.g., very large or very small numbers)
How to Handle:
Ensure the chosen data structure (e.g., Fenwick Tree or Segment Tree) can handle the range of input values without memory issues.
Duplicate instructions requiring the same insertion point multiple times
How to Handle:
Fenwick Tree will correctly update counts even with frequent duplicate insertions.
0/0 completed