
Single-Threaded CPU

Medium
DoorDash
Topics:
Greedy Algorithms

You are given n tasks labeled from 0 to n - 1, represented by a 2D integer array tasks, where tasks[i] = [enqueueTime_i, processingTime_i] means that the ith task will be available to process at enqueueTime_i and will take processingTime_i to finish processing.

You have a single-threaded CPU that can process at most one task at a time and will act in the following way:

  • If the CPU is idle and there are no available tasks to process, the CPU remains idle.
  • If the CPU is idle and there are available tasks, the CPU will choose the one with the shortest processing time. If multiple tasks have the same shortest processing time, it will choose the task with the smallest index.
  • Once a task is started, the CPU will process the entire task without stopping.
  • The CPU can finish a task then start a new one instantly.

Return the order in which the CPU will process the tasks.

Example 1:

Input: tasks = [[1,2],[2,4],[3,2],[4,1]]
Output: [0,2,3,1]
Explanation: The events go as follows: 
- At time = 1, task 0 is available to process. Available tasks = {0}.
- Also at time = 1, the idle CPU starts processing task 0. Available tasks = {}.
- At time = 2, task 1 is available to process. Available tasks = {1}.
- At time = 3, task 2 is available to process. Available tasks = {1, 2}.
- Also at time = 3, the CPU finishes task 0 and starts processing task 2 as it is the shortest. Available tasks = {1}.
- At time = 4, task 3 is available to process. Available tasks = {1, 3}.
- At time = 5, the CPU finishes task 2 and starts processing task 3 as it is the shortest. Available tasks = {1}.
- At time = 6, the CPU finishes task 3 and starts processing task 1. Available tasks = {}.
- At time = 10, the CPU finishes task 1 and becomes idle.
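This timeline can be reproduced with a short, direct simulation of the stated rules. The helper below is a standalone verification sketch (the function name is illustrative):

```python
def simulate_cpu(tasks):
    # Directly apply the rules: repeatedly pick, among the tasks that
    # have arrived, the one with the shortest processing time (ties
    # broken by the smaller index); if none has arrived, jump the
    # clock to the next arrival.
    remaining = set(range(len(tasks)))
    current_time = 0
    order = []
    while remaining:
        available = [i for i in remaining if tasks[i][0] <= current_time]
        if not available:
            current_time = min(tasks[i][0] for i in remaining)
            continue
        chosen = min(available, key=lambda i: (tasks[i][1], i))
        current_time += tasks[chosen][1]
        remaining.remove(chosen)
        order.append(chosen)
    return order

print(simulate_cpu([[1, 2], [2, 4], [3, 2], [4, 1]]))  # [0, 2, 3, 1]
```

Running it on Example 2's input produces [4, 3, 2, 0, 1] as well, matching the expected outputs.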

Example 2:

Input: tasks = [[7,10],[7,12],[7,5],[7,4],[7,2]]
Output: [4,3,2,0,1]
Explanation: The events go as follows:
- At time = 7, all the tasks become available. Available tasks = {0,1,2,3,4}.
- Also at time = 7, the idle CPU starts processing task 4. Available tasks = {0,1,2,3}.
- At time = 9, the CPU finishes task 4 and starts processing task 3. Available tasks = {0,1,2}.
- At time = 13, the CPU finishes task 3 and starts processing task 2. Available tasks = {0,1}.
- At time = 18, the CPU finishes task 2 and starts processing task 0. Available tasks = {1}.
- At time = 28, the CPU finishes task 0 and starts processing task 1. Available tasks = {}.
- At time = 40, the CPU finishes task 1 and becomes idle.

Constraints:

  • tasks.length == n
  • 1 <= n <= 10^5
  • 1 <= enqueueTime_i, processingTime_i <= 10^9
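A quick magnitude check on these constraints is worthwhile when implementing in a language with fixed-width integers (the variable names below are illustrative):

```python
max_tasks = 10**5
max_processing_time = 10**9
max_enqueue_time = 10**9

# Worst case: the first task arrives at the latest possible time and
# every task takes the maximum processing time, so the final task
# finishes around 10^14.
worst_case_finish = max_enqueue_time + max_tasks * max_processing_time

print(worst_case_finish)              # 100001000000000
print(worst_case_finish > 2**31 - 1)  # True: overflows 32-bit signed ints
print(worst_case_finish < 2**63 - 1)  # True: fits in 64-bit signed ints
```

Completion times therefore need a 64-bit type in languages like Java or C++; Python integers are arbitrary precision.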

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of possible values for the tasks' enqueue times and processing times? Are they integers?
  2. Can the input array of tasks be empty? What should I return in that case?
  3. If multiple tasks are available to process at the current CPU time, how should I prioritize them (e.g., by index, shortest processing time)?
  4. Is the input array sorted by enqueue time, or can it be in any order?
  5. Can multiple tasks share the same enqueue time, the same processing time, or both?

Brute Force Solution

Approach

The brute force approach leans on the fact that the CPU's rules are deterministic: exactly one processing order is consistent with them. So we can enumerate every possible order of the tasks and test each one against the rules, returning the order the CPU would actually follow.

Here's how the algorithm would work step-by-step:

  1. Generate every possible order in which the tasks could be processed.
  2. For each order, replay the tasks on a simulated CPU one after another, keeping track of the current time.
  3. Before each task, if none of the remaining tasks has been enqueued yet, advance the clock to the earliest upcoming enqueue time (the CPU was idle).
  4. Check that the task this order schedules next is the one the CPU would pick: among the remaining tasks already enqueued, the one with the shortest processing time, with ties broken by the smaller index.
  5. If the check fails, discard this order and move on to the next one.
  6. If every step of an order passes the check, that order is the one the CPU follows, so return it.

Code Implementation

import itertools

def single_threaded_cpu(tasks):
    number_of_tasks = len(tasks)
    task_indices = range(number_of_tasks)

    for task_order in itertools.permutations(task_indices):
        current_time = 0
        is_valid_order = True

        for position, task_index in enumerate(task_order):
            remaining_tasks = task_order[position:]

            # If nothing is enqueued yet, the CPU idles until the
            # earliest upcoming enqueue time
            earliest_enqueue_time = min(tasks[i][0] for i in remaining_tasks)
            current_time = max(current_time, earliest_enqueue_time)

            # The CPU must run the available task with the shortest
            # processing time, breaking ties by the smaller index
            available_tasks = [i for i in remaining_tasks
                               if tasks[i][0] <= current_time]
            cpu_choice = min(available_tasks,
                             key=lambda i: (tasks[i][1], i))

            if cpu_choice != task_index:
                is_valid_order = False
                break

            current_time += tasks[task_index][1]

        if is_valid_order:
            return list(task_order)

    return []

Big(O) Analysis

Time Complexity
O(n! * n^2). There are n! possible orders of the n tasks. Verifying a single order replays all n tasks, and each step scans the remaining tasks to find the earliest enqueue time and the CPU's required choice, which costs O(n). Each verification is therefore O(n^2), giving O(n! * n^2) overall.
Space Complexity
O(n). Iterating over itertools.permutations lazily yields one order of length n at a time, and the simulation keeps only a few scalars alongside it, so the auxiliary space is O(n).
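To see why the factorial term is prohibitive, it helps to look at how fast n! grows even for tiny n:

```python
import math

# Number of candidate orders the brute force must examine
for number_of_tasks in (5, 8, 10, 12):
    print(number_of_tasks, math.factorial(number_of_tasks))
```

At n = 12 there are already 479,001,600 candidate orders, so this approach is only viable for single-digit n, while the constraints allow n up to 10^5.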

Optimal Solution

Approach

The core idea is to simulate the CPU directly while avoiding a full rescan of the tasks at every step. Sorting the tasks by enqueue time tells us exactly when each one becomes available, and a min-heap keyed by (processing time, original index) always surfaces the task the CPU must run next. Every scheduling decision then costs a logarithmic heap operation instead of a linear scan.

Here's how the algorithm would work step-by-step:

  1. Pair each task with its original index, then sort the pairs by enqueue time.
  2. Maintain a min-heap of available tasks, ordered by processing time and then by original index to resolve ties.
  3. Keep track of the CPU's current time.
  4. Push onto the heap every not-yet-pushed task whose enqueue time is at or before the current time.
  5. If the heap is empty, the CPU is idle: jump the current time forward to the next enqueue time.
  6. Otherwise, pop the top of the heap, record its original index in the answer, and advance the current time by its processing time.
  7. Repeat steps 4-6 until every task has been processed. The recorded indices are the answer.

Code Implementation

import heapq

def single_threaded_cpu(tasks):
    number_of_tasks = len(tasks)
    # Attach original indices, then sort by enqueue time
    sorted_tasks = sorted(
        (enqueue_time, processing_time, task_index)
        for task_index, (enqueue_time, processing_time) in enumerate(tasks)
    )

    processing_order = []
    available_tasks = []  # min-heap of (processing_time, task_index)
    current_time = 0
    next_task_position = 0

    while len(processing_order) < number_of_tasks:
        # Enqueue every task that has arrived by the current time
        while (next_task_position < number_of_tasks and
               sorted_tasks[next_task_position][0] <= current_time):
            _, processing_time, task_index = sorted_tasks[next_task_position]
            heapq.heappush(available_tasks, (processing_time, task_index))
            next_task_position += 1

        if not available_tasks:
            # The CPU is idle: jump ahead to the next arrival
            current_time = sorted_tasks[next_task_position][0]
            continue

        # Run the shortest available task (ties broken by index)
        processing_time, task_index = heapq.heappop(available_tasks)
        current_time += processing_time
        processing_order.append(task_index)

    return processing_order

Big(O) Analysis

Time Complexity
O(n log n). Sorting the tasks by enqueue time costs O(n log n). After that, every task is pushed onto and popped off the heap exactly once, and each heap operation costs O(log n), contributing another O(n log n). Jumping the clock over an idle period is constant-time, so the total is O(n log n).
Space Complexity
O(n). The sorted list of (enqueue time, processing time, index) triples, the heap, and the output list each hold at most n entries, so the auxiliary space is O(n).
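The tie-breaking rule from the problem statement (shortest processing time first, then smallest index) maps directly onto Python's element-by-element tuple comparison, which is why a min-heap of (processing_time, index) pairs is sufficient. A minimal illustration with heapq:

```python
import heapq

# Tuples compare element by element, so the heap pops the shortest
# processing time first and breaks ties on the smaller index.
available_tasks = []
for processing_time, task_index in [(2, 3), (1, 4), (2, 1)]:
    heapq.heappush(available_tasks, (processing_time, task_index))

pop_order = [heapq.heappop(available_tasks)[1] for _ in range(3)]
print(pop_order)  # [4, 1, 3]
```

Task 4 comes out first because its processing time (1) is smallest; tasks 1 and 3 both take 2 units, so the smaller index wins.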

Edge Cases

Case | How to Handle
Empty tasks array | Return an empty list, since there are no tasks to process.
All tasks share the same enqueue time | Every task becomes available at once, so the order is decided purely by processing time, with ties broken by the smaller index.
Zero or negative processing times | The constraints guarantee processing times of at least 1; state that assumption, and clarify with the interviewer if the input could violate it.
Integer overflow when calculating completion time | Completion times can reach roughly 10^14 (10^5 tasks of up to 10^9 each), so use a 64-bit integer type in fixed-width languages; Python integers are arbitrary precision.
Many short tasks mixed with one very long task | The long task delays everything enqueued behind it; the simulation must stay efficient while many tasks accumulate in the queue during that interval.
Tasks appear in reverse order of enqueue time | Sorting by enqueue time (or feeding a priority queue) makes the input order irrelevant.
Tasks enqueue while the CPU is idle | Advance the CPU clock directly to the next enqueue time instead of stepping forward one time unit at a time.
Maximum number of tasks (n = 10^5) | An O(n log n) sort-plus-heap approach handles this comfortably; avoid factorial enumeration or a full O(n) scan per scheduling decision.