Taro Logo

Maximum Average Pass Ratio

Medium
Asked by:
Profile picture
Profile picture
Profile picture
Profile picture
17 views
Topics:
Greedy AlgorithmsArrays

There is a school that has classes of students and each class will be having a final exam. You are given a 2D integer array classes, where classes[i] = [passi, totali]. You know beforehand that in the ith class, there are totali total students, but only passi number of students will pass the exam.

You are also given an integer extraStudents. There are another extraStudents brilliant students that are guaranteed to pass the exam of any class they are assigned to. You want to assign each of the extraStudents students to a class in a way that maximizes the average pass ratio across all the classes.

The pass ratio of a class is equal to the number of students of the class that will pass the exam divided by the total number of students of the class. The average pass ratio is the sum of pass ratios of all the classes divided by the number of the classes.

Return the maximum possible average pass ratio after assigning the extraStudents students. Answers within 10-5 of the actual answer will be accepted.

Example 1:

Input: classes = [[1,2],[3,5],[2,2]], extraStudents = 2
Output: 0.78333
Explanation: You can assign the two extra students to the first class. The average pass ratio will be equal to (3/4 + 3/5 + 2/2) / 3 = 0.78333.

Example 2:

Input: classes = [[2,4],[3,9],[4,5],[2,10]], extraStudents = 4
Output: 0.53485

Constraints:

  • 1 <= classes.length <= 105
  • classes[i].length == 2
  • 1 <= passi <= totali <= 105
  • 1 <= extraStudents <= 105

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the values of `passes` and `total` in each class, and what are the maximum possible values for `passes` and `total`?
  2. Can the input array `classes` be empty or null, and if so, what should I return?
  3. Can the value of `extraStudents` be zero? What happens if `extraStudents` is very large compared to the size of the `classes` array?
  4. If multiple distributions of `extraStudents` result in the same maximum average pass ratio, is any valid distribution acceptable?
  5. Is `passes` always guaranteed to be less than or equal to `total` for each class? What should I do if `passes` is greater than `total`?

Brute Force Solution

Approach

The brute force approach to maximizing the average pass ratio involves trying every single possible way to add an extra student to different classes. We then calculate the average pass ratio for each of these possibilities and pick the best one. This method is extremely thorough but not efficient.

Here's how the algorithm would work step-by-step:

  1. For each class, consider adding one extra student to it.
  2. Calculate what the new pass ratio would be if you added that student to that class.
  3. Do this for every single class, one at a time, and keep track of the new pass ratio each time.
  4. For each of those new scenarios (each representing one class with one extra student), calculate the overall average pass ratio across all the classes.
  5. Compare all the overall average pass ratios you just calculated.
  6. Choose the scenario (which class got the extra student) that resulted in the highest overall average pass ratio. That's your answer.

Code Implementation

def maximum_average_pass_ratio_brute_force(classes, extra_students):
    best_average_pass_ratio = 0
    best_class_index = -1

    # Iterate through each class.
    for class_index in range(len(classes)):
        passes, total = classes[class_index]

        # Simulate adding an extra student to the class.
        new_passes = passes + extra_students
        new_total = total + extra_students

        # Ensure that new_passes does not exceed new_total.
        new_passes = min(new_passes, new_total)

        # Calculate the new average pass ratio.
        current_average_pass_ratio = 0
        for i in range(len(classes)):
            if i == class_index:
                current_average_pass_ratio += new_passes / new_total
            else:
                passes_other, total_other = classes[i]
                current_average_pass_ratio += passes_other / total_other

        current_average_pass_ratio /= len(classes)

        # Track if the new average pass ratio is better.
        if current_average_pass_ratio > best_average_pass_ratio:
            best_average_pass_ratio = current_average_pass_ratio
            best_class_index = class_index

    # If no class improves average pass ratio, use original ratios
    if best_class_index == -1:
        total_average_pass_ratio = 0
        for passes, total in classes:
            total_average_pass_ratio += passes / total
        return total_average_pass_ratio / len(classes)

    # Recalculate best average with chosen class incremented
    passes, total = classes[best_class_index]

    new_passes = passes + extra_students
    new_total = total + extra_students

    new_passes = min(new_passes, new_total)

    best_average_pass_ratio = 0
    for i in range(len(classes)):
        if i == best_class_index:
            best_average_pass_ratio += new_passes / new_total
        else:
            passes_other, total_other = classes[i]
            best_average_pass_ratio += passes_other / total_other

    return best_average_pass_ratio / len(classes)

Big(O) Analysis

Time Complexity
O(n)The described algorithm iterates through each of the n classes. For each class, it calculates the new pass ratio if one student is added to that class, and then calculates the overall average pass ratio across all classes. Since the calculation within each iteration takes constant time, and we iterate through all n classes, the overall time complexity is driven by the single loop through the classes array. Thus, the time complexity is O(n).
Space Complexity
O(1)The provided plain English solution describes a brute-force approach where we iterate through the classes one at a time, compute a new pass ratio for each, and compare the overall average pass ratios. No auxiliary data structures like arrays, lists, or hash maps that scale with the number of classes (let's denote this as N) are used to store intermediate states or scenarios. Only a few variables are needed to track the best average and the index of the corresponding class. Therefore, the space used is constant regardless of the number of classes.

Optimal Solution

Approach

The key to maximizing the overall average pass ratio lies in strategically allocating extra students to classes where they will have the most impact. This is done by prioritizing classes where adding a student significantly improves the pass ratio, thus driving up the average more effectively.

Here's how the algorithm would work step-by-step:

  1. First, figure out how much each class's pass ratio would improve if we added one more student who passes.
  2. Then, find the class where adding a student would increase the pass ratio the MOST.
  3. Give that class one more student. Update the number of students and the number of passing students for that class.
  4. Recalculate the improvement each class would see by adding another student (since the ratios have changed).
  5. Repeat steps 2-4 until you've used up all the extra students you can add.
  6. Finally, calculate the average pass ratio across all classes based on the final counts.

Code Implementation

import heapq

def maximum_average_pass_ratio(classes, extra_students):
    # Use a max-heap to track classes with the highest potential improvement.
    heap = []
    for passing_students, total_students in classes:
        improvement = (passing_students + 1) / (total_students + 1) - passing_students / total_students
        heapq.heappush(heap, (-improvement, passing_students, total_students))

    # Greedily allocate extra students to the classes with the most improvement.
    for _ in range(extra_students):
        negative_improvement, passing_students, total_students = heapq.heappop(heap)

        # Increment students and update heap with the new improvement value.
        passing_students += 1
        total_students += 1
        improvement = (passing_students + 1) / (total_students + 1) - passing_students / total_students
        heapq.heappush(heap, (-improvement, passing_students, total_students))

    # After allocation, calculate the final average pass ratio.
    average_pass_ratio = 0.0
    while heap:
        _, passing_students, total_students = heapq.heappop(heap)
        average_pass_ratio += passing_students / total_students

    return average_pass_ratio / len(classes)

Big(O) Analysis

Time Complexity
O(n log n + extraStudents log n)The algorithm iterates 'extraStudents' times to distribute the extra students among the classes. In each iteration, it needs to find the class that maximizes the pass ratio improvement. Finding the maximum improvement can be efficiently done using a priority queue (heap) of size 'n' (number of classes). Building the initial heap takes O(n) time. Each iteration involves extracting the maximum (O(1)), updating the heap (O(log n) after modifying the class's pass and total students), and re-inserting the modified class back into the heap (O(log n)). Thus the loop runs 'extraStudents' times with a cost of O(log n) per iteration, yielding O(extraStudents log n). Since we create the heap initially, the total time complexity is O(n + extraStudents log n). If extraStudents is much larger than n, it approximates O(extraStudents log n). If extraStudents is small, n will dominate, it’s O(n log n). The problem usually has extraStudents > n, so we consider the total complexity O(n log n + extraStudents log n).
Space Complexity
O(1)The provided plain English explanation focuses on iterative updates and calculations performed directly on the input data. While the algorithm implicitly maintains temporary variables for calculations like improvement in pass ratio and tracking the class with the maximum improvement, these are constant in number and do not scale with the input size, N, where N represents the number of classes. No auxiliary data structures like lists, maps, or recursion stacks are created that depend on N. Therefore, the auxiliary space complexity is O(1).

Edge Cases

CaseHow to Handle
Empty classes arrayReturn 0 if the classes array is empty, as there are no pass ratios to average.
ExtraStudents is zeroIf extraStudents is zero, return the initial average pass ratio calculated without adding any students.
Classes with zero students and zero passingHandle classes with 0/0 case by either skipping them in the average calculation or defining the pass ratio as 0 or 1, depending on the problem context.
Classes with all students passing already (pass ratio of 1)Avoid adding extra students to classes that already have a pass ratio of 1 to prevent floating point imprecision issues and wasted operations.
Large ExtraStudents value exceeding total students in multiple classesEnsure that we don't add more students than needed to bring the pass ratio of each class to 1, stopping when extraStudents becomes 0.
Floating-point precision errors during pass ratio calculationsUse appropriate data types (e.g., double) and consider a small tolerance when comparing floating-point numbers, like in the heap comparison.
Integer overflow in pass, total, or ExtraStudentsUse long data types for pass, total, and ExtraStudents to prevent potential integer overflows during calculations.
Classes array with very large number of elements, impacting performanceEnsure the heap based solution implementation scales well by using efficient heap operations (logarithmic complexity) to maintain overall performance.