Taro Logo

Non-overlapping Intervals

Medium
Amazon logo
Amazon
2 views
Topics:
ArraysGreedy Algorithms

Given an array of intervals intervals where intervals[i] = [start_i, end_i], return the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.

Note that intervals which only touch at a point are non-overlapping. For example, [1, 2] and [2, 3] are non-overlapping.

Example 1:

Input: intervals = [[1,2],[2,3],[3,4],[1,3]]
Output: 1
Explanation: [1,3] can be removed and the rest of the intervals are non-overlapping.

Example 2:

Input: intervals = [[1,2],[1,2],[1,2]]
Output: 2
Explanation: You need to remove two [1,2] to make the rest of the intervals non-overlapping.

Example 3:

Input: intervals = [[1,2],[2,3]]
Output: 0
Explanation: You don't need to remove any of the intervals since they're already non-overlapping.

Constraints:

  • 1 <= intervals.length <= 10^5
  • intervals[i].length == 2
  • -5 * 10^4 <= start_i < end_i <= 5 * 10^4

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum size of the input `intervals` array, and what is the range of values for the start and end points of the intervals?
  2. Can the intervals be empty (start and end are the same), and if so, how should those be handled?
  3. Are the intervals guaranteed to be sorted in any particular order (e.g., by start time), and if not, should I assume they are unsorted?
  4. What should I return if the input `intervals` array is null or empty?
  5. If multiple sets of removals lead to the same minimum number of removals, is any one of them acceptable?

Brute Force Solution

Approach

The brute force method for non-overlapping intervals involves checking every possible combination of intervals to remove. We try removing one interval at a time, then two at a time, and so on, until we've tried all combinations.

Here's how the algorithm would work step-by-step:

  1. Start with the original list of intervals.
  2. Consider removing each interval individually. For each removal, check if the remaining intervals are non-overlapping.
  3. Next, consider removing every possible pair of intervals. For each pair removal, check if the remaining intervals are non-overlapping.
  4. Continue this process, considering removing every possible set of three intervals, four intervals, and so on, up to removing all but one interval.
  5. For each set of removals, count how many intervals were removed to make the remaining intervals non-overlapping.
  6. From all the possible removal sets that resulted in non-overlapping intervals, choose the set that has the fewest number of intervals removed. This is your answer.

Code Implementation

def non_overlapping_intervals_brute_force(intervals):
    number_of_intervals = len(intervals)
    minimum_removed_intervals = number_of_intervals

    for i in range(1 << number_of_intervals):
        removed_intervals_count = 0
        remaining_intervals = []

        # Determine which intervals to keep based on the bitmask
        for j in range(number_of_intervals):
            if (i >> j) & 1:
                remaining_intervals.append(intervals[j])
            else:
                removed_intervals_count += 1

        # Check if the remaining intervals are non-overlapping.
        is_non_overlapping = True
        remaining_intervals.sort(key=lambda interval: interval[0])

        for k in range(len(remaining_intervals) - 1):
            if remaining_intervals[k][1] > remaining_intervals[k + 1][0]:
                is_non_overlapping = False
                break

        # If the remaining intervals are non-overlapping, update the minimum count.
        if is_non_overlapping:
            # Update our best answer if we improved the number removed
            minimum_removed_intervals = min(minimum_removed_intervals, removed_intervals_count)

    return minimum_removed_intervals

Big(O) Analysis

Time Complexity
O(2^n)The brute force approach explores all possible subsets of intervals. For n intervals, there are 2^n possible subsets (each interval can either be included or excluded). For each subset, we need to check if the remaining intervals are non-overlapping, which takes O(n) time in the worst case. Therefore, the overall time complexity is O(n * 2^n). However, the dominant factor here is the generation of subsets, and even if the non-overlapping check was O(1), we would still have to generate 2^n subsets. Hence, we consider the dominant part 2^n to be Big O.
Space Complexity
O(1)The brute force approach described generates combinations of intervals to remove, but it doesn't explicitly store all possible combinations in memory. The algorithm iterates through different subsets of intervals for removal, checking for non-overlapping conditions. The space used is primarily for a few integer variables to track the number of removed intervals and potentially boolean flags to indicate overlapping status, all of which take constant space. Therefore, the auxiliary space complexity is O(1), meaning it does not grow with the number of intervals N.

Optimal Solution

Approach

The key to efficiently solving this problem is to focus on what intervals to keep, not which ones to remove. We want to keep as many non-overlapping intervals as possible. Sorting the intervals strategically makes the selection process straightforward.

Here's how the algorithm would work step-by-step:

  1. First, sort the intervals based on their end times.
  2. Consider the first interval in the sorted list. Since the intervals are sorted by their ending times, it's usually a great choice to keep the first interval.
  3. Go through the rest of the sorted intervals, one by one.
  4. If an interval does NOT overlap with the interval we kept in the previous step, then we keep this new interval. If it overlaps, we discard it.
  5. Repeat the previous step until we have gone through all of the intervals.
  6. The number of intervals we discarded is our answer.

Code Implementation

def erase_overlapping_intervals(intervals):
    number_of_intervals_to_remove = 0

    # Sort by end times to greedily choose intervals.
    intervals.sort(key=lambda interval: interval[1])

    if not intervals:
        return 0

    # Initialize with the first interval.
    end_time_of_previous_interval = intervals[0][1]

    for interval_start_time, interval_end_time in intervals[1:]:

        # If overlap, increment remove count
        if interval_start_time < end_time_of_previous_interval:
            number_of_intervals_to_remove += 1
        else:
            # Keep non-overlapping interval.
            end_time_of_previous_interval = interval_end_time

    return number_of_intervals_to_remove

Big(O) Analysis

Time Complexity
O(n log n)The dominant factor in the time complexity is the sorting of the intervals. Sorting n intervals using an efficient algorithm like merge sort or quicksort takes O(n log n) time. The subsequent iteration through the sorted intervals to determine overlaps takes O(n) time, as we iterate through the list once. Since O(n log n) grows faster than O(n), the overall time complexity is O(n log n).
Space Complexity
O(1)The algorithm sorts the input intervals in place, meaning no new array is created to store the sorted intervals. It then iterates through the sorted intervals using a few variables to track the last kept interval and current interval. No auxiliary data structures scale with the input size N (number of intervals), resulting in constant auxiliary space.

Edge Cases

CaseHow to Handle
Empty intervals arrayReturn 0 as no intervals need to be removed.
Intervals array with a single intervalReturn 0 as a single interval is inherently non-overlapping.
All intervals are identical, e.g., [[1,2],[1,2],[1,2]]Remove all but one interval; return len(intervals) - 1.
Intervals are already non-overlapping and sorted, e.g., [[1,2],[2,3],[3,4]]Return 0 as no removal is needed.
Intervals are nested, e.g., [[1,5],[2,3],[3,4]]Greedy approach of sorting by end time correctly identifies the intervals to remove to keep smallest end times first.
Intervals with negative start and end points, e.g., [[-2,-1],[1,2]]Algorithm should handle negative values correctly when sorting and comparing intervals.
Intervals with zero length, e.g., [[1,1],[2,3]]Treat zero-length intervals as valid intervals; sorting by end time handles these correctly.
Integer overflow when calculating interval length (though not directly applicable, consider large values)Ensure the sorting and comparison logic don't lead to integer overflows when dealing with very large start and end values; may consider using long data types