Taro Logo

Non-overlapping Intervals

Medium
Amazon logo
Amazon
Topics:
ArraysGreedy Algorithms

Given an array of intervals intervals where intervals[i] = [start_i, end_i], return the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.

Note that intervals which only touch at a point are non-overlapping. For example, [1, 2] and [2, 3] are non-overlapping.

Example 1:

Input: intervals = [[1,2],[2,3],[3,4],[1,3]]
Output: 1
Explanation: [1,3] can be removed and the rest of the intervals are non-overlapping.

Example 2:

Input: intervals = [[1,2],[1,2],[1,2]]
Output: 2
Explanation: You need to remove two [1,2] to make the rest of the intervals non-overlapping.

Example 3:

Input: intervals = [[1,2],[2,3]]
Output: 0
Explanation: You don't need to remove any of the intervals since they're already non-overlapping.

Constraints:

  • 1 <= intervals.length <= 10^5
  • intervals[i].length == 2
  • -5 * 10^4 <= start_i < end_i <= 5 * 10^4

Solution


Minimum Number of Intervals to Remove to Make Non-Overlapping

This problem asks us to find the minimum number of intervals to remove from a given set of intervals such that the remaining intervals are non-overlapping.

Naive Approach (Brute Force)

A brute-force approach would involve generating all possible subsets of the intervals and, for each subset, checking if the intervals are non-overlapping. We keep track of the maximum size of non-overlapping subsets. The number of intervals to remove would be the original number of intervals minus the maximum size of a non-overlapping subset. This approach is highly inefficient.

  • Time Complexity: O(2n * n), where n is the number of intervals (2n to generate all subsets, and O(n) to check for overlaps in each subset).
  • Space Complexity: O(n) in the worst case to store a subset.

Optimal Approach (Greedy)

A more efficient approach uses a greedy algorithm. The key idea is to sort the intervals based on their end times. Then, we iterate through the sorted intervals and keep track of the last non-overlapping interval. If the current interval overlaps with the last non-overlapping interval, we increment the count of intervals to remove. Otherwise, we update the last non-overlapping interval to the current interval.

Algorithm:

  1. Sort the intervals in ascending order of their end times.
  2. Initialize remove_count to 0.
  3. Initialize last_end to negative infinity (or the smallest possible value).
  4. Iterate through the sorted intervals:
    • If the current interval's start time is less than last_end, then it overlaps with the previous non-overlapping interval. Increment remove_count.
    • Otherwise, update last_end to the end time of the current interval.
  5. Return remove_count.

Example:

Let's consider the intervals [[1,2],[2,3],[3,4],[1,3]].

  1. Sort by end times: [[1,2],[2,3],[3,4],[1,3]] becomes [[1,2],[2,3],[1,3],[3,4]]
  2. remove_count = 0
  3. last_end = -infinity

Iteration:

  • [1,2]: 1 >= last_end. last_end = 2
  • [2,3]: 2 >= last_end. last_end = 3
  • [1,3]: 1 < last_end. remove_count = 1
  • [3,4]: 3 >= last_end. last_end = 4

Return remove_count = 1

Code (Python):

def eraseOverlapIntervals(intervals):
    intervals.sort(key=lambda x: x[1])  # Sort by end times
    remove_count = 0
    last_end = float('-inf')

    for start, end in intervals:
        if start < last_end:
            remove_count += 1
        else:
            last_end = end

    return remove_count
  • Time Complexity: O(n log n) due to the sorting step, where n is the number of intervals. The iteration is O(n), but the sort dominates.
  • Space Complexity: O(1) (or O(n) depending on the sorting algorithm implementation). In-place sorting algorithms would require O(1) extra space. Algorithms that create a copy of the array to sort require O(n) space. In Python, the sort function is implemented using Timsort, which has an average space complexity of O(n).

Edge Cases

  • Empty input: If the input array is empty, no intervals need to be removed, so return 0.
  • Single interval: If there's only one interval, it's already non-overlapping, so return 0.
  • Intervals with the same start and end times: The algorithm handles these correctly as long as the intervals are sorted by end times.