Non-overlapping Intervals

Medium
5 days ago

Given an array of intervals intervals where intervals[i] = [start<sub>i</sub>, end<sub>i</sub>], return the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping. Note that intervals which only touch at a point are non-overlapping. For example, [1, 2] and [2, 3] are non-overlapping.

Example 1: Input: intervals = [[1,2],[2,3],[3,4],[1,3]] Output: 1 Explanation: [1,3] can be removed and the rest of the intervals are non-overlapping.

Example 2: Input: intervals = [[1,2],[1,2],[1,2]] Output: 2 Explanation: You need to remove two [1,2] to make the rest of the intervals non-overlapping.

Example 3: Input: intervals = [[1,2],[2,3]] Output: 0 Explanation: You don't need to remove any of the intervals since they're already non-overlapping.

Constraints: 1 <= intervals.length <= 10<sup>5</sup> intervals[i].length == 2 -5 * 10<sup>4</sup> <= start<sub>i</sub> < end<sub>i</sub> <= 5 * 10<sup>4</sup>

Sample Answer
from typing import List

class Solution:
    def eraseOverlapIntervals(self, intervals: List[List[int]]) -> int:
        # Sort the intervals based on their end times
        intervals.sort(key=lambda x: x[1])
        
        count = 0  # Number of intervals to remove
        end = float('-inf')  # End time of the previously selected non-overlapping interval
        
        for interval in intervals:
            start, curr_end = interval
            
            if start >= end:
                # Non-overlapping interval found, update the end time
                end = curr_end
            else:
                # Overlapping interval found, increment the count
                count += 1
        
        return count

Explanation:

The problem requires finding the minimum number of intervals to remove such that the remaining intervals are non-overlapping. A greedy approach can solve this problem efficiently.

  1. Sorting: Sort the intervals based on their end times. This ensures that we prioritize intervals that finish earlier, maximizing the number of non-overlapping intervals we can keep.
  2. Greedy Selection: Iterate through the sorted intervals. If the current interval's start time is greater than or equal to the end time of the previously selected interval, it means they are non-overlapping. In this case, we select the current interval and update the end time. Otherwise, we increment the count of intervals to remove.

Example:

Consider the input intervals = [[1,2],[2,3],[3,4],[1,3]].

  1. After sorting by end times, we get [[1,2],[2,3],[1,3],[3,4]].
  2. Initialize count = 0 and end = float('-inf').
  3. Iterate through the sorted intervals:
    • [1,2]: start=1, end=float('-inf'). Since 1 >= float('-inf'), update end = 2.
    • [2,3]: start=2, end=2. Since 2 >= 2, update end = 3.
    • [1,3]: start=1, end=3. Since 1 < 3, increment count = 1.
    • [3,4]: start=3, end=3. Since 3 >= 3, update end = 4.
  4. Return count = 1.

Big(O) Run-time Analysis:

  • Sorting: The dominant operation is sorting the intervals, which takes O(n log n) time, where n is the number of intervals.
  • Iteration: Iterating through the sorted intervals takes O(n) time.
  • Therefore, the overall time complexity is O(n log n).

Big(O) Space Usage Analysis:

  • The space complexity is O(1) (or O(n) depending on the sorting algorithm implementation) because we are only using a few extra variables. Sorting can be done in place, but some sorting algorithms might require O(n) extra space.

Edge Cases:

  1. Empty Input: If the input list of intervals is empty, no intervals need to be removed, so the function should return 0.
  2. Single Interval: If there is only one interval, no intervals need to be removed, so the function should return 0.
  3. Overlapping Intervals: The function should correctly handle cases where multiple intervals overlap.
  4. Intervals with Same End Time: The function should handle cases where intervals have the same end time correctly by prioritizing intervals with earlier end times.
  5. Large Number of Intervals: The function should be efficient enough to handle a large number of intervals (up to 10^5) without exceeding the time limit.