Taro Logo

Merge Intervals

Medium
Walmart logo
Walmart
1 view
Topics:
ArraysGreedy Algorithms

Given an array of intervals where intervals[i] = [starti, endi], merge all overlapping intervals, and return an array of the non-overlapping intervals that cover all the intervals in the input.

Example 1:

Input: intervals = [[1,3],[2,6],[8,10],[15,18]]
Output: [[1,6],[8,10],[15,18]]
Explanation: Since intervals [1,3] and [2,6] overlap, merge them into [1,6].

Example 2:

Input: intervals = [[1,4],[4,5]]
Output: [[1,5]]
Explanation: Intervals [1,4] and [4,5] are considered overlapping.

Constraints:

  • 1 <= intervals.length <= 104
  • intervals[i].length == 2
  • 0 <= starti <= endi <= 104

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the expected data type of the interval start and end times (e.g., integers)? Are negative or non-integer values possible?
  2. Can the input array of intervals be empty or null? If so, what should I return?
  3. If intervals overlap, what is the precise definition of 'overlap'? For example, does `[1,3]` overlap with `[3,5]`?
  4. Is the order of the merged intervals in the output important? If so, what ordering should I use (e.g., sorted by start time)?
  5. Are the intervals guaranteed to be sorted or will they be in random order?

Brute Force Solution

Approach

We want to combine overlapping time periods. The most straightforward way is to look at every possible pair of time periods and see if they overlap, combining them if they do.

Here's how the algorithm would work step-by-step:

  1. For each time period, compare it to every other time period.
  2. If two time periods overlap, create a new time period that covers both of them.
  3. Replace the original two time periods with the new combined one.
  4. Repeat this process, comparing each time period with every other time period, combining where necessary.
  5. Keep doing this until you've gone through all the time periods and no more combining can be done.
  6. You will be left with a final set of time periods, where none of them overlap.

Code Implementation

def merge_intervals_brute_force(intervals):
    intervals_list = intervals[:] # Copy to avoid modifying the original
    number_of_intervals = len(intervals_list)

    while True:
        merged_at_least_once = False

        for first_interval_index in range(number_of_intervals):
            for second_interval_index in range(first_interval_index + 1, number_of_intervals):
                first_interval = intervals_list[first_interval_index]
                second_interval = intervals_list[second_interval_index]

                # Check if intervals overlap
                if (first_interval[0] <= second_interval[1] and first_interval[1] >= second_interval[0]):

                    # Create merged interval
                    merged_interval_start = min(first_interval[0], second_interval[0])
                    merged_interval_end = max(first_interval[1], second_interval[1])
                    merged_interval = [merged_interval_start, merged_interval_end]

                    # Replace the first interval with the merged interval
                    intervals_list[first_interval_index] = merged_interval

                    # Remove the second interval
                    del intervals_list[second_interval_index]
                    number_of_intervals -= 1
                    merged_at_least_once = True
                    break # Restart the inner loop

            if merged_at_least_once:
                break # Restart the outer loop

        # No merges occurred, so break
        if not merged_at_least_once:
            break

    return intervals_list

Big(O) Analysis

Time Complexity
O(n³)The algorithm iterates through each of the n intervals and compares it with every other interval, resulting in a nested loop. If intervals overlap, they are merged, and the outer loop restarts to ensure the newly merged interval is compared with all other intervals. In the worst case, merging could occur in each iteration of the outer loop, leading to another iteration through the intervals. Thus, the cost is driven by potentially comparing and merging in nested loops, with the outer loop potentially iterating n times. This equates to approximately n * n * n/2 operations, which simplifies to O(n³).
Space Complexity
O(1)The provided algorithm operates in-place, modifying the original list of time periods directly. Although new time periods are created when overlaps are found, they immediately replace the original overlapping intervals. Therefore, no extra data structures, such as temporary lists or hash maps that scale with the input size N (the number of time periods), are used. The space required remains constant, irrespective of the input size.

Optimal Solution

Approach

The most efficient way to combine overlapping time periods is to first arrange them in order based on when they start. Then, we go through them one by one, merging them as we go.

Here's how the algorithm would work step-by-step:

  1. First, put all the time periods in order from earliest start time to latest start time.
  2. Take the first time period and compare it to the next one.
  3. If the second time period starts before the first one ends, combine them into a single, larger time period.
  4. If the second time period starts after the first one ends, the first time period is complete, and we move on to the second time period.
  5. Continue comparing and merging time periods until we've checked all of them.
  6. The time periods that are left are the combined, non-overlapping time periods.

Code Implementation

def merge_intervals(intervals):
    # Sort intervals by start time.
    intervals.sort(key=lambda interval: interval[0])

    merged_intervals = []
    
    for interval in intervals:
        # If the list of merged intervals is empty or if the current
        # interval does not overlap with the last interval, append it.
        if not merged_intervals or merged_intervals[-1][1] < interval[0]:
            merged_intervals.append(interval)
        else:
            # Otherwise, there is overlap, so we merge the current interval
            # with the last one.
            merged_intervals[-1][1] = max(merged_intervals[-1][1], interval[1])

    return merged_intervals

Big(O) Analysis

Time Complexity
O(n log n)The initial step involves sorting the input intervals based on their start times, which takes O(n log n) time. Subsequently, we iterate through the sorted intervals once, performing constant-time comparisons and merges. This linear pass contributes O(n) time. Since sorting dominates the time complexity, the overall time complexity is O(n log n) + O(n), which simplifies to O(n log n).
Space Complexity
O(N)The provided algorithm sorts the input intervals. If the sorting is done in place, it might require O(1) auxiliary space. However, many standard sorting algorithms (like merge sort used in some implementations) take O(N) auxiliary space, where N is the number of intervals. Additionally, we are collecting the merged intervals into a result list. In the worst case where no intervals overlap, this result list will contain all N intervals. Thus, the algorithm uses O(N) auxiliary space to store the sorted intervals (potentially) and the merged interval list, leading to a space complexity of O(N).

Edge Cases

CaseHow to Handle
Null or empty input arrayReturn an empty array or throw an IllegalArgumentException depending on the requirements.
Input array with only one intervalReturn the input array directly as there's nothing to merge.
Intervals are already non-overlapping and sortedThe algorithm should still correctly return the original array after sorting (if sorting is part of the solution).
Intervals are in reverse order and overlapping heavily.The sorting step will handle reordering, and subsequent merging logic should correctly combine them.
Intervals with large start and end values (potential for integer overflow if calculating lengths)Ensure comparisons and calculations avoid overflow, potentially using long data types or careful subtraction.
Intervals with identical start and end times (e.g., [1, 1], [2, 2])The merging logic should treat these as valid intervals and handle them appropriately.
A very large number of overlapping intervals, potentially impacting performance.Ensure the sorting algorithm used is efficient (O(n log n)), and the merging process is linear, preventing quadratic time complexity.
Intervals with start time equal to another interval's end time (e.g., [1, 2], [2, 3])The problem description implies these should be merged so this case should result in [1,3].