Taro Logo

Russian Doll Envelopes

Medium
144 views
2 months ago

You are given a 2D array of integers envelopes where envelopes[i] = [wᵢ, hᵢ] represents the width and the height of an envelope. One envelope can fit into another if and only if both the width and height of one envelope are greater than the other envelope's width and height. Return the maximum number of envelopes you can Russian doll (i.e., put one inside the other). Note: You cannot rotate an envelope. For example, given envelopes = [[5,4],[6,4],[6,7],[2,3]], the output is 3 because the envelopes can be Russian dolled in the order [2,3] => [5,4] => [6,7]. As another example, given envelopes = [[1,1],[1,1],[1,1]], the output is 1 because no envelope can be placed inside another.

Sample Answer
## Optimal Solution: Dynamic Programming with Binary Search

This problem can be solved using dynamic programming combined with binary search to optimize the process of finding the longest increasing subsequence of heights. The key idea is to first sort the envelopes based on their widths in ascending order. If widths are the same, sort the heights in descending order. This sorting ensures that for envelopes with the same width, we only consider the one with the largest height when constructing the increasing subsequence.

**Algorithm:**

1.  **Sort Envelopes:** Sort the `envelopes` array based on the following criteria:
    *   Ascending order of widths.
    *   If widths are equal, sort in descending order of heights.  This is crucial to prevent counting envelopes with the same width in the increasing subsequence.
2.  **Extract Heights:** Extract the heights from the sorted envelopes into a new array called `heights`.
3.  **Longest Increasing Subsequence (LIS):** Find the LIS of the `heights` array using binary search to optimize the process.
    *   Initialize an empty list called `tails` to store the smallest tail of all increasing subsequences with length `i+1`.
    *   Iterate through the `heights` array:
        *   If `heights[i]` is greater than all tails, extend the longest increasing subsequence by adding `heights[i]` to `tails`.
        *   Otherwise, find the smallest tail that is greater than or equal to `heights[i]` using binary search and replace that tail with `heights[i]`.
4.  **Return LIS Length:** The length of the `tails` array is the length of the LIS, which is the maximum number of envelopes that can be Russian dolled.

**Code:**

```python
import bisect

def maxEnvelopes(envelopes):
    # Sort envelopes by width (ascending) and height (descending if widths are equal)
    envelopes.sort(key=lambda x: (x[0], -x[1]))
    
    # Extract heights
    heights = [env[1] for env in envelopes]
    
    # Find Longest Increasing Subsequence (LIS) using binary search
    tails = []
    for height in heights:
        if not tails or height > tails[-1]:
            tails.append(height)
        else:
            # Find the smallest tail >= height and replace it
            index = bisect.bisect_left(tails, height)
            tails[index] = height
    
    return len(tails)

# Example usage:
envelopes = [[5,4],[6,4],[6,7],[2,3]]
result = maxEnvelopes(envelopes)
print(f"Maximum number of envelopes: {result}")  # Output: 3

envelopes = [[1,1],[1,1],[1,1]]
result = maxEnvelopes(envelopes)
print(f"Maximum number of envelopes: {result}")  # Output: 1

Big(O) Time Complexity Analysis:

  • Sorting: O(n log n), where n is the number of envelopes.
  • Extracting Heights: O(n)
  • Longest Increasing Subsequence (LIS): O(n log n), because we iterate through the heights array (O(n)) and perform a binary search for each element (O(log n)).

Therefore, the overall time complexity is O(n log n), dominated by the sorting and LIS operations.

Big(O) Space Complexity Analysis:

  • Sorted Envelopes (in-place sort): O(1) (if the sorting algorithm used is in-place, like heapsort).
  • Heights Array: O(n)
  • Tails Array (for LIS): O(n) in the worst case (when the heights array is strictly increasing).

Therefore, the overall space complexity is O(n).

Edge Cases and Handling:

  1. Empty Input: If the envelopes array is empty, return 0.
  2. Null or Invalid Input: Check for null or invalid input and handle appropriately (e.g., raise an exception or return 0).
  3. Duplicate Envelopes: The algorithm handles duplicate envelopes correctly due to the sorting criteria. Envelopes with the same width are sorted in descending order of height, ensuring that only one of them contributes to the increasing subsequence.
  4. Envelopes with Same Width and Height: These are handled correctly by the sorting criteria. The algorithm will only count one of them in the LIS.
  5. Large Input: The algorithm is efficient enough to handle large inputs (up to 10^5 envelopes) due to its O(n log n) time complexity.

Brute Force Solution (for comparison):

A brute force approach would involve checking all possible subsequences of envelopes and finding the longest one that forms a Russian doll sequence. This approach is highly inefficient.

Algorithm:

  1. Generate all possible subsequences of the envelopes array.
  2. For each subsequence, check if it forms a valid Russian doll sequence (i.e., each envelope can fit inside the next one).
  3. Keep track of the longest valid Russian doll sequence.

Code:

def maxEnvelopes_brute_force(envelopes):
    n = len(envelopes)
    max_count = 0

    def is_valid_sequence(seq):
        for i in range(len(seq) - 1):
            if not (seq[i][0] < seq[i+1][0] and seq[i][1] < seq[i+1][1]):
                return False
        return True

    def find_subsequences(index, current_sequence):
        nonlocal max_count
        if index == n:
            if is_valid_sequence(current_sequence):
                max_count = max(max_count, len(current_sequence))
            return

        # Include the current envelope
        find_subsequences(index + 1, current_sequence + [envelopes[index]])

        # Exclude the current envelope
        find_subsequences(index + 1, current_sequence)

    find_subsequences(0, [])
    return max_count

# Example usage:
envelopes = [[5,4],[6,4],[6,7],[2,3]]
result = maxEnvelopes_brute_force(envelopes)
print(f"Maximum number of envelopes (brute force): {result}")  # Output: 3

Time Complexity of Brute Force: O(2^n * n), where n is the number of envelopes. This is because there are 2^n possible subsequences, and for each subsequence, we need to check if it's a valid Russian doll sequence, which takes O(n) time in the worst case.

Space Complexity of Brute Force: O(n), due to the space used to store the subsequences during recursion.

Note: The brute force solution is highly inefficient and will likely time out for larger inputs!