Taro Logo

Search in a Sorted Array of Unknown Size

Medium
Google logo
Google
1 view
Topics:
ArraysBinary Search

You are given an integer array sorted in ascending order, but you do not know the array's length. You can access the array only through an ArrayReader interface, where ArrayReader.get(k) returns the k-th element of the array (0-indexed). If you try to access an index out of bounds, it returns 2147483647.

Your task is to implement a function to search for a target value in the array. If the target exists, return its index; otherwise, return -1. The algorithm's runtime complexity should be O(log k) where k is the index of target value.

Example:

Consider the following sorted array: [-1, 0, 3, 5, 9, 12]

  1. If target = 9, the function should return 4.
  2. If target = 2, the function should return -1.

Constraints:

  • All elements of the array are less than 10000.
  • If you access the array out of bounds, ArrayReader.get() returns 2147483647.
  • You must write an algorithm with O(log k) runtime complexity. k is the index of target value.

Solution


Search in a Sorted Array of Unknown Size

Naive Approach: Linear Search

The most straightforward approach is to perform a linear search. We start from index 0 and increment the index until we find the target or encounter a boundary condition (e.g., ArrayReader.get(index) returns a value greater than the target, suggesting the target is not in the array, or we hit an error implying the end of valid indices).

Code:

def search_in_unknown_size_linear(reader, target):
    index = 0
    while True:
        value = reader.get(index)
        if value == target:
            return index
        elif value > target:
            return -1
        elif value == 2147483647: # end of array, as specified by LeetCode
            return -1
        index += 1

Time Complexity:

O(N), where N is the index of the target (or the index where we determine the target is absent).

Space Complexity:

O(1), as we use only a constant amount of extra space.

Optimal Approach: Exponential Backoff + Binary Search

The optimal solution combines exponential backoff to find a suitable range for binary search, followed by binary search within that range.

  1. Exponential Backoff: Start with an index of 1. Double the index until reader.get(index) returns a value greater than or equal to the target, or reader.get(index) returns 2147483647 (out of bounds).
  2. Binary Search: Once we have a range (from index/2 to index), perform a standard binary search to find the target within that range.

Code:

def search_in_unknown_size(reader, target):
    # Exponential backoff to find the range
    low = 0
    high = 1
    while reader.get(high) < target:
        low = high
        high *= 2

    # Binary search within the range [low, high]
    while low <= high:
        mid = low + (high - low) // 2
        value = reader.get(mid)

        if value == target:
            return mid
        elif value < target:
            low = mid + 1
        elif value > target or value == 2147483647:
            high = mid - 1

    return -1

Time Complexity:

O(log N), where N is the index of the target. Exponential backoff takes O(log N) time to find the range, and binary search takes O(log N) time to search within that range. O(log N) + O(log N) simplifies to O(log N).

Space Complexity:

O(1), as we use only a constant amount of extra space for variables.

Edge Cases:

  • Target Not Found: The algorithm should return -1 if the target is not present in the array (either due to being smaller than the first element or larger than any element present).
  • Empty Array (Hypothetical): Although the problem doesn't explicitly state an empty array, the approach works correctly if the array is effectively empty based on the ArrayReader's behavior.
  • Large Target Values: The algorithm should handle cases where the target value is very large and might approach the maximum integer value.
  • ArrayReader.get() returning out of bounds: The ArrayReader.get() function is specified to return 2147483647 if it exceeds the bounds. The algorithm must check for this case.