Taro Logo

Random Pick Index

Medium
Google logo
Google
4 views
Topics:
Arrays

Given an integer array nums with possible duplicates, randomly output the index of a given target number. You can assume that the given target number must exist in the array.

Implement the Solution class:

  • Solution(int[] nums) Initializes the object with the array nums.
  • int pick(int target) Picks a random index i from nums where nums[i] == target. If there are multiple valid i's, then each index should have an equal probability of returning.

Example 1:

Input
["Solution", "pick", "pick", "pick"]
[[[1, 2, 3, 3, 3]], [3], [1], [3]]
Output
[null, 4, 0, 2]

Explanation
Solution solution = new Solution([1, 2, 3, 3, 3]);
solution.pick(3); // It should return either index 2, 3, or 4 randomly. Each index should have equal probability of returning.
solution.pick(1); // It should return 0. Since in the array only nums[0] is equal to 1.
solution.pick(3); // It should return either index 2, 3, or 4 randomly. Each index should have equal probability of returning.

Constraints:

  • 1 <= nums.length <= 2 * 104
  • -231 <= nums[i] <= 231 - 1
  • target is an integer from nums.
  • At most 104 calls will be made to pick.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of values within the input array, and is it possible to have negative numbers or zero?
  2. If the target value appears multiple times in the array, are you looking for a uniformly random selection of the indices, or is there a specific bias I should be aware of?
  3. What should I return if the target value is not present in the input array?
  4. How large can the input array be, and how many times will the `pick` function be called, roughly, in order to optimize my solution's performance?
  5. Is the input array mutable? In other words, can I modify the array during initialization?

Brute Force Solution

Approach

The brute force method for this problem is straightforward. We will look through the entire collection of numbers every time we need to pick a random spot where our special number is located. This approach guarantees we find a suitable position if it exists.

Here's how the algorithm would work step-by-step:

  1. Whenever we need to pick a random position for our special number, we will start from the very beginning of the entire collection of numbers.
  2. We'll go through each number in the collection, one by one.
  3. If we find a number that matches our special number, we'll remember its position.
  4. After checking every single number in the collection, we'll have a list of all the positions where our special number appears.
  5. Finally, we'll pick one of these positions at random and return it.

Code Implementation

import random

class Solution:

    def __init__(self, numbers):
        self.numbers = numbers

    def pick(self, target):
        # Find all indices where the number appears
        target_indices = []

        for index in range(len(self.numbers)):
            if self.numbers[index] == target:

                # Remember this index because value matches.
                target_indices.append(index)

        # Choose a random index from available indices
        chosen_index = random.choice(target_indices)

        return chosen_index

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the entire input array of size n every time the pick function is called. This linear scan is necessary to identify all indices where the target number appears. Even though a random number generator is used to select from the identified indices, the dominant factor in the time complexity is still the initial iteration across all n elements. Therefore the time complexity of this approach is O(n).
Space Complexity
O(N)The described brute force approach involves creating a list to store all positions where the target number appears. In the worst-case scenario, where every element in the input array equals the target number, this list will contain N indices, where N is the size of the input array. Thus, the space required scales linearly with the input size. Therefore, the auxiliary space complexity is O(N).

Optimal Solution

Approach

The challenge is to pick a random spot that contains a specific value. We don't want to store every location of the value because that would take too much space. Instead, we'll use a clever trick to pick a random spot on-the-fly as we look at the input.

Here's how the algorithm would work step-by-step:

  1. Go through the input, one item at a time.
  2. Keep a count of how many times you've seen the specific value you're looking for.
  3. Each time you find the value, pick a random number between 1 and the current count.
  4. If the random number is 1, then remember this spot as the chosen one.
  5. Continue through the entire input. The last chosen spot will be the randomly selected one.

Code Implementation

import random

class Solution:

    def __init__(self, numbers):
        self.numbers = numbers

    def pick(self, target):
        chosen_index = -1
        occurrence_count = 0

        for index, number in enumerate(self.numbers):
            if number == target:
                occurrence_count += 1

                # Reservoir sampling: replace existing choice with
                # probability 1/occurrence_count
                if random.randint(1, occurrence_count) == 1:
                    chosen_index = index

        return chosen_index

Big(O) Analysis

Time Complexity
O(n)The algorithm iterates through the input array nums of size n exactly once. Inside the loop, a constant-time random number generation and comparison are performed. The number of times the random number generator is called depends on the frequency of the target value, but even in the worst case (where the target is every element), it's still a constant-time operation per element. Therefore, the overall time complexity is directly proportional to the size of the input array, resulting in O(n).
Space Complexity
O(1)The algorithm's auxiliary space usage is constant because it only stores a count and the index of the chosen element. Regardless of the size of the input array (N), the number of additional variables remains fixed. Therefore, the space complexity is O(1).

Edge Cases

CaseHow to Handle
Null or empty input arrayReturn -1 immediately as there are no possible indices to pick from.
Array with only one elementAlways return 0 as the only available index when the target matches this single element.
Large array size (scalability)Use Reservoir Sampling for efficiency to avoid storing all indices in memory.
Target value not present in the arrayReturn -1, indicating the target was not found in the array.
Integer overflow when calculating random indexEnsure the random number generation and index calculation are within integer limits or use a larger data type.
Array contains a large number of occurrences of the targetReservoir sampling will handle this correctly, giving each occurrence an equal probability of being selected.
The target value is near the min or max integer value.No special handling needed, as the algorithm searches directly for the numerical target value.
Random number generator has poor randomness.Use a well-established and tested random number generator for even index distribution.