You are given a 0-indexed array of positive integers w
where w[i]
describes the weight of the i<sup>th</sup>
index.
You need to implement the function pickIndex()
, which randomly picks an index in the range [0, w.length - 1]
(inclusive) and returns it. The probability of picking an index i
is w[i] / sum(w)
.
w = [1, 3]
, the probability of picking index 0
is 1 / (1 + 3) = 0.25
(i.e., 25%
), and the probability of picking index 1
is 3 / (1 + 3) = 0.75
(i.e., 75%
).Could you provide a solution that efficiently implements the pickIndex()
function, considering the constraints on the input size and the number of calls to pickIndex()
?
The problem requires implementing a pickIndex()
function that randomly selects an index from an array of positive integers, where each integer represents the weight of its corresponding index. The probability of selecting an index i
is proportional to its weight w[i]
divided by the sum of all weights. This ensures indices with higher weights are more likely to be selected.
A naive approach involves creating a new array where each index i
from the original array w
is repeated w[i]
times. Then, a random index is selected from this expanded array. This selected index corresponds to an index in the original array.
Pros:
Cons:
Example:
If w = [1, 3]
, the expanded array would be [0, 1, 1, 1]
. A random number between 0 and 3 is selected. If 0 is selected, we return 0. If 1, 2, or 3 is selected, we return 1.
Code:
import random
class Solution:
def __init__(self, w):
self.expanded_array = []
for i, weight in enumerate(w):
self.expanded_array.extend([i] * weight)
def pickIndex(self):
return random.choice(self.expanded_array)
Complexity Analysis:
__init__
: O(Sum of weights) because we are constructing a new array that contains each index repeated by its weight.pickIndex
: O(1) since picking a random element from an array is a constant-time operation.A more efficient solution involves calculating the prefix sum of the weights. The prefix sum array represents the cumulative weights up to each index. A random number between 1 and the total sum of weights is generated. Binary search is then used to find the index in the prefix sum array where the random number falls. This index is the selected index.
Algorithm:
prefix_sums
. prefix_sums[i]
stores the sum of weights from w[0]
to w[i]
.target
between 1 and the total sum of weights (inclusive).idx
in prefix_sums
such that prefix_sums[idx] >= target
. This idx
is the index to be returned.Example:
If w = [1, 3, 2]
, the prefix sum array would be [1, 4, 6]
. The total sum is 6. Let's say the random number generated is 3. We perform binary search on [1, 4, 6]
to find the smallest index where the value is greater than or equal to 3. This would be index 1 (value 4), so we return 1.
Code:
import random
import bisect
class Solution:
def __init__(self, w):
self.prefix_sums = []
total_sum = 0
for weight in w:
total_sum += weight
self.prefix_sums.append(total_sum)
self.total_sum = total_sum
def pickIndex(self):
target = random.randint(1, self.total_sum)
index = bisect.bisect_left(self.prefix_sums, target)
return index
Complexity Analysis:
__init__
: O(N), where N is the length of the input array w
, to compute the prefix sums.pickIndex
: O(log N), where N is the length of the prefix sums array, due to the binary search.1 <= w.length <= 10^4
, so the input array w
will not be empty.w = [1]
), the algorithm should return 0 every time. The prefix sum approach correctly handles this case.The prefix sum and binary search approach is more efficient than the brute-force approach, especially for large weights. It provides a good balance between time and space complexity, making it a suitable solution for this problem.