The frequency of each number is given in a table format where the first column is the number and the second column is its frequency.
Return the median of all numbers.
Example 1:
Input:
[
[1, 3],
[2, 1],
[3, 2]
]
Output: 2
Explanation:
The numbers are 1, 1, 1, 2, 3, 3.
The median is 2.
Example 2:
Input:
[
[1, 2],
[2, 3]
]
Output: 2
Explanation:
The numbers are 1, 1, 2, 2, 2.
The median is 2.
Constraints:
[1, 105]
.[1, 105]
.[1, 105]
.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force approach to finding the median involves recreating the entire list of numbers from their frequencies. Then, we sort this list and pick the middle number (or average the two middle numbers if the count is even).
Here's how the algorithm would work step-by-step:
def find_median_given_frequency(numbers_with_frequency):
complete_number_list = []
for number, frequency in numbers_with_frequency:
# Expand the list based on frequencies
for _ in range(frequency):
complete_number_list.append(number)
complete_number_list.sort()
list_length = len(complete_number_list)
# Need the middle element for an odd-length list
if list_length % 2 != 0:
median_index = list_length // 2
return complete_number_list[median_index]
# Need to average two middle elements for even length
else:
middle_index_2 = list_length // 2
middle_index_1 = middle_index_2 - 1
#Calculate the average of the two middle numbers
median = (complete_number_list[middle_index_1] + \
complete_number_list[middle_index_2]) / 2
return median
The best way to find the median is to avoid creating a giant list of all the numbers. We can use the frequency information to efficiently jump to the middle value(s) without explicitly listing out every number.
Here's how the algorithm would work step-by-step:
def find_median_given_frequency(frequency_map):
total_numbers = sum(frequency_map.values())
middle_position = (total_numbers + 1) // 2
count = 0
sorted_numbers = sorted(frequency_map.keys())
for i in range(len(sorted_numbers)):
number = sorted_numbers[i]
frequency = frequency_map[number]
count += frequency
# Check if we've reached the middle position.
if count >= middle_position:
# If total count is even, average current and next number
if total_numbers % 2 == 0 and count == middle_position:
if i + 1 < len(sorted_numbers):
next_number = sorted_numbers[i + 1]
return (number + next_number) / 2
else:
return number #last element
# If total count is odd, median is the current number
else:
return number
return None # Should not reach here if input is valid
Case | How to Handle |
---|---|
nums or frequencies is null or empty | Return null or throw an exception if either input array is null or empty, indicating invalid input. |
nums and frequencies have different lengths | Throw an exception if the lengths of the input arrays are different, as this indicates invalid input. |
Sum of frequencies is zero | Return NaN or throw an exception if the sum of frequencies is zero, indicating an empty dataset. |
Large frequencies causing integer overflow when calculating total count | Use a 64-bit integer type (long in Java/C++) to prevent overflow when calculating the total number of elements. |
All numbers in nums are the same | The standard median calculation should still correctly identify the median value in this scenario. |
Frequencies contain negative values | Throw an exception if any frequency is negative, since frequencies should be non-negative. |
Extreme values in nums leading to precision issues if using floating point arithmetic | Consider using integer arithmetic when possible or using appropriate floating point precision to minimize errors. |
Even number of total elements requiring average of two middle elements | Correctly identify the two middle elements and compute their average according to the problem definition. |