Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper and citations is sorted in non-descending order, return the researcher's h-index.
According to the definition of h-index on Wikipedia: The h-index is defined as the maximum value of h such that the given researcher has published at least h papers that have each been cited at least h times.
You must write an algorithm that runs in logarithmic time.
Example 1:
Input: citations = [0,1,3,5,6] Output: 3 Explanation: [0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.
Example 2:
Input: citations = [1,2,100] Output: 2
Constraints:
n == citations.length1 <= n <= 1050 <= citations[i] <= 1000citations is sorted in ascending order.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
To find the H-Index, we'll try out different possible H-Index values. For each guess, we will check if the array satisfies the H-Index condition, by checking each number of publications. This means trying every possibility until we find the largest possible H-Index that works.
Here's how the algorithm would work step-by-step:
def h_index_brute_force(citations):
number_of_publications = len(citations)
# Start with the largest possible h-index
for possible_h_index in range(number_of_publications, -1, -1):
# Check if the array satisfies the condition for this h-index
publications_with_enough_citations = 0
for citation_count in citations:
if citation_count >= possible_h_index:
publications_with_enough_citations += 1
# If enough publications have enough citations, return h-index
if publications_with_enough_citations >= possible_h_index:
# We've found the largest possible h-index
return possible_h_index
return 0This problem asks us to find a special number related to someone's research publications. Because the publications are already helpfully sorted, we can use this sorted order to quickly narrow down the possibilities, rather than checking every single number.
Here's how the algorithm would work step-by-step:
def h_index_ii(citations):
start_index = 0
end_index = len(citations) - 1
while start_index <= end_index:
middle_index = (start_index + end_index) // 2
# Publications beyond middle may be >= middle citation
papers_beyond_middle = len(citations) - middle_index
if citations[middle_index] >= papers_beyond_middle:
# H-index might be larger; check the left half
end_index = middle_index - 1
else:
# H-index must be smaller; check the right half
start_index = middle_index + 1
# Remaining publications are beyond h-index
return len(citations) - start_index| Case | How to Handle |
|---|---|
| Empty citations array | Return 0 as the h-index since there are no publications. |
| Citations array with a single element | Return 1 if the single citation is >= 1, else return 0. |
| All citations are 0 | Return 0 as the h-index since no paper has a sufficient number of citations. |
| All citations are very large (e.g., close to integer max) | The binary search should still function correctly as the citation values do not affect the algorithm's core logic. |
| Citations array is in strictly increasing order | The binary search will efficiently find the h-index in this sorted scenario. |
| Citations array is in strictly decreasing order (or all the same value) | The binary search will still correctly identify the h-index, potentially at the beginning of the array. |
| Very large input array (potential for integer overflow in calculations) | Ensure that the middle calculation in binary search uses `mid = low + (high - low) / 2` to prevent integer overflow. |
| No h-index exists (e.g., citations = [0,0,0]) | The binary search should naturally converge to a low value (0), indicating no valid h-index. |