Taro Logo

K-diff Pairs in an Array

Medium
Asked by:
Profile picture
Profile picture
Profile picture
Profile picture
+2
More companies
Profile picture
Profile picture
23 views
Topics:
ArraysTwo PointersStrings

Given an array of integers nums and an integer k, return the number of unique k-diff pairs in the array.

A k-diff pair is an integer pair (nums[i], nums[j]), where the following are true:

  • 0 <= i, j < nums.length
  • i != j
  • |nums[i] - nums[j]| == k

Notice that |val| denotes the absolute value of val.

Example 1:

Input: nums = [3,1,4,1,5], k = 2
Output: 2
Explanation: There are two 2-diff pairs in the array, (1, 3) and (3, 5).
Although we have two 1s in the input, we should only return the number of unique pairs.

Example 2:

Input: nums = [1,2,3,4,5], k = 1
Output: 4
Explanation: There are four 1-diff pairs in the array, (1, 2), (2, 3), (3, 4) and (4, 5).

Example 3:

Input: nums = [1,3,1,5,4], k = 0
Output: 1
Explanation: There is one 0-diff pair in the array, (1, 1).

Constraints:

  • 1 <= nums.length <= 104
  • -107 <= nums[i] <= 107
  • 0 <= k <= 107

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. When counting unique pairs, is the pair `(x, y)` considered identical to the pair `(y, x)`?
  2. If a specific pair of values, for example `{1, 3}`, can be formed in multiple ways due to duplicate numbers in the input like `nums = [1, 1, 3]`, should this be counted as a single unique pair?
  3. For the edge case where `k = 0`, the problem is about finding pairs of identical numbers. Should I interpret this as counting the number of unique values that appear more than once in the array?
  4. Am I permitted to modify the input array, for instance by sorting it, or should I treat it as immutable and use additional space if needed?
  5. The constraints state that `k` is non-negative. Can I safely assume `k` will always be zero or positive, and not handle negative `k` values?

Brute Force Solution

Approach

The most straightforward way to solve this is to systematically create every possible pair of numbers from the list. We then check each of these pairs to see if their difference matches the required value.

Here's how the algorithm would work step-by-step:

  1. First, pick a number from the collection.
  2. Then, compare that number with every other number that comes after it in the collection, creating a pair.
  3. For each pair you form, calculate the difference between the two numbers.
  4. If that difference is exactly the special value we are looking for, we count it as a successful pair.
  5. To get the final answer, we need to keep track of the unique pairs we find, so we don't count the same combination of numbers more than once.
  6. Repeat this entire process, starting with each number in the collection until all possible pairs have been checked.
  7. The final result is the total count of the unique successful pairs you've found.

Code Implementation

def count_k_diff_pairs_brute_force(collection_of_numbers, target_difference):
    # We use a set to store found pairs, which automatically handles uniqueness.

    unique_valid_pairs = set()

    for first_number_index in range(len(collection_of_numbers)):
        # To avoid redundant comparisons, we only check pairs where the second number appears after the first.

        for second_number_index in range(first_number_index + 1, len(collection_of_numbers)):
            first_number = collection_of_numbers[first_number_index]
            second_number = collection_of_numbers[second_number_index]

            if abs(first_number - second_number) == target_difference:
                # Sorting the pair ensures (a, b) and (b, a) are treated as one unique pair.

                canonical_pair_representation = tuple(sorted((first_number, second_number)))
                unique_valid_pairs.add(canonical_pair_representation)

    return len(unique_valid_pairs)

Big(O) Analysis

Time Complexity
O(n²)The primary cost is driven by generating and checking every possible pair from an input of size n. An outer loop iterates through each number, and for each of these, a nested inner loop compares it against every subsequent number in the collection. This means for the first number we perform n-1 comparisons, for the second n-2, and so on. The total number of operations is the sum 1 + 2 + ... + (n-1), which is approximately n * n / 2 and simplifies to O(n²).
Space Complexity
O(N)The space complexity is driven by the step requiring to 'keep track of the unique pairs we find'. This implies an auxiliary data structure, like a hash set, to store the identified pairs to prevent duplicates. Let N be the number of elements in the input collection; in the worst-case scenario, the number of unique pairs can be proportional to N, for example, if the input is [1, 2, ..., N] and k is 1, we would store N-1 unique pairs. Therefore, the memory required for this storage grows linearly with the input size, resulting in a space complexity of O(N).

Optimal Solution

Approach

The optimal approach avoids a slow, brute-force check of every possible pair. Instead, it first creates a quick count of all the numbers in the list, which allows us to then simply look up if a number's required partner—the one that makes the target difference—exists.

Here's how the algorithm would work step-by-step:

  1. Start by making an inventory of the numbers. Go through the entire list and create a tally of how many times each distinct number appears.
  2. The next step depends on the target difference, which we'll call 'k'.
  3. If 'k' is a positive number, look at each distinct number from your inventory. For each one, calculate what its partner's value would have to be (the number itself plus 'k').
  4. With this partner value in mind, simply check your inventory to see if that number exists at all. If it does, you've found a valid pair.
  5. If 'k' is zero, you're looking for pairs of identical numbers. This is even simpler: just look through your inventory and count how many numbers appeared more than once.
  6. By making the inventory first, you only need to go through your list of distinct numbers once to find all the pairs, which saves a lot of time compared to checking every combination.

Code Implementation

def findPairs(numbers_list, required_difference):
    # A negative difference is impossible, so we can immediately return zero.
    
    if required_difference < 0:
        return 0

    # A zero difference requires a special case to count duplicate numbers.
    
    if required_difference == 0:
        number_counts = {}
        for number in numbers_list:
            number_counts[number] = number_counts.get(number, 0) + 1
        
        pair_count = 0
        for count in number_counts.values():
            if count > 1:
                pair_count += 1
        return pair_count
    
    # For positive differences, a set is efficient for finding unique numbers and their partners.
    
    unique_numbers_roster = set(numbers_list)
    pair_count = 0
    for number in unique_numbers_roster:
        required_partner = number + required_difference
        if required_partner in unique_numbers_roster:
            pair_count += 1
            
    return pair_count

Big(O) Analysis

Time Complexity
O(n)The time complexity is determined by two main, non-nested operations that scale with the input size, n. First, we iterate through the entire array of n elements once to build a frequency map, which takes O(n) time. Afterwards, we iterate through the unique keys of this map; in the worst-case scenario, this is also n iterations. For each unique number, we perform a constant-time O(1) lookup to check for its pair. The total operations are approximately n for building the map plus n for finding the pairs, which sums to 2n and simplifies to O(n).
Space Complexity
O(N)The primary auxiliary space is used to create the 'inventory' of numbers, which is implemented as a hash map to store the frequency of each number. In the worst-case scenario, where all N elements in the input array are distinct, this hash map will need to store N key-value pairs. This means the space required is directly proportional to the number of unique elements, which can be up to N. Thus, the space complexity is O(N).

Edge Cases

The difference k is zero
How to Handle:
This case requires special logic to count the number of unique elements that appear more than once in the array.
Input array contains duplicate numbers with k > 0
How to Handle:
The solution must count pairs of values only once, for example by processing only unique numbers from the input.
The input array size is at its maximum limit (10^4)
How to Handle:
An efficient approach with O(N) or O(N log N) time complexity is necessary to avoid a timeout.
Input numbers span a very wide range (e.g., -10^7 to 10^7)
How to Handle:
Using a hash map is memory-efficient compared to a direct-addressing array which would be too large.
All array elements are identical
How to Handle:
The solution should correctly return 1 if k is 0 and 0 for any k > 0.
Input array contains only one element
How to Handle:
The algorithm must return 0 since no pair can be formed.
No pairs with the specified difference k exist
How to Handle:
The algorithm should complete its search and correctly return a final count of 0.
Input contains negative numbers and zero
How to Handle:
The chosen hash-based or sorting-based algorithm should handle negative numbers correctly without special casing.