Taro Logo

Sum of Absolute Differences in a Sorted Array

Medium
Amazon logo
Amazon
1 view
Topics:
Arrays

You are given an integer array nums sorted in non-decreasing order.

Build and return an integer array result with the same length as nums such that result[i] is equal to the summation of absolute differences between nums[i] and all the other elements in the array.

In other words, result[i] is equal to sum(|nums[i]-nums[j]|) where 0 <= j < nums.length and j != i (0-indexed).

For example:

  • If nums = [2,3,5], the output should be [4,3,5] because:
    • result[0] = |2-2| + |2-3| + |2-5| = 0 + 1 + 3 = 4
    • result[1] = |3-2| + |3-3| + |3-5| = 1 + 0 + 2 = 3
    • result[2] = |5-2| + |5-3| + |5-5| = 3 + 2 + 0 = 5
  • If nums = [1,4,6,8,10], the output should be [24,15,13,15,21]

How would you implement an efficient algorithm to solve this problem? What is the time and space complexity of your solution? Can you identify any edge cases?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of integer values in the input array? Can I expect very large numbers that might cause integer overflow during calculations?
  2. Can the input array be empty or null?
  3. Is the input array guaranteed to be sorted in non-decreasing order, or could it be non-increasing?
  4. Should I return the result as a 64-bit integer if the sum of absolute differences might exceed the capacity of a 32-bit integer?
  5. Could you provide an example input array and the expected output to ensure I fully understand the problem?

Brute Force Solution

Approach

The brute force way to solve this problem is to calculate the sum of absolute differences for each number in the list individually. We're going to look at each number and compare it with all the other numbers.

Here's how the algorithm would work step-by-step:

  1. For the first number in the list, find the difference between it and every other number in the list. Make sure you're only looking at the positive version of each difference.
  2. Add up all those positive differences you found to get a total for the first number.
  3. Repeat the process for the second number in the list: find the positive difference between it and every other number, and add those up.
  4. Keep doing this for every single number in the list, each time calculating the sum of its absolute differences from all the other numbers.
  5. Once you've done this for every number, you'll have your answer: a new list where each entry is the sum of absolute differences for that number from the original list.

Code Implementation

def sum_of_absolute_differences_brute_force(numbers):
    result = []

    for i in range(len(numbers)):
        current_number = numbers[i]
        absolute_difference_sum = 0

        # Calculate the absolute difference with all other numbers
        for j in range(len(numbers)):
            other_number = numbers[j]
            absolute_difference = abs(current_number - other_number)

            # Accumulate sum of absolute differences
            absolute_difference_sum += absolute_difference

        result.append(absolute_difference_sum)

        # Append to result to store the sum

    return result

Big(O) Analysis

Time Complexity
O(n²)The algorithm iterates through each of the n elements in the input array. For each element, it calculates the absolute difference with every other element in the array, requiring n-1 operations per element. Therefore, the total number of operations is proportional to n * (n-1). This simplifies to approximately n², resulting in a time complexity of O(n²).
Space Complexity
O(1)The provided brute force algorithm calculates the sum of absolute differences for each number in the input list individually. While performing these calculations, it only uses a few variables to store temporary sums and differences. The algorithm does not create any additional data structures like lists or maps whose size scales with the input size N. Therefore, the auxiliary space required is constant, irrespective of the input size, making it O(1).

Optimal Solution

Approach

The key idea is to avoid recalculating differences repeatedly. We leverage the sorted nature of the numbers to maintain a running sum of differences efficiently. This avoids brute-force calculation and optimizes the process significantly.

Here's how the algorithm would work step-by-step:

  1. Imagine you're walking through the sorted list, one number at a time.
  2. At each number, figure out how far it is from all the numbers before it, and how far it is from all the numbers after it.
  3. Instead of calculating those distances individually, keep track of a running sum of all the numbers you've seen so far.
  4. To find the total distance from the current number to all previous numbers, use the running sum. Subtract all previous numbers from the current number multiplied by how many previous numbers exist.
  5. Similarly, to find the total distance from the current number to all following numbers, use another calculation based on the running sum and the remaining numbers.
  6. Add these two distances together to get the total distance for the current number, representing the sum of absolute differences.
  7. Update the running sum by adding the current number to it, and continue to the next number in the list.
  8. Repeat this process until you have the total distance calculated for every number in the list, and sum all of them up to get the overall result.

Code Implementation

def sum_of_absolute_differences(
        numbers):
    total_sum = 0
    result = [0] * len(numbers)
    running_sum = 0

    for index, current_number in enumerate(numbers):
        # Calculate sum of absolute diff. from prior numbers.
        result[index] += (current_number *
                          index) - running_sum

        # Store prior numbers to calculate sum from future numbers
        running_sum += current_number

    running_sum = 0
    for index in range(len(numbers) - 1, -1, -1):
        # We process the numbers in reverse
        current_number = numbers[index]
        
        # Calculate sum of absolute differences from future numbers
        result[index] += running_sum - (
            current_number * (len(numbers) - 1 - index))

        running_sum += current_number

        # Sum the results to get the final absolute difference
        total_sum += result[index]

    return total_sum

Big(O) Analysis

Time Complexity
O(n)We iterate through the sorted array of n elements once. Inside the loop, we perform constant-time operations like calculating the sum of differences using a running sum and updating the running sum itself. Because the number of operations within the loop is independent of n and the loop runs n times, the overall time complexity is O(n).
Space Complexity
O(N)The algorithm calculates the sum of absolute differences for each number in the input array of size N and stores these sums in an output array. This requires allocating an additional array of size N to hold the results. Therefore, the auxiliary space used scales linearly with the input size N. The running sum variable uses constant space.

Edge Cases

CaseHow to Handle
Null or empty input arrayReturn an empty array or zero, depending on the requirements, as there are no elements to sum differences from.
Array with a single elementReturn an array with a single zero since the absolute difference of a number with itself is zero.
Array with all identical valuesThe result array will contain all zeros since all absolute differences are zero.
Large array with potentially integer overflow when calculating sumsUse a 64-bit integer type (long in Java/C++, int64 in Python) to store the sums to prevent overflow.
Array with negative numbersThe algorithm should work correctly with negative numbers as the absolute difference handles both positive and negative differences.
Array with extreme boundary values (e.g., Integer.MAX_VALUE, Integer.MIN_VALUE)Ensure that intermediate calculations involving these values do not cause overflow, potentially using long or alternative representations.
Array already contains a zeroThe presence of zero should not change the logic as it is simply another number from which absolute differences are computed.
Large input array size impacting memory constraintsThe chosen approach (prefix sums) is efficient, so this should not generally be a problem if implemented correctly and avoiding unnecessary memory allocation.