Taro Logo

Contains Duplicate

Easy
Netflix logo
Netflix
5 views
Topics:
Arrays

Given an integer array nums, return true if any value appears at least twice in the array, and return false if every element is distinct.

Example 1:

Input: nums = [1,2,3,1]

Output: true

Explanation:

The element 1 occurs at the indices 0 and 3.

Example 2:

Input: nums = [1,2,3,4]

Output: false

Explanation:

All elements are distinct.

Example 3:

Input: nums = [1,1,1,3,3,4,3,2,4,2]

Output: true

Constraints:

  • 1 <= nums.length <= 105
  • -109 <= nums[i] <= 109

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of values for the integers in the input array?
  2. Can the input array be empty or null?
  3. Are we concerned with space complexity, or should I focus primarily on time complexity?
  4. If the array contains duplicate values, should the function return after finding the first duplicate?
  5. Can I assume the input will always be an array of integers, or could there be other data types present?

Brute Force Solution

Approach

The brute force way to check for duplicates is very straightforward. It involves comparing each item in the collection to every other item. If we find two identical items, we know there's a duplicate.

Here's how the algorithm would work step-by-step:

  1. Take the first item in the collection.
  2. Compare this item to every other item in the collection.
  3. If you find an item that is the same, you've found a duplicate and can stop.
  4. If you go through the entire collection and don't find a match, move on to the second item.
  5. Compare the second item to all the remaining items in the collection (excluding the first one, since it's already been checked).
  6. Continue this process, checking each item against all the items that come after it.
  7. If you reach the end without finding any matches, then there are no duplicates.

Code Implementation

def contains_duplicate_brute_force(numbers):
    list_length = len(numbers)

    for first_index in range(list_length):
        first_number = numbers[first_index]

        # Compare the first number with the rest of the numbers in the list
        for second_index in range(first_index + 1, list_length):
            second_number = numbers[second_index]

            # Check if the numbers are the same
            if first_number == second_number:
                # Found a duplicate, so we can return True immediately
                return True

    # No duplicates were found after checking all pairs
    return False

Big(O) Analysis

Time Complexity
O(n²)The brute force approach compares each of the n elements in the input array to every other element. The outer loop iterates n times. The inner loop, in the worst case, iterates close to n times for each iteration of the outer loop. Therefore, the total number of comparisons is roughly proportional to n multiplied by n. This n * n operations means the algorithm has a time complexity of O(n²).
Space Complexity
O(1)The provided brute force algorithm for detecting duplicates only uses a few index variables to iterate through the input array. The number of index variables, such as those used for comparing elements, remains constant regardless of the size of the input array, denoted as N. No auxiliary data structures like hash maps, sets, or additional arrays are used to store intermediate data. Therefore, the space complexity is constant, independent of the input size N.

Optimal Solution

Approach

The core idea is to efficiently remember if we've seen a number before. Instead of comparing each number to every other number, we use a tool to keep track of numbers we've already encountered, making the process much faster.

Here's how the algorithm would work step-by-step:

  1. Imagine having a special notebook. We'll use this notebook to write down each number we see.
  2. Start with the first number in the list.
  3. Check if that number is already in our notebook. If it is, that means we've found a duplicate, and we're done!
  4. If the number isn't in the notebook, write it down. This way, we'll remember we've seen it.
  5. Move on to the next number in the list and repeat the process: check the notebook, and if it's new, add it.
  6. Keep going until you find a duplicate or you reach the end of the list. If you reach the end without finding any duplicates, then there aren't any.

Code Implementation

def containsDuplicate(numbers):
    seen_numbers = set()

    for number in numbers:
        # Check if the current number exists in seen_numbers.
        if number in seen_numbers:

            return True

        # Add number to seen_numbers set.
        seen_numbers.add(number)

    return False

Big(O) Analysis

Time Complexity
O(n)We iterate through the input array nums of size n once. For each number, we check if it exists in our "notebook" (a hash set in practice). Hash set lookups and insertions take constant time on average, O(1). Therefore, the dominant operation is iterating through the array once, which takes O(n) time. Thus, the overall time complexity is O(n).
Space Complexity
O(N)The provided solution uses a 'notebook' to remember numbers we've seen. This 'notebook' is essentially a data structure that stores elements from the input list. In the worst-case scenario, where all the elements in the input array are unique, the 'notebook' will store all N elements. Therefore, the auxiliary space required grows linearly with the input size N. This results in a space complexity of O(N).

Edge Cases

CaseHow to Handle
Null or empty input arrayReturn false immediately since an empty array cannot contain duplicates.
Array with only one elementReturn false because a single element cannot be a duplicate of itself.
Array with all identical elementsThe solution should correctly identify the presence of duplicates and return true.
Array with a very large number of elements (performance)Using a hash set provides O(n) time complexity which scales well for large inputs.
Array containing only negative numbersThe solution should handle negative numbers correctly; a hash set works regardless of number sign.
Array containing a mix of positive, negative, and zero valuesThe solution must correctly handle a mix of different number types without issues.
Array contains integer overflow values (Integer.MAX_VALUE, Integer.MIN_VALUE)The hash set should be able to store and compare extreme integer values correctly.
Array contains duplicate entries of Integer.MAX_VALUE or Integer.MIN_VALUEHash set handles the duplicates of boundary integers correctly.