Taro Logo

Shortest Word Distance III

Medium
Palantir logo
Palantir
0 views
Topics:
ArraysTwo Pointers

Given an array of strings wordsDict and two strings that already exist in the array word1 and word2, return the shortest distance between these two words in the list.

Note:

  • word1 and word2 may be the same.
  • You may assume word1 and word2 are both in the list.

Example 1:

Input: wordsDict = ["practice", "makes", "perfect", "coding", "makes"], word1 = "makes", word2 = "coding"
Output: 1

Example 2:

Input: wordsDict = ["practice", "makes", "perfect", "coding", "makes"], word1 = "makes", word2 = "makes"
Output: 3

Constraints:

  • 1 <= wordsDict.length <= 105
  • 1 <= wordsDict[i].length <= 10
  • wordsDict[i] consists of lowercase English letters.
  • word1 and word2 are in wordsDict.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the `words` array be empty or contain null/empty strings?
  2. Is the target word guaranteed to be present in the `words` array at least once? What should be returned if the target is not found?
  3. What is the maximum length of the `words` array, and what is the maximum length of each individual word (including the target word)?
  4. Are we case-sensitive when comparing strings in the `words` array to the `target` string?
  5. If the target word appears only once in the array, is the shortest distance considered to be 0, since the index is being compared to itself?

Brute Force Solution

Approach

We are given a list of words and two target words, which might be the same. The brute force approach involves checking every possible pair of locations of the target words to find the shortest distance between them.

Here's how the algorithm would work step-by-step:

  1. First, find every location in the list where the first target word appears.
  2. Then, find every location in the list where the second target word appears.
  3. Now, go through every pair of locations, one from the first target word and one from the second target word.
  4. For each pair of locations, calculate the distance between them.
  5. Keep track of the smallest distance you've seen so far.
  6. After checking all possible pairs, the smallest distance you kept track of is the answer.

Code Implementation

def shortest_word_distance_iii_brute_force(words, word1, word2):
    word1_locations = []
    word2_locations = []

    # Find all locations for the first word
    for index, word in enumerate(words):
        if word == word1:
            word1_locations.append(index)

    # Find all locations for the second word
    for index, word in enumerate(words):
        if word == word2:
            word2_locations.append(index)

    minimum_distance = float('inf')

    # Need to check every possible pair of locations
    for word1_location in word1_locations:
        for word2_location in word2_locations:
            # Handle the case where word1 and word2 are the same
            if word1 == word2 and word1_location == word2_location:
                continue

            distance = abs(word1_location - word2_location)
            minimum_distance = min(minimum_distance, distance)

    return minimum_distance

Big(O) Analysis

Time Complexity
O(n²)Finding the locations of the first target word takes O(n) time, where n is the number of words in the list. Similarly, finding the locations of the second target word also takes O(n) time. After identifying the locations, the brute force approach compares every location of the first word with every location of the second word to find the minimum distance. In the worst case, both target words appear in almost every position, leading to approximately n locations for each target. Therefore, comparing all pairs of locations involves roughly n * n comparisons. The total number of operations thus approximates n * n, simplifying to O(n²).
Space Complexity
O(N)The algorithm creates two lists to store the locations of the first and second target words. In the worst case, both target words appear at every position in the input list. Therefore, both lists can grow up to size N, where N is the number of words in the input list. Since we store at most 2*N indices (simplified to N), the auxiliary space complexity is O(N).

Optimal Solution

Approach

This problem asks for the smallest distance between two specific words in a list of words. What makes it tricky is that the two words might be the same! We'll track the places where we find each word and be careful to avoid counting the distance from a word to itself when the target words are identical.

Here's how the algorithm would work step-by-step:

  1. Go through the list of words one by one, keeping track of where we find each of the words we're interested in.
  2. If the two words we're looking for are different, simply compare each place we find the first word with each place we find the second word, and pick the smallest distance.
  3. If the two words are the same, it's a bit different. Compare each place we find the word with every *other* place we find the same word. Avoid comparing a location with itself. Again, we pick the smallest distance.
  4. By considering all possible pairs of word locations (excluding comparisons to the same location when the words are identical), we can determine the absolute shortest distance between our two target words.

Code Implementation

def shortestWordDistance(words, word1, word2):
    word1_indices = []
    word2_indices = []

    for index, word in enumerate(words):
        if word == word1:
            word1_indices.append(index)
        if word == word2:
            word2_indices.append(index)

    minimum_distance = float('inf')

    # If the words are different, compare all pairs
    if word1 != word2:
        for i in word1_indices:
            for j in word2_indices:
                minimum_distance = min(minimum_distance, abs(i - j))

    # If the words are the same, avoid comparing an index to itself
    else:
        for i in range(len(word1_indices)):
            for j in range(len(word1_indices)):
                if i != j:
                    minimum_distance = min(minimum_distance, abs(word1_indices[i] - word1_indices[j]))

    return minimum_distance

Big(O) Analysis

Time Complexity
O(n²)The algorithm iterates through the input list of n words once to identify indices of word1 and word2. In the worst-case scenario, where word1 and word2 are the same, or appear frequently, we might end up with a list of indices close to the size of n. When word1 and word2 are distinct, we iterate through each found index of word1 and compare against all found indices of word2, a nested loop. When word1 and word2 are the same, we iterate through the list of the identical word's indices and for each one, we iterate through all other indices of the same word. This requires comparing approximately n * n/2 index pairs. Thus, the time complexity is O(n²).
Space Complexity
O(N)The algorithm uses two lists (or arrays) to store the indices of word1 and word2 respectively. In the worst-case scenario, where all words in the input array words are either word1 or word2, each of these lists could grow to the size of the input array, which we denote as N. Therefore, the auxiliary space used is proportional to the input size N to store these indices. This gives us a space complexity of O(N).

Edge Cases

CaseHow to Handle
words is null or emptyReturn Integer.MAX_VALUE, as no distance can be calculated.
target is null or emptyReturn Integer.MAX_VALUE, since no occurrences can be found.
words array has only one element and that element is targetReturn 0, as the shortest distance between the target word and itself is zero.
target does not appear in wordsReturn Integer.MAX_VALUE, indicating no valid distance.
words array contains only the target wordReturn 0, as the distance between any two occurrences will be 0.
Integer overflow when calculating the distance (large indices)Use Math.abs() and ensure the distance is stored in a long to prevent overflow.
Many occurrences of the target word, leading to many distance calculationsThe iterative approach with two pointers efficiently handles a large number of target word occurrences.
The only occurrences of the target are adjacent to each otherThe min calculation correctly handles adjacent indices, resulting in a distance of 1.