Design a data structure that will be initialized with a list of words. The words provided can be preprocessed, allowing for repeated queries to find the shortest distance between two distinct words.
Implement the WordDistance
class:
WordDistance(String[] wordsDict)
initializes the object with the list of words wordsDict
.int shortest(String word1, String word2)
returns the shortest distance between word1
and word2
in the list.Example 1:
Input ["WordDistance", "shortest", "shortest"] [[["practice", "makes", "perfect", "coding", "makes"]], ["coding", "practice"], ["makes", "coding"]] Output [null, 3, 1] Explanation WordDistance wordDistance = new WordDistance(["practice", "makes", "perfect", "coding", "makes"]); wordDistance.shortest("coding", "practice"); // return 3 wordDistance.shortest("makes", "coding"); // return 1
Constraints:
1 <= wordsDict.length <= 3 * 104
1 <= wordsDict[i].length <= 10
wordsDict[i]
consists of lowercase English letters.word1
and word2
are in wordsDict
.word1 != word2
5000
calls will be made to shortest
.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
We are given a list of words and we need to find the shortest distance between two specific words that appear in the list. The brute force method involves checking every possible pair of occurrences of the two words.
Here's how the algorithm would work step-by-step:
def shortest_word_distance_brute_force(words, word1, word2):
word1_indices = []
word2_indices = []
# Find indices of word1
for index, word in enumerate(words):
if word == word1:
word1_indices.append(index)
# Find indices of word2
for index, word in enumerate(words):
if word == word2:
word2_indices.append(index)
minimum_distance = float('inf')
# Iterate through all pairs of word1 and word2 indices
for word1_index in word1_indices:
for word2_index in word2_indices:
#Calculate distance between the two words
distance = abs(word1_index - word2_index)
#Track the shortest distance found
minimum_distance = min(minimum_distance, distance)
return minimum_distance
The trick is to store where each word appears in the text beforehand. Then, when we need to find the shortest distance between two words, we can quickly go through their locations and efficiently find the minimum distance, instead of searching the entire text again and again.
Here's how the algorithm would work step-by-step:
class WordDistance:
def __init__(self, words_list):
self.word_index_map = {}
for index, word in enumerate(words_list):
if word not in self.word_index_map:
self.word_index_map[word] = []
self.word_index_map[word].append(index)
def shortest(self, word1, word2):
# Obtain indices for both words.
word1_indices = self.word_index_map[word1]
word2_indices = self.word_index_map[word2]
minimum_distance = float('inf')
word1_index_pointer = 0
word2_index_pointer = 0
# Iterate to find the shortest distance
while word1_index_pointer < len(word1_indices) and word2_index_pointer < len(word2_indices):
index1 = word1_indices[word1_index_pointer]
index2 = word2_indices[word2_index_pointer]
current_distance = abs(index1 - index2)
minimum_distance = min(minimum_distance, current_distance)
# Move the pointer of the word with the smaller index.
if index1 < index2:
word1_index_pointer += 1
else:
word2_index_pointer += 1
return minimum_distance
Case | How to Handle |
---|---|
wordsDict is null or empty | Throw an IllegalArgumentException or return -1 to indicate invalid input, as no word distances can be calculated. |
word1 or word2 is null or empty | Throw an IllegalArgumentException or return -1, as a valid distance cannot be computed without valid words. |
word1 and word2 are the same word | Find all occurrences of the word and compute the minimum distance between adjacent occurrences; return Integer.MAX_VALUE if less than 2 occurrences. |
wordsDict contains only one unique word | If word1 and word2 are this unique word, find all occurrences and compute minimum distance; otherwise return Integer.MAX_VALUE since the non-existent word won't be found. |
word1 or word2 are not present in wordsDict | Return Integer.MAX_VALUE, indicating that no valid distance exists since one or both words are missing. |
wordsDict contains a very large number of words, causing memory issues when storing indices | Consider using an external data structure or database to store and retrieve indices if the wordsDict is too large to fit in memory. |
Integer overflow when calculating the difference between large indices | Use long to store the distances between indices to prevent potential integer overflows during calculation. |
Very long words in wordsDict, potentially affecting string comparison performance | String comparison using .equals() is efficient, but can be improved with hashing long strings if performance is critical. |