Given two strings word1
and word2
, return the minimum number of operations required to convert word1
to word2
.
You have the following three operations permitted on a word:
Example 1:
Input: word1 = "horse", word2 = "ros" Output: 3 Explanation: horse -> rorse (replace 'h' with 'r') rorse -> rose (remove 'r') rose -> ros (remove 'e')
Example 2:
Input: word1 = "intention", word2 = "execution" Output: 5 Explanation: intention -> inention (remove 't') inention -> enention (replace 'i' with 'e') enention -> exention (replace 'n' with 'x') exention -> exection (replace 'n' with 'c') exection -> execution (insert 'u')
Constraints:
0 <= word1.length, word2.length <= 500
word1
and word2
consist of lowercase English letters.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The edit distance problem asks how many changes (insertions, deletions, substitutions) are needed to make two words identical. The brute force approach explores every single possible combination of edits to transform one word into the other, ensuring no possibility is missed.
Here's how the algorithm would work step-by-step:
def edit_distance_brute_force(first_word, second_word):
def calculate_distance(first_word_index, second_word_index):
# If we've reached the end of the first word, insert remaining chars of second word
if first_word_index == len(first_word):
return len(second_word) - second_word_index
# If we've reached the end of the second word, delete remaining chars of first word
if second_word_index == len(second_word):
return len(first_word) - first_word_index
# If the characters match, no operation needed, move to next characters
if first_word[first_word_index] == second_word[second_word_index]:
return calculate_distance(first_word_index + 1, second_word_index + 1)
# Explore all possible operations and take the minimum edit distance
insert_operation = 1 + calculate_distance(first_word_index, second_word_index + 1)
delete_operation = 1 + calculate_distance(first_word_index + 1, second_word_index)
# Consider substitution operation to match characters
substitution_operation = 1 + calculate_distance(first_word_index + 1, second_word_index + 1)
return min(insert_operation, delete_operation, substitution_operation)
return calculate_distance(0, 0)
The edit distance problem asks us to find the minimum number of changes needed to make two words identical. The clever approach avoids trying all possible combinations of edits by building up solutions for smaller parts of the problem and combining them to solve the bigger problem.
Here's how the algorithm would work step-by-step:
def edit_distance(word1, word2):
word1_length = len(word1)
word2_length = len(word2)
# Initialize a matrix to store edit distances between prefixes of the words.
distance_matrix = [[0] * (word2_length + 1) for _ in range(word1_length + 1)]
# Initialize the first row and column of the matrix.
for i in range(word1_length + 1):
distance_matrix[i][0] = i
for j in range(word2_length + 1):
distance_matrix[0][j] = j
for i in range(1, word1_length + 1):
for j in range(1, word2_length + 1):
# If the characters are the same, no edit is needed.
if word1[i - 1] == word2[j - 1]:
distance_matrix[i][j] = distance_matrix[i - 1][j - 1]
else:
# Consider insertion, deletion, and substitution.
distance_matrix[i][j] = min(
distance_matrix[i - 1][j] + 1, # Deletion
distance_matrix[i][j - 1] + 1, # Insertion
distance_matrix[i - 1][j - 1] + 1, # Substitution
)
# The bottom-right cell contains the edit distance.
return distance_matrix[word1_length][word2_length]
Case | How to Handle |
---|---|
Both strings are empty | The edit distance is 0, return 0. |
One string is empty, the other is not | The edit distance is the length of the non-empty string, due to insertions or deletions. |
Strings are identical | The edit distance is 0, return 0. |
Strings of length 1 | Handle the base cases correctly in the recursion or dynamic programming initialization. |
Strings with large lengths (close to maximum allowed string length) | Ensure the solution uses dynamic programming to avoid exponential time complexity and possible stack overflow from recursion. |
Strings contain identical characters | Dynamic programming approach accounts for the matching characters by using previous subproblem results. |
Strings that are very different (require many edits) | Dynamic programming will efficiently calculate large edit distances. |
Null input strings | Throw an IllegalArgumentException or return -1 to signify an invalid input. |