Given two strings str1
and str2
, return the shortest string that has both str1
and str2
as subsequences. If there are multiple valid strings, return any of them.
A string s
is a subsequence of string t
if deleting some number of characters from t
(possibly 0
) results in the string s
.
For example:
str1
= "abac", str2
= "cab" should return "cabac"str1
= "aaaaaaaa", str2
= "aaaaaaaa" should return "aaaaaaaa"Your solution should be efficient in terms of time and space complexity. Consider edge cases such as:
Explain your approach, its time complexity, and its space complexity.
Given two strings, str1
and str2
, the goal is to find the shortest string that has both str1
and str2
as subsequences. If multiple such strings exist, any one of them can be returned.
A straightforward, though inefficient, approach involves generating all possible supersequences of str1
and str2
and then identifying the shortest among them that contains both input strings as subsequences. This can be achieved recursively by considering all possible interleavings of the characters of the two strings. The time complexity of this approach is exponential, rendering it impractical for larger inputs.
The problem can be efficiently solved using dynamic programming. The core idea is to find the Longest Common Subsequence (LCS) of str1
and str2
. Once the LCS is known, the shortest common supersequence can be constructed by including all characters of str1
and str2
, but including the characters of LCS only once.
str1
and str2
.str1
, str2
, and their LCS.A 2D array dp
is created where dp[i][j]
stores the length of the LCS of str1[0...i-1]
and str2[0...j-1]
. The DP table is filled as follows:
str1[i-1] == str2[j-1]
, then dp[i][j] = dp[i-1][j-1] + 1
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
Starting from the bottom-right cell of the dp
table (i.e., dp[n][m]
, where n
and m
are the lengths of str1
and str2
respectively), the SCS is constructed by tracing back through the table:
str1[i-1] == str2[j-1]
, then this character is part of the LCS, so it is included in the SCS, and i
and j
are decremented.dp[i-1][j] > dp[i][j-1]
, then str1[i-1]
is included in the SCS, and i
is decremented.str2[j-1]
is included in the SCS, and j
is decremented.str1
or str2
are appended to the SCS.str1
= "abac", str2
= "cab"
def shortestCommonSupersequence(str1: str, str2: str) -> str:
n, m = len(str1), len(str2)
# Compute LCS
dp = [[''] * (m + 1) for _ in range(n + 1)]
for i in range(1, n + 1):
for j in range(1, m + 1):
if str1[i - 1] == str2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + str1[i - 1]
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], key=len)
lcs = dp[n][m]
# Construct SCS
i, j = 0, 0
k = 0
result = ""
while i < n or j < m:
if k < len(lcs) and i < n and j < m and str1[i] == lcs[k] and str2[j] == lcs[k]:
result += str1[i]
i += 1
j += 1
k += 1
elif i < n and (k >= len(lcs) or j == m or str1[i] != lcs[k]):
result += str1[i]
i += 1
elif j < m and (k >= len(lcs) or i == n or str2[j] != lcs[k]):
result += str2[j]
j += 1
return result
m
and n
are the lengths of str1
and str2
, respectively, due to the dynamic programming approach for computing the LCS.