Given two strings text1
and text2
, return the length of their longest common subsequence. If there is no common subsequence, return 0
.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
"ace"
is a subsequence of "abcde"
.A common subsequence of two strings is a subsequence that is common to both strings.
Example 1:
Input: text1 = "abcde", text2 = "ace"
Output: 3
Explanation: The longest common subsequence is "ace" and its length is 3.
Example 2:
Input: text1 = "abc", text2 = "abc"
Output: 3
Explanation: The longest common subsequence is "abc" and its length is 3.
Example 3:
Input: text1 = "abc", text2 = "def"
Output: 0
Explanation: There is no such common subsequence, so the result is 0.
Constraints:
1 <= text1.length, text2.length <= 1000
text1
and text2
consist of only lowercase English characters.A straightforward approach is to use recursion. We can compare the last characters of the two strings. If they match, we increment the length of the LCS and recursively find the LCS of the remaining prefixes. If they don't match, we take the maximum of the LCS obtained by excluding the last character from either text1
or text2
.
Code (Python):
def longestCommonSubsequence_recursive(text1, text2):
def lcs(i, j):
if i == 0 or j == 0:
return 0
if text1[i-1] == text2[j-1]:
return 1 + lcs(i-1, j-1)
else:
return max(lcs(i-1, j), lcs(i, j-1))
return lcs(len(text1), len(text2))
Time Complexity: O(2^(m+n)), where m and n are the lengths of text1
and text2
respectively. This is because, in the worst case, each call branches into two recursive calls.
Space Complexity: O(m+n) due to the recursion depth.
The recursive approach has overlapping subproblems, making it inefficient. Dynamic programming can significantly improve performance by storing the results of subproblems and reusing them.
We can create a 2D table dp
where dp[i][j]
stores the length of the LCS of text1[0...i-1]
and text2[0...j-1]
. The table is filled as follows:
text1[i-1] == text2[j-1]
, then dp[i][j] = dp[i-1][j-1] + 1
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
The base case is dp[i][0] = 0
and dp[0][j] = 0
for all i
and j
.
Code (Python):
def longestCommonSubsequence_dp(text1, text2):
m = len(text1)
n = len(text2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i-1] == text2[j-1]:
dp[i][j] = dp[i-1][j-1] + 1
else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
return dp[m][n]
Time Complexity: O(m*n), where m and n are the lengths of text1
and text2
respectively. We iterate through the dp
table once.
Space Complexity: O(m*n) to store the dp
table.
text1
or text2
is empty, the LCS is 0. The DP solution handles this case correctly because the dp
table is initialized with 0s.text1
and text2
have no characters in common, the LCS is 0. The DP solution correctly computes this because dp[i][j]
will remain 0 for all i
and j
.text1
and text2
are identical, the LCS is equal to the length of either string. The DP solution will correctly compute this.The space complexity can be further optimized to O(min(m, n)) by using only two rows of the DP table at a time, as dp[i][j]
only depends on the current row and the previous row. However, this optimization makes the code slightly more complex.