Given an input string s
and a pattern p
, implement regular expression matching with support for '.'
and '*'
where:
'.'
Matches any single character.'*'
Matches zero or more of the preceding element.The matching should cover the entire input string (not partial).
Example 1:
Input: s = "aa", p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input: s = "aa", p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".
Example 3:
Input: s = "ab", p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".
Constraints:
1 <= s.length <= 20
1 <= p.length <= 20
s
contains only lowercase English letters.p
contains only lowercase English letters, '.'
, and '*'
.'*'
, there will be a previous valid character to match.## Regular Expression Matching
This problem requires implementing a regular expression matching algorithm with support for `.` and `*`. The `.` character matches any single character, and the `*` character matches zero or more of the preceding element. The matching should cover the entire input string, not just a partial substring.
### 1. Naive Approach (Recursion)
A straightforward but inefficient approach is to use recursion. The base cases are when either the string or the pattern is empty. If the next character in the pattern is not `*`, we check if the first characters of the string and pattern match (or if the pattern character is `.`). If they do, we recursively call the function with the rest of the string and pattern. If the next character is `*`, we have two options: either we ignore the `*` and the preceding element, or we try to match the preceding element one or more times.
```python
def isMatch_naive(s: str, p: str) -> bool:
if not p:
return not s
first_match = bool(s) and (p[0] == s[0] or p[0] == '.')
if len(p) >= 2 and p[1] == '*':
return (isMatch_naive(s, p[2:]) or # Ignore * and preceding element
(first_match and isMatch_naive(s[1:], p)))
else:
return first_match and isMatch_naive(s[1:], p[1:])
A more efficient approach is to use dynamic programming. We create a 2D table dp
where dp[i][j]
is true if the first i
characters of the string s
match the first j
characters of the pattern p
. The table is built bottom-up, and the final result is stored in dp[len(s)][len(p)]
.
def isMatch(s: str, p: str) -> bool:
n = len(s)
m = len(p)
dp = [[False] * (m + 1) for _ in range(n + 1)]
dp[0][0] = True
# Deals with patterns like a*, a*b*, a*b*c*
for j in range(1, m + 1):
if p[j - 1] == '*' and j > 1:
dp[0][j] = dp[0][j - 2]
for i in range(1, n + 1):
for j in range(1, m + 1):
if p[j - 1] == '.' or p[j - 1] == s[i - 1]:
dp[i][j] = dp[i - 1][j - 1]
elif j > 1 and p[j - 1] == '*':
dp[i][j] = dp[i][j - 2] # zero occurrences
if p[j - 2] == '.' or p[j - 2] == s[i - 1]:
dp[i][j] = dp[i][j] or dp[i - 1][j] # one or more occurrences
else:
dp[i][j] = False
return dp[n][m]
Naive Approach: The time complexity of the recursive approach is exponential, specifically O(2^(m+n)), where n is the length of string s
and m is the length of pattern p
. This is due to the branching nature of the recursion when encountering a *
character. Each *
could potentially lead to two recursive calls.
Dynamic Programming Approach: The time complexity of the dynamic programming approach is O(m*n), where n is the length of string s
and m is the length of pattern p
. This is because we iterate through the dp
table once, which has dimensions (n+1)x(m+1).
Naive Approach: The space complexity of the recursive approach is O(m+n) in the worst case, where n is the length of string s
and m is the length of pattern p
. This space is used by the call stack due to the recursive calls. In worst case the depth of the recursion can be m+n.
Dynamic Programming Approach: The space complexity of the dynamic programming approach is O(m*n), where n is the length of string s
and m is the length of pattern p
. This is because we use a 2D table dp
of dimensions (n+1)x(m+1) to store the intermediate results.
s
and p
are empty, they match. If s
is empty but p
is not, we need to check if p
can match an empty string (e.g., a*b*c*
). If p
is empty but s
is not, they don't match..
Character: The .
character should match any single character in s
.*
Character:** The *
character should match zero or more occurrences of the preceding character. If the preceding character doesn't match the current character in s
, we can ignore the *
and the preceding character.*
: The problem statement guarantees that each *
will have a preceding character, so we don't need to handle cases like **
.*
: The dynamic programming approach efficiently handles long patterns by caching intermediate results.Here are some examples to illustrate the dynamic programming approach:
Example 1: s = "aa", p = "a*"
dp
table:
∅ a *
∅ True False True
a False False True
a False False True
Example 2: s = "ab", p = ".*"
dp
table:
∅ . *
∅ True False True
a False True True
b False True True
By constructing the dp
table, we can determine if s
matches p
.