LeetCode - Hard - 10. Regular Expression Matching
Topic
- String
- Dynamic Programming
- Backtracking
Description
https://leetcode.com/problems/regular-expression-matching/
Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*' where:
- '.' Matches any single character.
- '*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Example 1:
Input: s = "aa", p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input: s = "aa", p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".
Example 3:
Input: s = "ab", p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".
Example 4:
Input: s = "aab", p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches "aab".
Example 5:
Input: s = "mississippi", p = "mis*is*p*."
Output: false
Constraints:
- 0 <= s.length <= 20
- 0 <= p.length <= 30
- s contains only lowercase English letters.
- p contains only lowercase English letters, '.', and '*'.
- It is guaranteed for each appearance of the character '*', there will be a previous valid character to match.
Analysis
Method 1: Dynamic Programming
Consider the following example:
s='aab', p='c*a*b'
        c  *  a  *  b
     0  1  2  3  4  5
  0  y
a 1
a 2
b 3
dp[i][j] denotes whether s.substring(0, i) matches the pattern p.substring(0, j). For example, dp[0][0] == true (denoted by y in the matrix) because when s and p are both empty they match. So if we can somehow base dp[i+1][j+1] on the previous dp[i][j] values, then the result will be dp[s.length()][p.length()].
So what about the first column? For an empty pattern p="", the only thing that is valid is an empty string s="", and that is already our dp[0][0], which is true. That means the rest of the dp[i][0] entries are false.
s='aab', p='c*a*b'
        c  *  a  *  b
     0  1  2  3  4  5
  0  y
a 1  n
a 2  n
b 3  n
What about the first row? In other words, which patterns p match the empty string s=""? The answer is either an empty pattern p="" or a pattern that can represent an empty string, such as p="a*", p="z*", or, more interestingly, a combination of them as in p="a*b*c*". The for loop below populates dp[0][j]. Note how it uses previous states by checking dp[0][j-2]:
for (int j = 2; j <= p.length(); j++) {
    dp[0][j] = p.charAt(j - 1) == '*' && dp[0][j - 2];
}
At this stage our matrix is as follows. Notice that dp[0][2] and dp[0][4] are both true because p="c*" and p="c*a*" can both match an empty string.
s='aab', p='c*a*b'
        c  *  a  *  b
     0  1  2  3  4  5
  0  y  n  y  n  y  n
a 1  n
a 2  n
b 3  n
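The first-row rule can be checked in isolation. The following is a minimal sketch; the class name FirstRowDemo and the helper firstRow are hypothetical names introduced here for illustration, and the pattern is assumed to be well formed (a '*' never appears first).

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative and not part of the solution.
class FirstRowDemo {
    // Returns dp[0][0..p.length()]: which prefixes of p match the empty string.
    static boolean[] firstRow(String p) {
        boolean[] row = new boolean[p.length() + 1];
        row[0] = true; // the empty pattern matches the empty string
        for (int j = 2; j <= p.length(); j++) {
            // A prefix ending in '*' matches "" if the prefix without "x*" does.
            row[j] = p.charAt(j - 1) == '*' && row[j - 2];
        }
        return row;
    }

    public static void main(String[] args) {
        // prints [true, false, true, false, true, false]
        System.out.println(Arrays.toString(firstRow("c*a*b")));
    }
}
```

For p="a*b*c*" the last entry of the row is true, which agrees with the observation that a chain of starred characters can represent the empty string.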
So now we can start our main iteration. It is basically the same idea: we iterate over all possible s lengths (i) for all possible p lengths (j) and try to find a relation based on previous results. It turns out there are two cases.
- if (p.charAt(j-1) == s.charAt(i-1) || p.charAt(j-1) == '.'), the current characters match or the pattern has '.', so the result is determined by the previous state: dp[i][j] = dp[i-1][j-1]. Don't be confused by the charAt(j-1) and charAt(i-1) indexes using a -1 offset; that is because our dp array is one index bigger than our string and pattern lengths, in order to hold the initial state dp[0][0].
- if p.charAt(j-1) == '*', then either it acts as an empty set and the result is dp[i][j] = dp[i][j-2], or (s.charAt(i-1) == p.charAt(j-2) || p.charAt(j-2) == '.'), meaning the current char of the string matches the char preceding * in the pattern, so the result is dp[i-1][j].
So here is the final state of the matrix after we evaluate all elements:
s='aab', p='c*a*b'
        c  *  a  *  b
     0  1  2  3  4  5
  0  y  n  y  n  y  n
a 1  n  n  n  y  y  n
a 2  n  n  n  n  y  n
b 3  n  n  n  n  n  y
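The final table can be reproduced programmatically. This is a sketch under stated assumptions: the class DpTableDemo and its helpers fill and row are hypothetical names, the pattern is assumed to be well formed, and the loops run row by row (each cell only reads cells to its left or in the row above, so that order is safe).

```java
// Hypothetical demo class that fills the dp table and renders each row as y/n.
class DpTableDemo {
    static boolean[][] fill(String s, String p) {
        boolean[][] dp = new boolean[s.length() + 1][p.length() + 1];
        dp[0][0] = true; // empty string vs. empty pattern
        for (int j = 2; j <= p.length(); j++) {
            dp[0][j] = p.charAt(j - 1) == '*' && dp[0][j - 2];
        }
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= p.length(); j++) {
                char pc = p.charAt(j - 1);
                if (pc == '*') {
                    char prev = p.charAt(j - 2);
                    boolean zero = dp[i][j - 2]; // "x*" used zero times
                    boolean more = (prev == '.' || prev == s.charAt(i - 1)) && dp[i - 1][j];
                    dp[i][j] = zero || more;
                } else if (pc == '.' || pc == s.charAt(i - 1)) {
                    dp[i][j] = dp[i - 1][j - 1]; // single-character match
                }
            }
        }
        return dp;
    }

    // Renders one row of the table with 'y' for true and 'n' for false.
    static String row(boolean[] r) {
        StringBuilder sb = new StringBuilder();
        for (boolean b : r) {
            sb.append(b ? 'y' : 'n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (boolean[] r : fill("aab", "c*a*b")) {
            System.out.println(row(r));
        }
    }
}
```

Running main on s="aab", p="c*a*b" prints the four rows ynynyn, nnnyyn, nnnnyn and nnnnny, which match the matrix row by row.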
Time and space complexity are both O(p.length() * s.length()).
Try to evaluate the matrix by yourself if it is still confusing.
Method 2: Recursion
There are two cases to consider:
First, the second character of p is *. The leading pair in p can then match any number of copies of the character before the *. If isMatch(s, p.substring(2)) returns true, the pair matches zero characters and the rest of the pattern matches s; otherwise, we check whether the first character of s matches, and if so we consume it and keep the same pattern.
Second, if the second character is not *, we need to match the characters one by one.
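The same two cases can also be written with indices and a memo table, which avoids re-solving the same suffix pair and creating substrings. This is a variant sketched here for illustration, not the submitted solution; the class MemoMatchDemo and its methods are hypothetical names, and the pattern is assumed to be well formed.

```java
// Hypothetical top-down variant: i and j are the start offsets into s and p.
class MemoMatchDemo {
    static boolean isMatch(String s, String p) {
        // memo[i][j] == null means "not computed yet".
        return match(s, p, 0, 0, new Boolean[s.length() + 1][p.length() + 1]);
    }

    private static boolean match(String s, String p, int i, int j, Boolean[][] memo) {
        if (j == p.length()) {
            return i == s.length(); // pattern exhausted: s must be exhausted too
        }
        if (memo[i][j] != null) {
            return memo[i][j];
        }
        boolean first = i < s.length()
                && (p.charAt(j) == '.' || p.charAt(j) == s.charAt(i));
        boolean result;
        if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
            // Case 1: skip "x*" entirely, or consume one char and keep the pattern.
            result = match(s, p, i, j + 2, memo)
                    || (first && match(s, p, i + 1, j, memo));
        } else {
            // Case 2: match one character and advance both.
            result = first && match(s, p, i + 1, j + 1, memo);
        }
        memo[i][j] = result;
        return result;
    }
}
```

With the memo, each (i, j) pair is solved at most once, so the work is bounded by the size of the table, as in the dynamic programming approach.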
Submission
public class RegularExpressionMatching {

    // Method 1: dynamic programming
    public boolean isMatch1(String s, String p) {
        if (p == null || p.length() == 0)
            return (s == null || s.length() == 0);

        // dp[i][j]: does s.substring(0, i) match p.substring(0, j)?
        boolean[][] dp = new boolean[s.length() + 1][p.length() + 1];
        dp[0][0] = true;
        // First row: only prefixes made of "x*" pairs match the empty string.
        for (int j = 2; j <= p.length(); j++) {
            dp[0][j] = p.charAt(j - 1) == '*' && dp[0][j - 2];
        }

        for (int j = 1; j <= p.length(); j++) {
            for (int i = 1; i <= s.length(); i++) {
                if (p.charAt(j - 1) == s.charAt(i - 1) || p.charAt(j - 1) == '.')
                    dp[i][j] = dp[i - 1][j - 1];
                else if (p.charAt(j - 1) == '*')
                    // zero occurrences, or one more occurrence of the preceding char
                    dp[i][j] = dp[i][j - 2]
                            || ((s.charAt(i - 1) == p.charAt(j - 2) || p.charAt(j - 2) == '.') && dp[i - 1][j]);
            }
        }
        return dp[s.length()][p.length()];
    }

    // Method 2: recursion
    public boolean isMatch2(String s, String p) {
        if (p.length() == 0) {
            return s.length() == 0;
        }
        if (p.length() > 1 && p.charAt(1) == '*') { // second char is '*'
            if (isMatch2(s, p.substring(2))) {
                return true;
            }
            if (s.length() > 0 && (p.charAt(0) == '.' || s.charAt(0) == p.charAt(0))) {
                return isMatch2(s.substring(1), p);
            }
            return false;
        } else { // second char is not '*'
            if (s.length() > 0 && (p.charAt(0) == '.' || s.charAt(0) == p.charAt(0))) {
                return isMatch2(s.substring(1), p.substring(1));
            }
            return false;
        }
    }
}
Test
import static org.junit.Assert.*;
import org.junit.Test;

public class RegularExpressionMatchingTest {

    @Test
    public void test() {
        RegularExpressionMatching obj = new RegularExpressionMatching();

        assertFalse(obj.isMatch1("aa", "a"));
        assertTrue(obj.isMatch1("aa", "a*"));
        assertTrue(obj.isMatch1("ab", ".*"));
        assertTrue(obj.isMatch1("aab", "c*a*b"));
        assertFalse(obj.isMatch1("mississippi", "mis*is*p*."));

        assertFalse(obj.isMatch2("aa", "a"));
        assertTrue(obj.isMatch2("aa", "a*"));
        assertTrue(obj.isMatch2("ab", ".*"));
        assertTrue(obj.isMatch2("aab", "c*a*b"));
        assertFalse(obj.isMatch2("mississippi", "mis*is*p*."));
    }
}