Regular Expressions
Regular expressions can be used to test whether a string (text) conforms to a certain grammar. They are sometimes called regex or regexp. For the simplest case no regular expression is needed: the in operator tests whether a string contains another string:
    >>> s = "Are you afraid of ghosts?"
    >>> "ghosts" in s
    True
You can also test if a string does not contain a substring:
    >>> "coffee" not in s
    True
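The in operator only checks for a literal substring. As a rough sketch of how a slightly stricter test could look with the re module, the example below uses re.search to look for a whole word. The helper name contains_word is made up for illustration and is not part of the original tutorial.

```python
import re

s = "Are you afraid of ghosts?"

def contains_word(text, word):
    """Return True if `word` occurs in `text` as a whole word (hypothetical helper)."""
    # re.escape protects any regex metacharacters inside `word`;
    # \b restricts the match to word boundaries.
    return re.search(r"\b" + re.escape(word) + r"\b", text) is not None

print("ghost" in s)                # True: plain substring test
print(contains_word(s, "ghost"))   # False: "ghost" is only part of "ghosts"
print(contains_word(s, "ghosts"))  # True
```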
Grammar
We can define a grammar to match against any input string. To match exactly three digits, for example, we would define the grammar '\d{3}\Z': \d matches a digit, {3} repeats it three times, and \Z anchors the match at the end of the string. Example:
    #!/usr/bin/python
    import re

    s = "123"
    # Raw string (r'...') so the backslashes reach the regex engine unchanged.
    matcher = re.match(r'\d{3}\Z', s)
    if matcher:
        print("True")
    else:
        print("False")
This will output “True” if the string s matches the grammar, that is, if s consists of exactly three digits.
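To see which inputs the three-digit grammar accepts and rejects, the check can be wrapped in a small function and run over a few sample strings. This is a minimal sketch; the names THREE_DIGITS and is_three_digits are hypothetical and chosen only for this example.

```python
import re

# Compile the pattern once; the raw string keeps the backslashes intact.
THREE_DIGITS = re.compile(r"\d{3}\Z")

def is_three_digits(text):
    """Return True if `text` is exactly three digits (hypothetical helper)."""
    # match() anchors at the start of the string; \Z anchors at the end.
    return THREE_DIGITS.match(text) is not None

for candidate in ["123", "1234", "12a", "a123", ""]:
    print(repr(candidate), is_three_digits(candidate))
```

Only "123" is accepted: the other strings are too long, contain a non-digit, or are empty.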
Grammar rules
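The grammar for regular expressions includes the following special sequences. The equivalent sets shown hold for ASCII matching; with Python 3 Unicode strings, \d, \s and \w also match non-ASCII digits, whitespace and word characters.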
    \d   Matches a decimal digit; equivalent to the set [0-9].
    \D   The complement of \d. It matches any non-digit character; equivalent to the set [^0-9].
    \s   Matches any whitespace character; equivalent to [ \t\n\r\f\v].
    \S   The complement of \s. It matches any non-whitespace character; equivalent to [^ \t\n\r\f\v].
    \w   Matches any alphanumeric character; equivalent to [a-zA-Z0-9_].
    \W   Matches the complement of \w.
    \b   Matches the empty string, but only at the start or end of a word.
    \B   Matches the empty string, but not at the start or end of a word.
    \\   Matches a literal backslash.
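The rules above can be tried out on a short sample string. The following is an illustrative sketch using re.findall, re.sub and re.split from the standard re module; the sample text and the variable names are made up for this example.

```python
import re

text = "Room 42, floor 3\tbuilding_B"

digits = re.findall(r"\d", text)              # each single digit
numbers = re.findall(r"\d+", text)            # runs of digits
only_digits = re.sub(r"\D", "", text)         # drop every non-digit
words = re.findall(r"\w+", text)              # runs of letters, digits, underscore
fields = re.split(r"\s+", text)               # split on any run of whitespace
fl_words = re.findall(r"\bfl\w*", text)       # words that start with "fl"
backslashes = re.findall(r"\\", r"C:\temp")   # a literal backslash

print(digits)       # ['4', '2', '3']
print(numbers)      # ['42', '3']
print(only_digits)  # 423
print(words)        # ['Room', '42', 'floor', '3', 'building_B']
print(fields)       # ['Room', '42,', 'floor', '3', 'building_B']
print(fl_words)     # ['floor']
print(len(backslashes))  # 1
```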
2 Comments
What does the \Z do in matcher = re.match('\d{3}\Z',s)?
It matches only at the very end of the string. Unlike $, in Python it does not match before a trailing newline, so "123\n" is rejected by '\d{3}\Z' but accepted by '\d{3}$'.
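A quick way to check how \Z behaves in Python's re module is to compare it with $ on a string that ends in a newline. This is a small sketch; the sample strings and variable names are made up for illustration.

```python
import re

with_newline = "123\n"  # three digits followed by a trailing newline

z_on_newline = bool(re.match(r"\d{3}\Z", with_newline))
dollar_on_newline = bool(re.match(r"\d{3}$", with_newline))
z_on_plain = bool(re.match(r"\d{3}\Z", "123"))

print(z_on_newline)       # False: \Z matches only at the very end of the string
print(dollar_on_newline)  # True: $ also matches just before a final newline
print(z_on_plain)         # True
```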