Regular expressions

Regular expressions can be used to test whether a string (text) conforms to a certain grammar. They are sometimes called regex or regexp. For a simple containment check no regular expression is needed; we can easily test if a string contains another string:

>>> s = "Are you afraid of ghosts?"
>>> "ghosts" in s
True

You can also test if a string does not contain a substring:

>>> "coffee" not in s
True

Grammar
We can define a grammar (a pattern) to match against any input string. Say we want to match exactly three digits: we would define the grammar '\d{3}\Z', where \d{3} matches three digits and \Z anchors the match at the end of the string. Example:

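#!/usr/bin/python
import re

s = "123"
# A raw string (r'...') passes the backslashes to the regex engine unchanged.
matcher = re.match(r'\d{3}\Z', s)

# re.match returns a match object on success and None on failure.
if matcher:
    print("True")
else:
    print("False")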

This prints "True" if the string s matches the grammar, and "False" otherwise.
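To see why the trailing \Z matters, the sketch below compares the pattern with and without it. re.match only anchors at the start of the string, so without \Z a longer string such as "1234" still matches on its first three digits. The helper name is_three_digits is hypothetical and used only for illustration; re.fullmatch is a standard-library alternative that anchors both ends.

```python
import re

def is_three_digits(s):
    """Return True if s consists of exactly three decimal digits (hypothetical helper)."""
    return re.match(r'\d{3}\Z', s) is not None

print(is_three_digits("123"))    # True
print(is_three_digits("1234"))   # False: \Z rejects the extra digit
print(is_three_digits("123\n"))  # False: \Z does not allow a trailing newline

# Without \Z, re.match only anchors at the start of the string.
prefix_only = re.match(r'\d{3}', "1234") is not None
print(prefix_only)               # True

# With $ instead of \Z, a single trailing newline is tolerated.
dollar_newline = re.match(r'\d{3}$', "123\n") is not None
print(dollar_newline)            # True

# re.fullmatch anchors both ends without needing \Z.
full = re.fullmatch(r'\d{3}', "123") is not None
print(full)                      # True
```

If a trailing newline should be rejected, \Z (or re.fullmatch) is the safer choice; $ is more lenient.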

Grammar rules
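The most commonly used special sequences in regular expressions are listed below. The character-set equivalents hold for ASCII text; with Python 3 string patterns, \d, \s and \w also match the corresponding Unicode characters unless the re.ASCII flag is set.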

\d 	Matches a decimal digit; equivalent to the set [0-9].
\D 	The complement of \d. It matches any non-digit character; equivalent to the set [^0-9].
\s 	Matches any whitespace character; equivalent to [ \t\n\r\f\v].
\S 	The complement of \s. It matches any non-whitespace character; equiv. to [^ \t\n\r\f\v].
\w 	Matches any word character (letter, digit, or underscore); equivalent to [a-zA-Z0-9_].
\W 	The complement of \w. It matches any non-word character; equivalent to [^a-zA-Z0-9_].
\b 	Matches the empty string, but only at the start or end of a word.
\B 	Matches the empty string, but not at the start or end of a word.
\\ 	Matches a literal backslash.
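The sequences above can be tried out with a few calls from the re module. This is a minimal sketch; the sample strings and variable names are made up for illustration.

```python
import re

text = "The cat sat at 9:30 in room_12."

digits = re.findall(r'\d+', text)       # runs of decimal digits
words = re.findall(r'\w+', text)        # runs of word characters (underscore included)
any_at = re.findall(r'at', text)        # "at" anywhere, even inside other words
whole_at = re.findall(r'\bat\b', text)  # "at" only as a whole word

print(digits)    # ['9', '30', '12']
print(words)     # ['The', 'cat', 'sat', 'at', '9', '30', 'in', 'room_12']
print(any_at)    # ['at', 'at', 'at']
print(whole_at)  # ['at']

# \D removes everything that is not a digit.
only_digits = re.sub(r'\D', '', "555-1234")
print(only_digits)  # 5551234

# \s+ splits on any run of whitespace (spaces, tabs, newlines).
parts = re.split(r'\s+', "a  b\tc\nd")
print(parts)        # ['a', 'b', 'c', 'd']

# \\ matches one literal backslash; the raw string keeps the path as written.
backslashes = len(re.findall(r'\\', r'C:\temp\new'))
print(backslashes)  # 2
```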


2 thoughts on “Regular expressions”

  1. Omar - January 7, 2016

    What does the \Z do in matcher = re.match('\d{3}\Z',s)?

    1. admin - January 9, 2016

      It matches only at the very end of the string. Unlike $, it does not match before a trailing newline.
