Starting with this section, we will be discussing Perl's set of extended patterns.
These are extensions to the traditional regular expression syntax that provide powerful new
tools for pattern matching. We have already seen extensions in the form of the minimal matching
constructs ??, *?, +?, {n,m}?, and {n,}?.
The rest of the extensions below have the form (?char...), where the char
is a character that determines the type of extension.
The first extension is an embedded comment (?#text). This embeds a comment into
the regular expression without affecting its meaning. The comment text must not contain a
closing parenthesis, because the first ) ends the comment. An example is
/(?# Match an integer:)[+-]?\d+/;
This style of commenting has been largely superseded by the raw, freeform commenting that is
allowed with the //x modifier.
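As a quick check that the two commenting styles are interchangeable, the following sketch runs the same integer regexp in both forms over a few made-up sample strings (the strings and the @agree array are illustrative assumptions, not part of the original text):

```perl
use strict;
use warnings;

# Sample inputs chosen for illustration only.
my @samples = ('42', '-7', '+100', 'abc');
my @agree;

for my $s (@samples) {
    # Embedded-comment form
    my $commented = ($s =~ /(?# Match an integer:)[+-]?\d+/) ? 1 : 0;

    # Freeform form using the //x modifier
    my $freeform = ($s =~ /
        [+-]?   # match an optional sign
        \d+     # match a sequence of digits
    /x) ? 1 : 0;

    # Record whether both forms gave the same answer
    push @agree, ($commented == $freeform) ? 1 : 0;
    print "$s: commented=$commented freeform=$freeform\n";
}
```

Both forms should agree on every input, since comments in either style are ignored by the regexp engine.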
The modifiers //i, //m, //s, and //x can
also be embedded in a regexp using (?i), (?m), (?s), and (?x).
For instance,
/(?i)yes/; # match 'yes' case insensitively
/yes/i; # same thing
/(?x)(     # freeform version of an integer regexp
    [+-]?  # match an optional sign
    \d+    # match a sequence of digits
)
/x;
Embedded modifiers have two important advantages over the usual modifiers. The first is
that they allow a custom set of modifiers for each regexp pattern. This is useful when
matching an array of regexps that need different modifiers:
$pattern[0] = '(?i)doctor';
$pattern[1] = 'Johnson';
...
while (<>) {
    foreach $patt (@pattern) {
        print if /$patt/;
    }
}
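A self-contained variant of this loop might look like the sketch below. It replaces standard input with a small hypothetical @lines array and collects the matches in @hits so the result can be inspected; both names and the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;

# Each pattern carries its own modifiers: only the first is case insensitive.
my @pattern = ('(?i)doctor', 'Johnson');

# Hypothetical input lines standing in for <>.
my @lines = ("DOCTOR Watson\n", "Samuel Johnson\n", "samuel johnson\n");

my @hits;
for my $line (@lines) {
    for my $patt (@pattern) {
        # The embedded (?i) travels with the pattern string itself.
        push @hits, $line if $line =~ /$patt/;
    }
}
print @hits;
```

Here the first line matches only because of the embedded (?i), while the lowercase 'samuel johnson' is rejected because 'Johnson' remains case sensitive.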
The second advantage is that an embedded modifier affects only the part of the regexp inside
the group that contains it. Grouping can therefore be used to localize the modifier's effects:
/Answer: ((?i)yes)/; # matches 'Answer: yes', 'Answer: YES', etc.
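To see the localization at work, this short sketch tries the regexp against three made-up strings and records the outcome in a hypothetical %matched hash:

```perl
use strict;
use warnings;

my %matched;
for my $s ('Answer: yes', 'Answer: YES', 'answer: yes') {
    # (?i) applies only inside the capturing group around "yes";
    # "Answer: " must still match case sensitively.
    $matched{$s} = ($s =~ /Answer: ((?i)yes)/) ? 1 : 0;
    print "'$s' matched: $matched{$s}\n";
}
```

The last string fails because the lowercase 'answer' lies outside the group, where matching is still case sensitive.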
Embedded modifiers can also turn off modifiers that are already in effect, e.g., (?-i).
Modifiers can be combined in a single expression as well: (?s-i) turns on
single line mode and turns off case insensitivity.
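The following sketch illustrates both forms. The patterns, sample strings, and the @off and @combo arrays are assumptions chosen for demonstration:

```perl
use strict;
use warnings;

# (?-i) inside a group: "Perl" must match exactly,
# while the rest of the regexp still follows the trailing //i.
my $re_off = qr/((?-i)Perl) rocks/i;
my @off = map { $_ =~ $re_off ? 1 : 0 } ('Perl ROCKS', 'perl rocks');

# (?s-i): "." may now match a newline, and the trailing //i is cancelled
# for the remainder of the regexp.
my $re_combo = qr/(?s-i)a.b/i;
my @combo = map { $_ =~ $re_combo ? 1 : 0 } ("a\nb", "A\nB");

print "off:   @off\n";
print "combo: @combo\n";
```

In each pair, the first string should match and the second should not, because the second relies on the case insensitivity that the embedded modifier has switched off.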