Perl regular expressions

A regular expression (regex) describes a set of strings. It can be used to check whether a string contains a substring that matches a pattern, to replace matching substrings, or to extract from a string the substrings that meet some condition.

Perl's regular expression support is very powerful, arguably among the most powerful of the commonly used languages, and many languages modeled their own regular expression support on Perl's.

Perl regular expressions come in three forms: match, substitution, and conversion:

  • Match: m// (can be abbreviated to //, omitting the m)
  • Substitution: s///
  • Conversion: tr///

These three forms are generally used together with =~ or !~: =~ tests that the string matches, and !~ tests that it does not match.
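As a small illustration of the two binding operators, the sketch below tests one string with both =~ and !~. The variable names are made up for this example.

```perl
use strict;
use warnings;

my $text = "welcome to w3big site.";

# =~ is true when the pattern matches the string.
my $has_site = ($text =~ /site/) ? 1 : 0;

# !~ is true when the pattern does NOT match the string.
my $lacks_run = ($text !~ /run/) ? 1 : 0;

print "contains 'site': $has_site\n";   # prints 1
print "lacks 'run': $lacks_run\n";      # prints 1
```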

Match operator

The match operator m// tests a string against a regular expression. For example, the following code looks for "run" in the scalar $bar:


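$bar = "I am w3big site. welcome to w3big site.";
if ($bar =~ /run/){
   print "First match succeeded\n";
}else{
   print "First match failed\n";
}

$bar = "run";
if ($bar =~ /run/){
   print "Second match succeeded\n";
}else{
   print "Second match failed\n";
}

The above program outputs:

First match failed
Second match succeeded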

Pattern matching modifiers

Common pattern matching modifiers are as follows:

Modifier  Description
i         Ignore case
m         Multi-line mode: "^" and "$" match at the start and end of each line
o         Compile the pattern only once
s         Single-line mode: "." also matches "\n" (by default it does not)
x         Ignore whitespace in the pattern
g         Global match
cg        After a global match fails, keep the current search position so that matching can resume from there
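The modifiers are easier to compare side by side. The following is a minimal sketch on a made-up two-line string; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "Perl is fun.\nperl is powerful.";

# /i ignores case; /g in list context returns every match.
my @all = $text =~ /perl/ig;               # ("Perl", "perl")

# /m lets ^ match at the start of every line, not only at the start of the string.
my @line_starts = $text =~ /^(\w+)/mg;     # ("Perl", "perl")

# /s lets "." match a newline, so the match can span both lines.
my $spans = ($text =~ /fun.*powerful/s) ? 1 : 0;   # 1
my $no_s  = ($text =~ /fun.*powerful/)  ? 1 : 0;   # 0

# /x ignores whitespace inside the pattern, which allows a spaced-out layout.
my ($word) = $text =~ / is \s+ (\w+) /x;   # "fun"

print "@all | @line_starts | $spans $no_s | $word\n";
```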

Regex variables

After a successful match, Perl stores the result in three special variables:

  • $`: The part of the string before the match
  • $&: The matched string
  • $': The part of the string after the match

Concatenating these three variables gives back the original string.

An example:


$string = "welcome to runoob site.";
$string =~ m/run/;
print "String before the match: $`\n";
print "Matched string: $&\n";
print "String after the match: $'\n";

The above program outputs:

String before the match: welcome to 
Matched string: run
String after the match: oob site.
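The claim that the three variables add up to the original string can be checked directly. This is a small sketch with illustrative variable names; it copies the match variables immediately, because the next successful match overwrites them.

```perl
use strict;
use warnings;

my $original = "welcome to w3big site.";
my $rebuilt  = '';

if ($original =~ /w3big/) {
    # Copy the match variables right away; a later successful match replaces them.
    my ($before, $matched, $after) = ($`, $&, $');
    $rebuilt = $before . $matched . $after;
}

print $rebuilt eq $original ? "same\n" : "different\n";   # prints "same"
```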

Substitution operators

The substitution operator s/// extends the match operator: it replaces the text that matches the pattern with a new string. The basic format is as follows:

s/PATTERN/REPLACEMENT/;

PATTERN is the pattern to match, and REPLACEMENT is the replacement string.

For example, the following replaces "google" in a string with "w3big":


$string = "welcome to google site.";
$string =~ s/google/w3big/;

print "$string\n";

The above program output is:

welcome to w3big site.

Substitution modifiers

The modifiers of the substitution operator are as follows:

Modifier  Description
i         Makes the match case-insensitive, so "a" and "A" are treated as the same.
m         By default "^" and "$" match only at the start and end of the whole string. With "m", "^" matches at the start of each line and "$" at the end of each line.
o         Compile the pattern only once.
s         By default "." matches any character except a newline. With "s", "." matches any character, including a newline.
x         Whitespace in the pattern is ignored unless it is escaped.
g         Replace all matches, not only the first.
e         Evaluate the replacement string as an expression.
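The sketch below shows how g, i, and e change the result of a substitution. The strings and variable names are made up for illustration; the `(my $copy = $source) =~ s/.../.../` idiom copies the string first so the source stays untouched.

```perl
use strict;
use warnings;

my $text = "Google and google";

# Without /g only the first match is replaced.
(my $first_only = $text) =~ s/google/w3big/i;     # "w3big and google"

# /g replaces every match; /i ignores case.
(my $all = $text) =~ s/google/w3big/gi;           # "w3big and w3big"

# /e evaluates the replacement as Perl code; here every number is doubled.
(my $doubled = "3 apples, 10 pears") =~ s/(\d+)/$1 * 2/ge;   # "6 apples, 20 pears"

# s/// returns the number of substitutions it made.
my $count = ($text =~ s/google/w3big/gi);         # 2

print "$first_only\n$all\n$doubled\n$count\n";
```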

Conversion Operators

The conversion (transliteration) operator tr/// replaces characters one by one and supports the following modifiers:

Modifier  Description
c         Complement the search list: convert all characters that are not specified
d         Delete the specified characters that have no replacement
s         Squeeze runs of the same converted character into a single character

The following example converts all lowercase letters in the variable $string to uppercase:


$string = 'welcome to w3big site.';
$string =~ tr/a-z/A-Z/;

print "$string\n";

The above program outputs:

WELCOME TO W3BIG SITE.

The following example uses /s to squeeze repeated characters in the variable $string:


$string = 'runoob';
$string =~ tr/a-z/a-z/s;

print "$string\n";

The above program outputs:

runob

More examples:

$string =~ tr/0-9/ /c;    # replace every non-digit character with a space
$string =~ tr/\t //d;     # delete tabs and spaces
$string =~ tr/0-9/ /cs;   # replace each run of non-digit characters with a single space
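Note that tr/// works on lists of characters, not on regular expressions, so character classes such as \d are not recognized and ranges like 0-9 are used instead. The following is a small sketch of the c, d, and s modifiers on a made-up string; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "a1  b22\tc333";

# tr returns the number of characters it matched. With an empty replacement
# list and no /d the string is left unchanged, so this just counts digits.
my $digits = ($text =~ tr/0-9//);             # 6

# /d deletes matched characters that have no replacement.
(my $no_blanks = $text) =~ tr/ \t//d;         # "a1b22c333"

# /c complements the search list (everything except digits) and /s squeezes
# each run of replaced characters into a single space.
(my $digits_only = $text) =~ tr/0-9/ /cs;     # " 1 22 333"

print "$digits\n$no_blanks\n$digits_only\n";
```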

More regular expression rules

Expression   Description
.            Matches any character except a newline
x?           Matches x zero or one time
x*           Matches x zero or more times, as many times as possible; x*? matches as few times as possible
x+           Matches x one or more times, as many times as possible; x+? matches as few times as possible
.*           Matches any character zero or more times
.+           Matches any character one or more times
{m}          Matches the preceding item exactly m times
{m,n}        Matches the preceding item at least m and at most n times
{m,}         Matches the preceding item m or more times
[]           Matches any one character listed inside []
[^]          Matches any one character not listed inside []
[0-9]        Matches any digit
[a-z]        Matches any lowercase letter
[^0-9]       Matches any non-digit character
[^a-z]       Matches any character that is not a lowercase letter
^            Matches the beginning of the string
$            Matches the end of the string
\d           Matches a digit; for ASCII text, the same as [0-9]
\d+          Matches one or more digits; the same as [0-9]+
\D           Matches a non-digit; the opposite of \d
\D+          Matches one or more non-digits; the opposite of \d+
\w           Matches a letter, digit, or underscore; for ASCII text, the same as [a-zA-Z0-9_]
\w+          The same as [a-zA-Z0-9_]+
\W           Matches a character that is not a letter, digit, or underscore; the same as [^a-zA-Z0-9_]
\W+          The same as [^a-zA-Z0-9_]+
\s           Matches a whitespace character; for ASCII text, the same as [ \n\t\r\f]
\s+          The same as [ \n\t\r\f]+
\S           Matches a non-whitespace character; the same as [^ \n\t\r\f]
\S+          The same as [^ \n\t\r\f]+
\b           Matches at a word boundary (between a \w character and a \W character)
\B           Matches at any position that is not a word boundary
a|b|c        Matches a, or b, or c
abc          Matches the string "abc"
(pattern)    Parentheses remember the text they match, which is very useful. The text matched by the first pair becomes $1 (or \1 inside the pattern), the text matched by the second pair becomes $2 (or \2), and so on.
/pattern/i   The i modifier makes the match ignore letter case.
\            To match a special character such as "*" literally, put \ in front of it; this removes its special meaning.
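A few of the rules above are easier to see in action. The sketch below, using made-up strings and variable names, shows capture groups, a backreference, greedy versus non-greedy quantifiers, and escaping a special character.

```perl
use strict;
use warnings;

# Parentheses capture: $1, $2, $3 hold the groups after a successful match.
my $date = "2024-05-17";
my ($year, $month, $day) = ('', '', '');
if ($date =~ /^(\d{4})-(\d{2})-(\d{2})$/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# \1 refers back to the first group inside the same pattern,
# which makes it possible to find a doubled word.
my ($repeated) = "this is is a test" =~ /\b(\w+) \1\b/;   # "is"

# .+ takes as much as it can; .+? takes as little as it can.
my $html = "<b>bold</b>";
my ($greedy) = $html =~ /<(.+)>/;     # "b>bold</b"
my ($lazy)   = $html =~ /<(.+?)>/;    # "b"

# An unescaped * is a quantifier; \* matches a literal asterisk.
my $unescaped = ("34" =~ /3*4/)  ? 1 : 0;   # 1: "3*" means zero or more 3s
my $escaped   = ("34" =~ /3\*4/) ? 1 : 0;   # 0: needs the literal text "3*4"

# \Q...\E escapes every special character in an interpolated variable.
my $needle = "3*4";
my $quoted = ("cost: 3*4" =~ /\Q$needle\E/) ? 1 : 0;      # 1

print "$year $month $day | $repeated | $greedy | $lazy | $unescaped $escaped $quoted\n";
```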