Regular Expressions - Examples

Simple expressions

The simplest regular expression is a single ordinary character that matches itself in the searched string. For example, the single-character pattern A matches the letter A wherever it appears in the searched string. The following are examples of single-character regular expression patterns:

/a/
/7/
/M/

You can combine several single characters to form a larger expression. For example, the following regular expression combines the single-character expressions a, 7, and M:

/a7M/

Note that there is no concatenation operator. Simply type one character after another.
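
As a minimal sketch of the idea above, the following JavaScript snippet tests a single-character pattern and the combined pattern. The input string and variable names are made up for illustration.

```javascript
// A single-character pattern matches one literal character.
const singleChar = /a/;
// Concatenation is implicit: a, then 7, then M, in that order.
const combinedChars = /a7M/;

const sampleText = "xa7My"; // hypothetical input

console.log(singleChar.test(sampleText));       // true: "a" occurs in the input
console.log(combinedChars.test(sampleText));    // true: "a7M" occurs as one run
console.log(combinedChars.test("a 7 M"));       // false: the characters are not adjacent
console.log(sampleText.search(combinedChars));  // 1: index where the match starts
```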

Character matching

The period (.) matches any single printing or non-printing character in a string, with one exception: the newline character (\n). The following regular expression matches aac, abc, acc, adc, and so on, as well as a1c, a2c, a-c, and a#c:

/a.c/

To match a string that contains a file name, where the period (.) is part of the input string, precede the period in the regular expression with a backslash (\) character. To illustrate, the following regular expression matches filename.ext:

/filename\.ext/
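
The difference between the wildcard period and an escaped period can be checked with a short JavaScript sketch. The test strings are illustrative only.

```javascript
// The period matches any single character except a newline.
const anyMiddleChar = /a.c/;
// A backslash makes the period match a literal "." only.
const literalDot = /filename\.ext/;

console.log(anyMiddleChar.test("abc"));   // true
console.log(anyMiddleChar.test("a#c"));   // true
console.log(anyMiddleChar.test("ac"));    // false: there is no character between a and c
console.log(anyMiddleChar.test("a\nc"));  // false: the period does not match a newline

console.log(literalDot.test("filename.ext"));       // true
console.log(literalDot.test("filenameXext"));       // false
console.log(/filename.ext/.test("filenameXext"));   // true: the unescaped period matches "X"
```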

These expressions only let you match "any" single character. You may need to match specific characters from a list. For example, you might want to find chapter headings that are expressed in numerals (Chapter 1, Chapter 2, and so on).

Bracket expressions

To create a list of characters to match, place one or more individual characters within square brackets ([ and ]). When characters are enclosed in brackets, the list is called a "bracket expression." Within brackets, as anywhere else, ordinary characters represent themselves, that is, they match an occurrence of themselves in the input text. Most special characters lose their meaning when they occur inside a bracket expression. Here are some exceptions:

  • The ] character ends a list if it is not the first item. To match the ] character in a list, place it first, immediately after the opening [.
  • The \ character continues to be the escape character. To match the \ character, use \\.

A bracket expression matches only a single character, at the position in the regular expression where the bracket expression appears. The following regular expression matches Chapter 1, Chapter 2, Chapter 3, Chapter 4, and Chapter 5:

/Chapter [12345]/

Notice that the word Chapter and the space that follows it are fixed in position relative to the characters within the brackets. The bracket expression specifies only the set of characters that can match the single character position immediately following the word Chapter and a space. That is the ninth character position.

To express the matching characters as a range instead of listing the characters themselves, separate the beginning and ending characters of the range with a hyphen (-). The character value of each character determines its relative order within the range. The following regular expression contains a range expression that is equivalent to the bracketed list shown above.

/Chapter [1-5]/
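
The equivalence of the list and the range can be checked with a small JavaScript sketch. The heading strings are hypothetical.

```javascript
const listedDigits = /Chapter [12345]/;
const rangedDigits = /Chapter [1-5]/;

// Both patterns accept and reject the same headings.
console.log(listedDigits.test("Chapter 1"), rangedDigits.test("Chapter 1"));   // true true
console.log(listedDigits.test("Chapter 5"), rangedDigits.test("Chapter 5"));   // true true
console.log(listedDigits.test("Chapter 6"), rangedDigits.test("Chapter 6"));   // false false
// Two spaces shift the digit out of the ninth position, so nothing matches.
console.log(listedDigits.test("Chapter  3"), rangedDigits.test("Chapter  3")); // false false

// The bracket expression covers index 8, which is the ninth character position.
console.log("Chapter 4".search(/[1-5]/)); // 8
```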

When a range is specified in this manner, both the starting and ending values are included in the range. Also note that the starting value must precede the ending value in Unicode sort order.

To include the hyphen character in a bracket expression, do one of the following:

  • Escape it with a backslash:
    [\-]
  • Put the hyphen at the beginning or end of the bracketed list. The following expressions match all lowercase letters and the hyphen:
    [-a-z]
    [a-z-]
  • Create a range in which the beginning character value is lower than the hyphen character and the ending character value is equal to or greater than the hyphen. Both of the following regular expressions satisfy this requirement:
    [!--]
    [!-~]
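
The three approaches above can be tried out in JavaScript. This is only a sketch: the ^ and $ anchors and the sample words were added here so that each test covers the whole string.

```javascript
const escapedHyphen = /^[\-]$/;
const leadingHyphen = /^[-a-z]+$/;
const trailingHyphen = /^[a-z-]+$/;
const rangeToHyphen = /^[!--]$/; // range from "!" (U+0021) to "-" (U+002D)
const rangeToTilde = /^[!-~]$/;  // range from "!" (U+0021) to "~" (U+007E)

console.log(escapedHyphen.test("-"));            // true
console.log(leadingHyphen.test("well-known"));   // true
console.log(trailingHyphen.test("well-known"));  // true
console.log(leadingHyphen.test("Well-Known"));   // false: uppercase letters are not in the list

console.log(rangeToHyphen.test("-"));  // true: the hyphen is the end of the range
console.log(rangeToHyphen.test("+"));  // true: U+002B lies inside the range
console.log(rangeToHyphen.test("."));  // false: U+002E lies just past the range
console.log(rangeToTilde.test("-"));   // true
console.log(rangeToTilde.test(" "));   // false: the space (U+0020) precedes "!"
```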
    

To find all the characters that are not in the list or range, place the caret (^) character at the beginning of the list. If the caret appears in any other position within the list, it matches itself. The following regular expression matches chapter headings in which the character after the space is anything other than 1, 2, 3, 4, or 5:

/Chapter [^12345]/

In the example above, the expression matches any character other than 1, 2, 3, 4, or 5 in the ninth position. Thus, for example, Chapter 7 is a match, and so is Chapter 9.

The expression above can also be written as a range, using the hyphen (-):

/Chapter [^1-5]/
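
A brief JavaScript sketch of the negated forms follows. The sample headings are illustrative.

```javascript
const notListed = /Chapter [^12345]/;
const notRanged = /Chapter [^1-5]/;

console.log(notListed.test("Chapter 7"), notRanged.test("Chapter 7")); // true true
console.log(notListed.test("Chapter 9"), notRanged.test("Chapter 9")); // true true
console.log(notListed.test("Chapter 3"), notRanged.test("Chapter 3")); // false false
// Any character outside 1-5 qualifies, not only digits.
console.log(notRanged.test("Chapter A")); // true
// A negated list still has to consume one character.
console.log(notRanged.test("Chapter "));  // false

// A caret that is not in first position matches itself.
console.log(/[1^]/.test("^")); // true
```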

A typical use of a bracket expression is to specify a match of any uppercase or lowercase letter or any digit. The following expression specifies such a match:

/[A-Za-z0-9]/

Alternation and grouping

Alternation uses the | character to allow a choice between two or more alternatives. For example, you can expand the chapter heading regular expression so that it returns more than just chapter headings. However, this is not as straightforward as you might think. Alternation matches the largest possible expression on either side of the | character.

You might think that the following expression matches either Chapter or Section followed by one or two digits, spanning a whole line from beginning to end:

/^Chapter|Section [1-9][0-9]{0,1}$/

Unfortunately, the regular expression above matches either the word Chapter at the beginning of a line, or the word Section followed by a number at the end of a line. If the input string is Chapter 22, the expression matches only the word Chapter. If the input string is Section 22, the expression matches Section 22.
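
The pitfall can be reproduced with a short JavaScript sketch. The input strings beyond Chapter 22 and Section 22 are made up for illustration.

```javascript
// Alternation splits the whole pattern: "^Chapter" OR "Section [1-9][0-9]{0,1}$".
const ungrouped = /^Chapter|Section [1-9][0-9]{0,1}$/;

console.log("Chapter 22".match(ungrouped)[0]);   // "Chapter": the number is not part of the match
console.log("Section 22".match(ungrouped)[0]);   // "Section 22"
// The left alternative needs no number at all.
console.log("Chapter one".match(ungrouped)[0]);  // "Chapter"
// The right alternative is not anchored to the start of the line.
console.log("My Section 7".match(ungrouped)[0]); // "Section 7"
```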

To make the regular expression easier to control, you can use parentheses to limit the scope of the alternation, that is, to make sure that it applies only to the two words Chapter and Section. However, parentheses are also used to create subexpressions and possibly capture them for later use, which is covered in the section on backreferences. By adding parentheses in the appropriate places in the regular expression above, you can make it match either Chapter 1 or Section 3.

The following regular expression uses parentheses to group Chapter and Section so that the expression works properly:

/^(Chapter|Section) [1-9][0-9]{0,1}$/

Although this expression works properly, the parentheses around Chapter|Section also cause whichever of the two words matches to be captured for later use. Because the expression above contains only one set of parentheses, there is only one captured "submatch."

In the example above, you only want to use the parentheses to group a choice between the words Chapter and Section. To prevent the match from being saved for later use, place ?: before the regular expression pattern inside the parentheses. The following modification provides the same capability without saving the submatch:

/^(?:Chapter|Section) [1-9][0-9]{0,1}$/
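
The difference between the capturing and non-capturing groups shows up in the match result. The following JavaScript sketch uses hypothetical inputs.

```javascript
const capturing = /^(Chapter|Section) [1-9][0-9]{0,1}$/;
const nonCapturing = /^(?:Chapter|Section) [1-9][0-9]{0,1}$/;

const withCapture = "Section 3".match(capturing);
console.log(withCapture[0]);     // "Section 3": the whole match
console.log(withCapture[1]);     // "Section": the captured submatch
console.log(withCapture.length); // 2

const withoutCapture = "Section 3".match(nonCapturing);
console.log(withoutCapture[0]);     // "Section 3"
console.log(withoutCapture.length); // 1: nothing was captured

// Grouping makes the anchors and the number apply to both alternatives.
console.log(capturing.test("Chapter 22"));   // true
console.log(capturing.test("Chapter 100"));  // false: at most two digits
console.log(capturing.test("My Section 7")); // false: must start the line
```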

In addition to the ?: metacharacter, two other non-capturing metacharacters create what are called "lookahead" matches. A positive lookahead, specified with ?=, matches the search string at any point where text matching the regular expression pattern in parentheses begins. A negative lookahead, specified with ?!, matches the search string at any point where text that does not match the regular expression pattern begins.

For example, suppose you have a document that contains references to Windows 3.1, Windows 95, Windows 98, and Windows NT. Suppose further that you need to update the document by changing all references to Windows 95, Windows 98, and Windows NT to Windows 2000. The following regular expression, which is an example of a positive lookahead, matches the word Windows only where it is followed by 95, 98, or NT and a space. As written, the pattern expects the version to follow Windows directly, as in Windows98; for text that has a space after Windows, add that space before the lookahead.

/Windows(?=95 |98 |NT )/

After a match is found, the search for the next match begins immediately after the matched text, not including the characters in the lookahead. For example, if the expression above matches in Windows 98, the search resumes after Windows, not after 98.
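
The following JavaScript sketch illustrates that lookahead text is tested but not consumed. It uses the pattern exactly as printed above, so the hypothetical input is written without a space after Windows. The g flag and the replacement string "Win" are additions for the demonstration.

```javascript
const positiveLookahead = /Windows(?=95 |98 |NT )/g;
const docText = "Windows3.1 Windows95 Windows98 WindowsNT "; // hypothetical input

const found = [...docText.matchAll(positiveLookahead)];
console.log(found.map((m) => m[0]));    // ["Windows", "Windows", "Windows"]: the lookahead text is excluded
console.log(found.map((m) => m.index)); // [11, 21, 31]: Windows3.1 is skipped

// Only the matched word is replaced; 95, 98, and NT stay in place.
console.log(docText.replace(positiveLookahead, "Win")); // "Windows3.1 Win95 Win98 WinNT "

// A negative lookahead selects the opposite case.
const negativeLookahead = /Windows(?!95 |98 |NT )/;
console.log(docText.match(negativeLookahead).index); // 0: the Windows in Windows3.1
```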

Other examples

Here are some examples of regular expressions:

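Regular expression                          Description
/\b([a-z]+) \1\b/gi                         Finds places where a word occurs twice in a row.
/(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/         Breaks a URL into protocol, domain, port, and relative path.
/^(?:Chapter|Section) [1-9][0-9]{0,1}$/     Locates chapter and section headings.
/[-a-z]/                                    Matches any of the 26 letters a through z, plus the hyphen (-).
/ter\b/                                     Matches the ter in chapter, but not the ter in terminal.
/\Bapt/                                     Matches the apt in chapter, but not the apt in aptitude.
/Windows(?=95 |98 |NT )/                    Matches the Windows in Windows95, Windows98, or WindowsNT. After a match is found, the next search starts right after Windows.
/^\s*$/                                     Matches a blank line.
/\d{2}-\d{5}/                               Validates an ID number made up of two digits, a hyphen, and five more digits.
/<\s*(\S+)(\s[^>]*)?>[\s\S]*<\s*\/\1\s*>/   Matches an HTML tag.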
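
A few of the expressions above can be exercised with a short JavaScript sketch. The sentence, URL, and other inputs are hypothetical and chosen only for illustration.

```javascript
// Repeated words: \1 refers back to whatever ([a-z]+) captured.
const repeatedWord = /\b([a-z]+) \1\b/gi;
console.log("the cost of of gasoline is going up up".match(repeatedWord)); // ["of of", "up up"]

// URL parts: protocol, domain, optional port, and relative path.
const urlParts = /(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/;
const parsed = "http://www.example.com:80/docs/page.html".match(urlParts);
console.log(parsed[1]); // "http"
console.log(parsed[2]); // "www.example.com"
console.log(parsed[3]); // ":80"
console.log(parsed[4]); // "/docs/page.html"

// Word-boundary checks.
console.log(/ter\b/.test("chapter"), /ter\b/.test("terminal")); // true false
console.log(/\Bapt/.test("chapter"), /\Bapt/.test("aptitude")); // true false

// Blank line and ID number.
console.log(/^\s*$/.test("   "));            // true
console.log(/\d{2}-\d{5}/.test("12-34567")); // true
```

Note that /\d{2}-\d{5}/ is not anchored, so it also finds a valid-looking ID inside a longer string. Adding ^ and $ would restrict it to the whole input.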