Regular expressions are patterns used to match character combinations in strings. In
JavaScript, regular expressions are also objects. For example, to search for all
occurrences of 'the' in a string, you create a pattern consisting of 'the' and use the
pattern to search for its match in a string. Regular expression patterns can be
constructed using literal notation (for example, /abc/) or the RegExp constructor
function (for example, re = new RegExp("abc")).
These patterns are used with the exec and test methods of regular expressions, and
with the match, replace, search, and split methods of String.
For complete information on the objects, properties, and methods used with regular expressions, see:
You construct a regular expression in one of two ways:
Using literal notation, as in:
re = /ab+c/
Literal notation provides compilation of the regular expression when the script is evaluated. When the regular expression will remain constant, use literal notation for better performance.
Calling the constructor function of the RegExp object, as in:
re = new RegExp("ab+c")
Using the constructor function provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or when you don't know the pattern and are getting it from another source, such as user input. Once you have defined a regular expression that is used throughout the script, you can use the compile method to compile a new regular expression for efficient reuse if its source changes.
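The two construction styles can be sketched side by side as follows. This is a minimal illustration, not part of the original examples; the name userPattern is a hypothetical stand-in for a pattern obtained from another source such as user input.

```javascript
// Literal notation: the pattern is fixed in the source text.
const literalRe = /ab+c/;

// Constructor notation: the pattern arrives as a string at run time.
// `userPattern` is a hypothetical value standing in for user input.
const userPattern = "ab+c";
const builtRe = new RegExp(userPattern);

// Both objects describe the same pattern and match the same strings.
const literalMatches = literalRe.test("xabbbcx");
const builtMatches = builtRe.test("xabbbcx");
const sameSource = literalRe.source === builtRe.source;

console.log(literalMatches, builtMatches, sameSource);
```

Both forms yield equivalent objects here; the difference is only in when the pattern text becomes known.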
The regular expression object is explained in detail in "Regular Expression Objects."
A regular expression pattern is composed of simple characters, such as /abc/, or a
combination of simple and special characters, such as /ab*c/ or /Chapter (\d+)\.\d*/.
The last example includes parentheses, which are used as a memory device. The match made
with this part of the pattern is remembered for later use, as described in
"Using Parenthesized Substring Matches."
Simple patterns are constructed of characters for which you want to find a direct
match. For example, the pattern /abc/
matches character combinations in
strings only when exactly the characters 'abc' occur together and in that order. Such a
match would succeed in the strings "Hi, do you know your abc's?" and "The
latest airplane designs evolved from slabcraft." In both cases the match is with the
substring 'abc'. There is no match in the string "Grab crab" because it does not
contain the substring 'abc'.
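The three strings discussed above can be checked directly. This is a small sketch using the test method; the variable names are illustrative only.

```javascript
// A simple pattern matches only when 'abc' occurs together, in that order.
const simpleRe = /abc/;

const samples = [
  "Hi, do you know your abc's?",
  "The latest airplane designs evolved from slabcraft.",
  "Grab crab",
];

// One boolean per sample string.
const simpleResults = samples.map((s) => simpleRe.test(s));

console.log(simpleResults); // [ true, true, false ]
```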
When the search for a match requires something more than a direct match, such as
finding one or more b's, or finding whitespace, the pattern includes special characters.
For example, the pattern /ab*c/
matches any character combination in which a
single 'a' is followed by zero or more 'b's (*
means 0 or more occurrences of
the preceding character) and then immediately followed by 'c'. In the string
"cbbabbbbcdebc," the pattern matches the substring 'abbbbc'.
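As a quick check of the /ab*c/ example, the following sketch runs the pattern against the same string with exec and inspects the matched substring and its position. The variable names are illustrative.

```javascript
// 'a', then zero or more 'b's, then 'c'.
const starRe = /ab*c/;
const starMatch = starRe.exec("cbbabbbbcdebc");

console.log(starMatch[0]);    // "abbbbc"
console.log(starMatch.index); // 3

// Zero 'b's is also acceptable to the * quantifier.
const zeroBs = starRe.test("xacx");
console.log(zeroBs); // true
```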
Table 10.1 Special characters in regular expressions.
Parentheses around any part of the regular expression pattern cause that part of the matched substring to be remembered. Once remembered, the substring can be recalled for other use, as described in "Using Parenthesized Substring Matches."
Regular expressions are used with the regular expression methods test and exec and
with the String methods match, replace, search, and split. These methods are explained
in detail at their linked locations.
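A compact sketch of the six methods named above, applied to one sample string, may help show how their return values differ. The sample text and variable names are assumptions made for illustration.

```javascript
const sampleText = "one two  three";
const wordThenSpace = /(\w+)\s/;

// Regular expression methods.
const testResult = wordThenSpace.test(sampleText); // true or false
const execResult = wordThenSpace.exec(sampleText); // array of match details, or null

// String methods that accept a regular expression.
const matchResult = sampleText.match(/\w+/g);          // all matches
const replaceResult = sampleText.replace(/\s+/g, "-"); // new string
const searchResult = sampleText.search(/two/);         // index of first match
const splitResult = sampleText.split(/\s+/);           // pieces between matches

console.log(testResult);    // true
console.log(execResult[0]); // "one "
console.log(execResult[1]); // "one"
console.log(matchResult);   // [ 'one', 'two', 'three' ]
console.log(replaceResult); // "one-two-three"
console.log(searchResult);  // 4
console.log(splitResult);   // [ 'one', 'two', 'three' ]
```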
For information on the predefined RegExp
object and
its properties, see Chapter 11,
"The RegExp Object."
In the following example, the script uses the exec
method to find a match in a string.
<SCRIPT LANGUAGE="JavaScript1.2">
myRe = /d(b+)d/g;
myArray = myRe.exec("cdbbdbsbz");
</SCRIPT>
If you do not need to access the properties of the regular expression, an alternative
way of creating myArray
is with this script:
<SCRIPT LANGUAGE="JavaScript1.2">
myArray = /d(b+)d/g.exec("cdbbdbsbz");
</SCRIPT>
If you want to be able to recompile the regular expression, yet another alternative is this script:
<SCRIPT LANGUAGE="JavaScript1.2">
myRe = new RegExp("d(b+)d", "g");
myArray = myRe.exec("cdbbdbsbz");
</SCRIPT>
With these scripts, the match succeeds and returns the array and updates the properties shown in Table 10.2.
Table 10.2 Results of regular expression execution.
RegExp.leftContext is equivalent to:
myArray.input.substring(0, myArray.index)
and RegExp.rightContext is equivalent to:
myArray.input.substring(myArray.index + myArray[0].length)
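The values produced by this execution can be inspected with a short sketch like the one below. It uses only the returned array and the lastIndex property of the regular expression object; the names left and right are illustrative and stand for the two substring expressions given above.

```javascript
const myRe = /d(b+)d/g;
const myArray = myRe.exec("cdbbdbsbz");

// Text before and after the match, computed from the returned array.
const left = myArray.input.substring(0, myArray.index);
const right = myArray.input.substring(myArray.index + myArray[0].length);

console.log(myArray[0]);     // "dbbd"  (the matched text)
console.log(myArray[1]);     // "bb"    (the parenthesized substring)
console.log(myArray.index);  // 1       (where the match starts)
console.log(myArray.input);  // "cdbbdbsbz"
console.log(left);           // "c"
console.log(right);          // "bsbz"
console.log(myRe.lastIndex); // 5       (where the next search would start)
```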
As shown in the second form of this example, you can use the literal form of a regular expression without assigning it to a variable. If you do, however, every occurrence of the literal is a new regular expression. For this reason, if you use the literal form without assigning it to a variable, you cannot subsequently access the properties of that regular expression. For example, assume you have this script:
<SCRIPT LANGUAGE="JavaScript1.2">
myRe = /d(b+)d/g;
myArray = myRe.exec("cdbbdbsbz");
document.writeln("The value of lastIndex is " + myRe.lastIndex);
</SCRIPT>
This script displays:
The value of lastIndex is 5
However, if you have this script:
<SCRIPT LANGUAGE="JavaScript1.2">
myArray = /d(b+)d/g.exec("cdbbdbsbz");
document.writeln("The value of lastIndex is " + /d(b+)d/g.lastIndex);
</SCRIPT>
It displays:
The value of lastIndex is 0
The occurrences of /d(b+)d/g
in the two statements are different regular
expression objects and hence have different values for their lastIndex
property. If you need to access the properties of a literal regular expression, you should
first assign it to a variable.
Including parentheses in a regular expression pattern causes the corresponding submatch
to be remembered. For example, /a(b)c/ matches the characters 'abc' and remembers 'b'.
To recall these parenthesized substring matches, use the RegExp properties $1, ..., $9
or the Array elements [1], ..., [n].
The RegExp object holds up to the last nine submatches, and the returned array holds
all that were found. The following examples illustrate how to use parenthesized
substring matches.
Example 1. The following script uses the replace method to switch the words in the
string. For the replacement text, the script uses the values of the $1 and $2
properties.
<SCRIPT LANGUAGE="JavaScript1.2">
re = /(\w+)\s(\w+)/;
str = "John Smith";
newstr = str.replace(re, "$2, $1");
document.write(newstr);
</SCRIPT>
This prints "Smith, John".
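Remembered substrings can also be read from the array that exec returns. The sketch below is illustrative only: it recalls the single submatch of /a(b)c/ and then uses a hypothetical ten-group pattern to show that the array keeps every submatch, including those past the ninth.

```javascript
// One parenthesized part: element [1] holds what it matched.
const single = /a(b)c/.exec("xabcx");
console.log(single[0]); // "abc"
console.log(single[1]); // "b"

// Ten parenthesized parts: the returned array holds all of them.
const many = /(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)/.exec("abcdefghij");
console.log(many.length); // 11 (the full match plus ten submatches)
console.log(many[10]);    // "j"
```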
Example 2. In the following example, RegExp.input is set by the Change event. In the
getInfo function, the exec method uses the value of RegExp.input as its argument. Note
that RegExp must be prepended to its $ properties (because they appear outside the
replacement string). (Example 3 is a more efficient, though possibly more cryptic, way
to accomplish the same thing.)
<HTML>
<SCRIPT LANGUAGE="JavaScript1.2">
function getInfo(){
   re = /(\w+)\s(\d+)/;
   // exec is called without an argument, so it uses RegExp.input.
   re.exec();
   window.alert(RegExp.$1 + ", your age is " + RegExp.$2);
}
</SCRIPT>

Enter your first name and your age, and then press Enter.
<FORM>
<INPUT TYPE="TEXT" NAME="NameAge" onChange="getInfo(this);">
</FORM>
</HTML>
Example 3. The following example is similar to Example 2. Instead of using the
RegExp.$1 and RegExp.$2 properties, this example creates an array and uses a[1] and
a[2]. It also uses the shortcut notation for using the exec method.
<HTML>
<SCRIPT LANGUAGE="JavaScript1.2">
function getInfo(){
   // Shortcut notation: calling the literal directly is the same as calling exec.
   a = /(\w+)\s(\d+)/();
   window.alert(a[1] + ", your age is " + a[2]);
}
</SCRIPT>

Enter your first name and your age, and then press Enter.
<FORM>
<INPUT TYPE="TEXT" NAME="NameAge" onChange="getInfo(this);">
</FORM>
</HTML>
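Examples 2 and 3 rely on RegExp.input being set by the Change event and on the shortcut call notation. In an environment where exec must be given its string explicitly, the same lookup might be sketched as follows. The function name describeNameAge and its text parameter are hypothetical and do not appear in the examples above.

```javascript
// Hypothetical variant of getInfo that receives the field text directly
// instead of relying on RegExp.input.
function describeNameAge(text) {
  const a = /(\w+)\s(\d+)/.exec(text);
  if (a === null) {
    return null; // no "name age" pair found
  }
  // a[1] is the name, a[2] is the age, as in Example 3.
  return a[1] + ", your age is " + a[2];
}

console.log(describeNameAge("Fred 42"));        // "Fred, your age is 42"
console.log(describeNameAge("no digits here")); // null
```

Returning null on failure mirrors what exec itself does when there is no match.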
Regular expressions have two optional flags that allow for global and case insensitive
searching. To indicate a global search, use the g
flag. To indicate a case
insensitive search, use the i
flag. These flags can be used separately or
together in either order, and are included as part of the regular expression.
re = /pattern/[g|i|gi]
re = new RegExp("pattern", ['g'|'i'|'gi'])
Note that the flags, i
and g
, are an integral part of a
regular expression. They cannot be added or removed later.
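The effect of the two flags can be seen in a short sketch. Because the flags of an existing regular expression are fixed, the last step builds a new object from the source property of the old one; this is one possible approach, not one prescribed by the text above, and the variable names are illustrative.

```javascript
const caseSensitive = /fee/;
const caseInsensitive = /fee/i;

const strictResult = caseSensitive.test("FEE fi");  // false
const looseResult = caseInsensitive.test("FEE fi"); // true

// To search with different flags, create a new regular expression
// from the same pattern text.
const globalInsensitive = new RegExp(caseInsensitive.source, "gi");
const allFees = "Fee fee FEE".match(globalInsensitive);

console.log(strictResult, looseResult);
console.log(allFees); // [ 'Fee', 'fee', 'FEE' ]
```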
For example, re = /\w+\s/g creates a regular expression that looks for one or more
word characters followed by a whitespace character, and it looks for this combination
throughout the string.
<SCRIPT LANGUAGE="JavaScript1.2">
re = /\w+\s/g;
str = "fee fi fo fum";
myArray = str.match(re);
document.write(myArray);
</SCRIPT>
This displays ["fee ", "fi ", "fo "]. In this example, you could replace the line:
re = /\w+\s/g;
with:
re = new RegExp("\\w+\\s", "g");
and get the same result.
The following example illustrates the formation of regular expressions and the use of string.split() and string.replace().
<SCRIPT LANGUAGE="JavaScript1.2">
// The name string contains multiple spaces and tabs,
// and may have multiple spaces between first and last names.
names = new String ( "Harry Trump ;Fred Barney; Helen Rigby ;\
 Bill Abel ;Chris Hand ")

document.write ("---------- Original String" + "<BR>" + "<BR>")
document.write (names + "<BR>" + "<BR>")

// Prepare two regular expression patterns and array storage.
// Split the string into array elements.

// pattern: possible white space then semicolon then possible white space
pattern = /\s*;\s*/

// Break the string into pieces separated by the pattern above
// and store the pieces in an array called nameList
nameList = names.split (pattern)

// new pattern: one or more characters then spaces then characters.
// Use parentheses to "memorize" portions of the pattern.
// The memorized portions are referred to later.
pattern = /(\w+)\s+(\w+)/

// New array for holding names being processed.
bySurnameList = new Array;

// Display the name array and populate the new array
// with comma-separated names, last first.
//
// The replace method removes anything matching the pattern
// and replaces it with the memorized string--second memorized portion
// followed by comma space followed by first memorized portion.
//
// The variables $1 and $2 refer to the portions
// memorized while matching the pattern.

document.write ("---------- After Split by Regular Expression" + "<BR>")
for ( i = 0; i < nameList.length; i++) {
   document.write (nameList[i] + "<BR>")
   bySurnameList[i] = nameList[i].replace (pattern, "$2, $1")
}

// Display the new array.
document.write ("---------- Names Reversed" + "<BR>")
for ( i = 0; i < bySurnameList.length; i++) {
   document.write (bySurnameList[i] + "<BR>")
}

// Sort by last name, then display the sorted array.
bySurnameList.sort()
document.write ("---------- Sorted" + "<BR>")
for ( i = 0; i < bySurnameList.length; i++) {
   document.write (bySurnameList[i] + "<BR>")
}

document.write ("---------- End" + "<BR>")
</SCRIPT>
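The same split, replace, and sort steps can be condensed into a single function that returns the result instead of writing it to the page. This is a sketch under stated assumptions: the function name sortBySurname is hypothetical, and the trim and filter steps are additions that discard the stray whitespace and any empty pieces at the ends of the string.

```javascript
// Hypothetical helper: split on semicolons, reverse each name, sort.
function sortBySurname(names) {
  return names
    .trim()                                           // assumption: drop outer whitespace
    .split(/\s*;\s*/)                                 // same separator pattern as above
    .filter((n) => n.length > 0)                      // assumption: skip empty pieces
    .map((n) => n.replace(/(\w+)\s+(\w+)/, "$2, $1")) // last name first
    .sort();
}

const sortedNames = sortBySurname(
  "Harry Trump ;Fred Barney; Helen Rigby ; Bill Abel ;Chris Hand "
);

console.log(sortedNames);
// [ 'Abel, Bill', 'Barney, Fred', 'Hand, Chris', 'Rigby, Helen', 'Trump, Harry' ]
```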
In the following example, a user enters a phone number. When the user presses Enter, the script checks the validity of the number. If the number is valid (matches the character sequence specified by the regular expression), the script posts a window thanking the user and confirming the number. If the number is invalid, the script posts a window telling the user that the phone number isn't valid.
The regular expression looks for zero or one open parenthesis \(?, followed by three
digits \d{3}, followed by zero or one close parenthesis \)?, followed by one dash,
forward slash, or decimal point, which is remembered when found ([-\/\.]), followed by
three digits \d{3}, followed by the remembered match of a dash, forward slash, or
decimal point \1, followed by four digits \d{4}.
The Change event, activated when the user presses Enter, sets the value of
RegExp.input.
<HTML>
<SCRIPT LANGUAGE = "JavaScript1.2">
re = /\(?\d{3}\)?([-\/\.])\d{3}\1\d{4}/;

function testInfo() {
   // exec is called without an argument, so it uses RegExp.input.
   OK = re.exec();
   if (!OK)
      window.alert (RegExp.input + " isn't a phone number with area code!");
   else
      window.alert ("Thanks, your phone number is " + OK[0]);
}
</SCRIPT>

Enter your phone number (with area code) and then press Enter.
<FORM>
<INPUT TYPE="TEXT" NAME="Phone" onChange="testInfo(this);">
</FORM>
</HTML>
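The phone number pattern can also be exercised outside a form. The sketch below passes a few sample strings to test; the sample numbers and the name phoneRe are made up for illustration.

```javascript
const phoneRe = /\(?\d{3}\)?([-\/\.])\d{3}\1\d{4}/;

// The remembered separator must appear again between the last two groups.
const withParens = phoneRe.test("(415)-555-1212");  // true
const withDots = phoneRe.test("415.555.1212");      // true
const mixedSeparators = phoneRe.test("415-555.1212"); // false: '-' then '.'
const noSeparators = phoneRe.test("4155551212");    // false: separator required

console.log(withParens, withDots, mixedSeparators, noSeparators);
```

The \1 back reference is what rejects the mixed-separator case: whichever character the parentheses remembered has to be repeated exactly.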
Last Updated: 10/22/97 11:48:12
Copyright © 1997 Netscape Communications Corporation