RegExp

Extends Object. Regular Expression object for matching a String against a pattern. 

EVML regular expressions follow PCRE/Perl pattern syntax and behave much as they do in JavaScript. If you are used to regular expression matching in either PHP or JavaScript, you should adapt to EVML with ease.

Instantiation

You can create a RegExp instance via one of the following methods:


<?ev
    // shorthand
    // usage consists of /pattern/flags
    var myRegex = /abc/i;
    
    // object creation
    // utilises the more conventional constructor
    var myRegex = new RegExp('abc', 'i');
?>

Constructor parameters
pattern String

The regular expression represented as a String. See special characters list below.

flags String

If specified, the flags determine whether matching is global, case insensitive or multiline. The following flags are supported:

  • i - denotes a case insensitive pattern match
  • m - denotes that the beginning (^) and end ($) anchors match at the start and end of each line, not only of the whole input
  • g - denotes a global match across the entire string and does not stop at the first encounter of a match
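
As a brief sketch of how a flag changes matching, the following compares the same pattern with and without the i flag. It assumes the flags behave as described above; the variable names are illustrative only.

<?ev
    // without 'i' the match is case sensitive
    var strictRegex = /abc/;
    // with 'i' the same pattern ignores case
    var relaxedRegex = /abc/i;
    var subject = 'ABC';
    var strictResult = strictRegex.test(subject);
    var relaxedResult = relaxedRegex.test(subject);
?>
<p>
    Case sensitive: {{ strictResult }}<br />
    Case insensitive: {{ relaxedResult }}
</p>

Given the flag descriptions, strictResult would be expected to be false and relaxedResult true.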


Special Characters in Patterns

Regular expressions in EV Script support the following character classes, sets, boundaries, groups and quantifiers.

Character Classes

. Matches any character except newline
\d Matches a digit character in the basic Latin alphabet. Equivalent to [0-9]
\D Matches any character that is not a digit in the basic Latin alphabet. Equivalent to [^0-9]
\w Matches any alphanumeric character from the basic Latin alphabet, including the underscore. Equivalent to [A-Za-z0-9_]
\W Matches any character that is not a word character from the basic Latin alphabet. Equivalent to [^A-Za-z0-9_]
\s Matches a single white space character, including space, tab, form feed, line feed and other Unicode spaces
\S Matches a single character other than white space
\t Matches a tab
\r Matches a carriage return
\n Matches a linefeed
[\b] Matches a backspace (not to be confused with \b)
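\ For characters that are usually treated literally, indicates that the next character is special and not to be interpreted literally. For characters that are usually treated specially, indicates that the next character is to be interpreted literally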

 

Character Sets

[abc] A character set. Matches any one of the enclosed characters. You can specify a range of characters by using a hyphen.
[^abc] A negated character set. Matches any one character that is not enclosed in the brackets. You can specify a range of characters by using a hyphen.

 

Boundaries

^ Matches beginning of input. If the multiline flag is set to true, also matches immediately after a line break character
$ Matches end of input. If the multiline flag is set to true, also matches immediately before a line break character
\b Matches a zero-width word boundary, such as between a letter and a space. (Not to be confused with [\b])
\B Matches a zero-width non-word boundary, such as between two letters or between two spaces

 

Grouping & back references

(x) Matches x and remembers the match. These are called capturing parentheses.
\1, \2 etc. \n, where n is a positive integer, is a back reference to the last substring matched by the nth capturing group in the regular expression (counting left parentheses).
(?:x) Matches x but does not remember the match. These are called non-capturing parentheses.
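
To illustrate the difference between capturing and non-capturing parentheses, here is a minimal sketch using exec(). The pattern and variable names are hypothetical, and it assumes EVML follows standard PCRE group numbering, where non-capturing groups are skipped when counting.

<?ev
    // two capturing groups around the year and day,
    // one non-capturing group around the month
    var dateRegex = /(\d{4})-(?:\d{2})-(\d{2})/;
    var match = dateRegex.exec('2024-03-15');
?>
<p>
    Full match: {{ match[0] }}<br />
    First group: {{ match[1] }}<br />
    Second group: {{ match[2] }}
</p>

Under those assumptions, the full match would be 2024-03-15, the first group 2024 and the second group 15, because the month is matched but not remembered.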

 

Quantifiers

x* Matches the preceding item x 0 or more times
x+ Matches the preceding item x 1 or more times
x+? x*? Matches the preceding item x like + and * above, but non-greedily: the match is the smallest possible match.
x? Matches the preceding item x 0 or 1 time.
x(?=y) Matches x only if x is followed by y.
x(?!y) Matches x only if x is not followed by y.
x|y Matches either x or y.
x{n} Where n is a positive integer. Matches exactly n occurrences of the preceding item x.
x{n,} Where n is a positive integer. Matches at least n occurrences of the preceding item x.
x{n,m} Where n and m are positive integers. Matches at least n and at most m occurrences of the preceding item x.
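
The difference between a greedy quantifier and its non-greedy counterpart can be sketched as follows. The subject string and variable names are illustrative only, and the sketch assumes standard PCRE quantifier semantics.

<?ev
    var subject = 'x=1;y=2;';
    // greedy: .+ consumes as much as possible before a final ;
    var greedyRegex = /.+;/;
    // non-greedy: .+? stops at the first ;
    var lazyRegex = /.+?;/;
    var greedyMatch = greedyRegex.exec(subject);
    var lazyMatch = lazyRegex.exec(subject);
?>
<p>
    Greedy: {{ greedyMatch[0] }}<br />
    Non-greedy: {{ lazyMatch[0] }}
</p>

With those semantics, the greedy pattern would match the whole of x=1;y=2; while the non-greedy pattern would match only x=1;.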


Properties
flags String

The flags of the regular expression, e.g. "igm".

global Boolean

True if the global flag is present, or false otherwise.

ignoreCase Boolean

True if the ignore case flag is present, or false otherwise.

lastIndex Number

The index at which to start the next match when using the g (global) flag.

length Number

The length of a regular expression, which is always 2.

multiline Boolean

True if the multiline flag is present, or false otherwise.

source String

The pattern of the regular expression without delimiters or flags.
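
As a small sketch, the properties above can be read from any RegExp instance. The pattern here is arbitrary, and how Boolean values are rendered in template output is not specified in this section.

<?ev
    // global and case insensitive, but not multiline
    var myRegex = /ab+c/gi;
?>
<p>
    Source: {{ myRegex.source }}<br />
    Flags: {{ myRegex.flags }}<br />
    Global: {{ myRegex.global }}<br />
    Ignore case: {{ myRegex.ignoreCase }}<br />
    Multiline: {{ myRegex.multiline }}
</p>

Based on the property descriptions, source would be ab+c, global and ignoreCase would be true, and multiline would be false.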



Methods
exec(String:subject) Array | null

Returns an Array if the subject matches the pattern, or null if not.

  • Array[0]
    The full string of characters matched
  • Array[1]...[n]
    The parenthesized substring/group matches, if any. The number of possible parenthesized substrings is unlimited.

If the global flag is present (regexp.global is true), regexp.lastIndex is updated after each match to the index at which to start the next match. When "g" is absent, it remains 0.

If your regular expression uses the "g" flag (regexp.global is true), you can call the exec() method multiple times to find successive matches in the same string. Each search starts at the position in the subject given by the regular expression's lastIndex property. For example, assume you have this script:


<?ev
    var myRegex = /ab*/g; // note the 'g' flag
    var subject = 'abbcdefabhihghggjabbba';
    var myArray;
    while ((myArray = myRegex.exec(subject)) !== null) {
?>
<p>
    Found: {{ myArray[0] }}<br />
    Index: {{ myRegex.lastIndex }}
</p>
<?ev } ?>

test(String:subject) Boolean

Returns true if the pattern matches the subject, or false if not.
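
A typical use of test() is to guard a block of output. The sketch below assumes that an if block can span template output in the same way as the while loop shown for exec(); the pattern and variable names are hypothetical.

<?ev
    // exactly five digits from start to end of input
    var codeRegex = /^\d{5}$/;
    var subject = '90210';
    if (codeRegex.test(subject)) {
?>
<p>
    {{ subject }} is a five-digit code.
</p>
<?ev } ?>

Unlike exec(), test() only reports whether a match exists, so it suits cases where the matched text itself is not needed.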