Regular Expression

http://www.mygrep.com/

Mygrep is an easy-to-use and powerful tool for template-based checking of user input string fields (in DBMS and web-based applications), text search and substitution, egrep-like utilities, and similar tasks.

You can easily check e-mail address syntax, extract a phone number or ZIP code from unformatted text, or pull any information you need from web pages.

The following regular expression characters are supported by mygrep:

^ Circumflex - Constrain to start of a line
A circumflex as the first character of the string constrains matches to start of lines.

Examples:

^If Match lines beginning with If

^MyGrep Match lines beginning with MyGrep
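
The start-of-line examples above can be tried out with a minimal sketch. It uses Python's re module as a stand-in for mygrep, which is an assumption: mygrep's own interface is not shown in this document, and the sample text is made up for illustration.

```python
import re

# Hypothetical sample text, one candidate per line.
text = "If it rains\nWhat If\nMyGrep rocks\nuse MyGrep"

# re.MULTILINE makes ^ match at the start of every line,
# not only at the start of the whole string.
starts_with_if = re.findall(r"^If.*", text, flags=re.MULTILINE)
starts_with_mygrep = re.findall(r"^MyGrep.*", text, flags=re.MULTILINE)

print(starts_with_if)      # ['If it rains']
print(starts_with_mygrep)  # ['MyGrep rocks']
```

Lines that merely contain If or MyGrep somewhere after the first character are not reported.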

$ Dollar - Constrain to end of a line
A dollar as the last character of the string constrains matches to end of lines.

Examples:

yours$ Match lines ending with yours

^end$ Match lines consisting of the single word end
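
As a rough illustration of the end-of-line anchor, the sketch below checks a few made-up lines with Python's re module. Python is used here only as an assumed stand-in for mygrep, so small differences in behaviour are possible.

```python
import re

# Hypothetical input lines.
lines = ["sincerely yours", "yours truly", "end", "the end"]

# $ ties the match to the end of the line.
ends_with_yours = [ln for ln in lines if re.search(r"yours$", ln)]

# ^ and $ together require the whole line to be exactly "end".
only_end = [ln for ln in lines if re.search(r"^end$", ln)]

print(ends_with_yours)  # ['sincerely yours']
print(only_end)         # ['end']
```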

. Period - Match any character
A period anywhere in the string matches any single character.

Examples:

A.c Matches Abc, Anc, Acc etc.

L..t Matches List, Last, Lost etc.
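
A small sketch of the period, written with Python's re module (an assumption, since mygrep's own interface is not documented here). The word list is invented; fullmatch is used so that each word must match the pattern from start to end.

```python
import re

# Hypothetical candidate words.
words = ["Abc", "Anc", "Acc", "Ac", "List", "Last", "Harp"]

# Each period stands for exactly one character.
a_dot_c = [w for w in words if re.fullmatch(r"A.c", w)]
l_dot_dot_t = [w for w in words if re.fullmatch(r"L..t", w)]

print(a_dot_c)      # ['Abc', 'Anc', 'Acc']
print(l_dot_dot_t)  # ['List', 'Last']
```

Ac is rejected because the period needs a character to consume, and Harp is rejected because the literal L and t must still match.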

* Asterisk - Match 0 or more
An expression followed by an asterisk matches zero or more occurrences of that expression.

Examples:

to* Matches t, to, too etc.

00* Matches 0, 00, 000, 0000 etc.
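
The asterisk examples can be checked with a short sketch in Python's re module, assumed here to behave like mygrep for this operator. The candidate list is hypothetical.

```python
import re

# Hypothetical candidates.
words = ["t", "to", "too", "0", "00", "000", "a"]

# The asterisk applies to the single character before it:
# "to*" is a t followed by zero or more o's.
to_star = [w for w in words if re.fullmatch(r"to*", w)]

# "00*" is one 0 followed by zero or more further 0's.
zeros = [w for w in words if re.fullmatch(r"00*", w)]

print(to_star)  # ['t', 'to', 'too']
print(zeros)    # ['0', '00', '000']
```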

+ Plus - Match one or more
An expression followed by a plus sign matches one or more occurrences of that expression.

Examples:

To+ Matches To, Too etc.

32+ Matches 32, 322, 32222 etc.

\([0-9]+\) Matches (0), (12), (1234567) etc.
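
A minimal sketch of the plus sign, using Python's re module as an assumed stand-in for mygrep. The parentheses in the last pattern are escaped with backslashes so that they are treated as literal characters; the inputs are made up.

```python
import re

# Hypothetical candidates.
candidates = ["T", "To", "Too", "3", "32", "322", "(0)", "(1234567)", "()"]

# Unlike *, the plus sign requires at least one occurrence.
to_plus = [c for c in candidates if re.fullmatch(r"To+", c)]
three_two = [c for c in candidates if re.fullmatch(r"32+", c)]

# Literal parentheses around one or more digits.
in_parens = [c for c in candidates if re.fullmatch(r"\([0-9]+\)", c)]

print(to_plus)    # ['To', 'Too']
print(three_two)  # ['32', '322']
print(in_parens)  # ['(0)', '(1234567)']
```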

? Question mark - Optionally match
An expression followed by a question mark optionally matches that expression.

Examples:

To? Matches T and To

20? Matches 2 and 20
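
The optional match can be sketched as follows. Python's re module is assumed to treat the question mark the same way mygrep does; the candidate strings are invented.

```python
import re

# Hypothetical candidates.
candidates = ["T", "To", "Too", "2", "20", "200"]

# The question mark allows zero or one occurrence of the preceding character.
to_optional = [c for c in candidates if re.fullmatch(r"To?", c)]
twenty_optional = [c for c in candidates if re.fullmatch(r"20?", c)]

print(to_optional)      # ['T', 'To']
print(twenty_optional)  # ['2', '20']
```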

() Brackets - Expression group
Brackets (parentheses) group characters together so that a following *, + or ? applies to the whole group.

Examples:

Win(dows)? Matches Win and Windows

B(an)*a Matches Ba, Bana, Banana etc.
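
To see how grouping changes what a repetition operator applies to, here is a short sketch in Python's re module (an assumption; mygrep itself may differ in details). The inputs are hypothetical.

```python
import re

# Hypothetical candidates.
candidates = ["Win", "Windows", "Windo", "Ba", "Bana", "Banana", "Ban"]

# The ? applies to the whole group "dows", not just to the final "s".
win = [c for c in candidates if re.fullmatch(r"Win(dows)?", c)]

# The * repeats the whole group "an".
banana = [c for c in candidates if re.fullmatch(r"B(an)*a", c)]

print(win)     # ['Win', 'Windows']
print(banana)  # ['Ba', 'Bana', 'Banana']
```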

[ ] Square brackets - Character group
A string enclosed in square brackets matches any character in that string, but no others. If the first character of the string is a circumflex the expression matches any character except the characters in the string. A range of characters may be specified by two characters separated by a -. These should be given in ASCII order (A-Z, a-z, 0-9 etc.)

Examples:

{[0-9]} Matches {0}, {4}, {5} etc.

\([0-9]+\) Matches (100), (342), (4), (23456) etc.

H[uo]w Matches Huw and How

Gre[^py] Matches Green, Great etc. but not Grep, Grey etc.

[z-a] Matches nothing (the range is not in ASCII order)

^[A-Z] Match lines beginning with an upper-case letter
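
The character group examples can be sketched in Python's re module, used here as an assumed stand-in for mygrep. One known difference: Python rejects a reversed range such as [z-a] with an error instead of matching nothing. The inputs are made up.

```python
import re

# Hypothetical candidates.
candidates = ["Huw", "How", "Haw", "Green", "Great", "Grep", "Grey"]

# [uo] accepts exactly one character, either u or o.
huw_how = [c for c in candidates if re.fullmatch(r"H[uo]w", c)]

# [^py] accepts any single character except p and y.
not_grep = [c for c in candidates if re.fullmatch(r"Gre[^py].*", c)]

# In Python (unlike the behaviour described for mygrep) a reversed
# range is a compile-time error.
try:
    re.compile(r"[z-a]")
    reversed_range_compiles = True
except re.error:
    reversed_range_compiles = False

print(huw_how)   # ['Huw', 'How']
print(not_grep)  # ['Green', 'Great']
```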

\ Backslash - Quote next character
A backslash quotes any character. This allows a search for a character that is usually a regular expression specifier.

Examples:
\$ Matches a dollar sign $
\+ Matches a +
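
A brief sketch of quoting with the backslash, written with Python's re module (assumed to match mygrep's behaviour for these two characters). The sample string is invented, and re.escape is a Python convenience with no claimed mygrep equivalent.

```python
import re

# Hypothetical sample text.
price = "Total: $5 + $7"

# \$ is a literal dollar sign rather than the end-of-line anchor.
dollars = re.findall(r"\$[0-9]+", price)

# \+ is a literal plus sign rather than the "one or more" operator.
plus_signs = re.findall(r"\+", price)

# Python can add the backslashes for you when the text to find is literal.
escaped = re.escape("$5+")

print(dollars)     # ['$5', '$7']
print(plus_signs)  # ['+']
```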

Some predefined Metacharacters:
\xnn Matches char with hex code nn
\x{nnnn} Matches char with hex code nnnn (one byte for plain text and two bytes for Unicode)
\t Matches tab (HT/TAB), same as \x09
\n Matches newline (NL), same as \x0a
\r Matches carriage return (CR), same as \x0d
\f Matches form feed (FF), same as \x0c
\a Matches alarm (bell) (BEL), same as \x07
\e Matches escape (ESC), same as \x1b
\w Matches an alphanumeric character (including "_")
\W Matches a non-alphanumeric character
\d Matches a numeric character
\D Matches a non-numeric
\s Matches any space (same as [ \t\n\r\f])
\S Matches a non space
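
A few of the predefined metacharacters can be tried out with the sketch below. It uses Python's re module, which is an assumption; the sketch sticks to escapes that Python also supports, and Python's \w and \s cover some additional Unicode characters beyond the sets listed above. The sample string is hypothetical.

```python
import re

# Hypothetical sample: an identifier, a tab, a word and a newline.
sample = "id_42\tok\n"

words = re.findall(r"\w+", sample)   # runs of alphanumerics and "_"
digits = re.findall(r"\d+", sample)  # runs of numeric characters
spaces = re.findall(r"\s", sample)   # individual whitespace characters

# \x09 is the hex form of the tab character.
has_tab = re.search(r"\x09", sample) is not None

print(words)    # ['id_42', 'ok']
print(digits)   # ['42']
print(spaces)   # ['\t', '\n']
print(has_tab)  # True
```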

() sub-expressions

The bracketing construct ( ... ) may also be used to define regular expression sub-expressions.

Sub-expressions are numbered based on the left-to-right order of their opening parentheses.
The first sub-expression has number 1. The whole regular expression match has number 0; you can refer to it in a substitution as '$0' or '$&'.

Examples:
(foobar){8,10} matches strings which contain 8, 9 or 10 instances of 'foobar'
foob([0-9]|a+)r matches 'foob0r', 'foob1r', 'foobar', 'foobaar', 'foobaaar' etc.
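
The numbering of sub-expressions can be illustrated with a short sketch in Python's re module, used as an assumed stand-in for mygrep. Note that Python writes the whole match in a substitution as \g<0>, where this document uses $0 or $&. The sample strings are made up.

```python
import re

# Group 0 is the whole match, group 1 the first sub-expression.
m = re.search(r"foob([0-9]|a+)r", "xx foobaar yy")
whole = m.group(0)
first = m.group(1)

# Numbering follows the order of the opening parentheses, left to right.
nested = re.search(r"((a)(b))", "ab").groups()

# Wrap every whole match in angle brackets.
replaced = re.sub(r"foob([0-9]|a+)r", r"<\g<0>>", "foob0r and foobar")

print(whole, first)  # foobaar aa
print(nested)        # ('ab', 'a', 'b')
print(replaced)      # <foob0r> and <foobar>
```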

| alternatives

You can specify a series of alternatives for a pattern using "|" to separate them, so that fee|fie|foe will match any of "fee", "fie", or "foe" in the target string (as would f(e|i|o)e). The first alternative includes everything from the last pattern delimiter ("(", "[", or the beginning of the pattern) up to the first "|", and the last alternative contains everything from the last "|" to the next pattern delimiter. For this reason, it's common practice to include alternatives in parentheses, to minimize confusion about where they start and end.

Alternatives are tried from left to right, so the first alternative for which the entire expression matches is the one that is chosen. This means that alternatives are not necessarily greedy. For example, when matching foo|foot against "barefoot", only the "foo" part will match, as that is the first alternative tried, and it successfully matches the target string. (This might not seem important, but it is important when you are capturing matched text using parentheses.)

Also remember that "|" is interpreted as a literal within square brackets, so if you write [fee|fie|foe] you're really only matching [feio|].

Examples:
foo(bar|foo) matches strings 'foobar' or 'foofoo'.
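
The left-to-right behaviour of alternatives and the literal "|" inside square brackets can be checked with the sketch below. It relies on Python's re module, which is assumed to order alternatives the same way mygrep does; the inputs are hypothetical.

```python
import re

# The first alternative that lets the pattern succeed wins,
# even when a later alternative would match more text.
foo_first = re.search(r"foo|foot", "barefoot").group(0)
foot_first = re.search(r"foot|foo", "barefoot").group(0)

# Grouped alternatives: f, then e, i or o, then e.
fee_fie_foe = [w for w in ["fee", "fie", "foe", "fae"]
               if re.fullmatch(r"f(e|i|o)e", w)]

# Inside square brackets "|" is just another character in the set.
in_class = re.findall(r"[fee|fie|foe]", "f|x")

print(foo_first)    # foo
print(foot_first)   # foot
print(fee_fie_foe)  # ['fee', 'fie', 'foe']
print(in_class)     # ['f', '|']
```

Putting the longer alternative first is one way to get the longer match when both alternatives could succeed at the same position.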
