
INFO PAGE

Regular Expressions

Overview

Regular expressions are a way to match patterns in text and, if desired, replace them. They form a strange little language with lots of short, powerful symbols. They predate Perl (Unix tools such as grep used them earlier), but Perl popularized the syntax, and most languages now support it, including PHP, JavaScript and Java. In JavaScript, you apply regular expressions through the match(), replace(), search() and split() string methods. You can create one with the constructor, new RegExp("pattern"), or with the literal form, /pattern/.
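As a rough illustration of the four string methods and the two ways of creating a regular expression, here is a minimal sketch. The sample text and variable names are made up for this example.

```javascript
// Two ways to build the same pattern: a literal and the constructor.
const literal = /cat/;
const constructed = new RegExp("cat");

const text = "The cat sat near another cat.";

// search() returns the index of the first match, or -1 if there is none.
const firstIndex = text.search(literal);

// match() with the g flag returns an array of every matching substring.
const allMatches = text.match(/cat/g);

// replace() with the g flag replaces every match; without it, only the first.
const replaced = text.replace(/cat/g, "dog");

// split() breaks the string wherever the pattern matches.
const parts = "a, b,c ,d".split(/\s*,\s*/);

console.log(firstIndex);             // 4
console.log(allMatches);             // [ 'cat', 'cat' ]
console.log(replaced);               // The dog sat near another dog.
console.log(parts);                  // [ 'a', 'b', 'c', 'd' ]
console.log(constructed.test(text)); // true
```

The literal form is usually easier to read. The constructor is handy when the pattern has to be built from a string at run time.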

HTML5 also has a pattern attribute for input fields that takes a regular expression as its value (without the surrounding //). Here is an example that checks whether a field starts with the value december:

<input type="text" name="test" pattern="december.*">

Unfortunately, this is case sensitive. In a normal regular expression we have an i modifier, /RegExp/i, to ignore case, but the pattern attribute does not accept modifiers. Also, we would normally start this regular expression with a ^ to match the start of the string (a $ matches the end). But the pattern attribute must match the whole value, from the start to the end of the string. So if we just put december in as the value, it would match only if december was the only thing typed in the field. That is why we need to add .* after the word december, meaning zero or more of any character.
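The whole-value matching described above can be approximated in JavaScript by wrapping the pattern in ^ (start of string) and $ (end of string). This is only a sketch: matchesLikePatternAttribute is a hypothetical helper, not a browser API, and browsers may compile the pattern with extra flags that it ignores. The [Dd] workaround for case sensitivity is likewise just one possible approach.

```javascript
// Hypothetical helper (not a browser API): approximates how a pattern
// attribute is applied by requiring the whole value to match.
// (?: ) is a group that does not capture; it keeps | inside the
// pattern from escaping the ^ and $ anchors.
function matchesLikePatternAttribute(pattern, value) {
  const anchored = new RegExp("^(?:" + pattern + ")$");
  return anchored.test(value);
}

// "december" alone only matches when nothing else is typed.
const onlyWord = matchesLikePatternAttribute("december", "december");
const extraText = matchesLikePatternAttribute("december", "december 25");

// Adding .* allows any characters after the word.
const withDotStar = matchesLikePatternAttribute("december.*", "december 25");

// The match is case sensitive, so a capital D fails...
const upperCase = matchesLikePatternAttribute("december.*", "December 25");

// ...unless the pattern itself lists both cases for that letter.
const eitherCase = matchesLikePatternAttribute("[Dd]ecember.*", "December 25");

console.log(onlyWord, extraText, withDotStar, upperCase, eitherCase);
// true false true false true
```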

Another difference is that the pattern attribute always allows an empty field, even if the pattern asks for one or more characters. To solve this, add the required attribute to the field.
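The interaction between an empty field, the pattern and the required setting can be sketched as follows. The function isFieldValid and its parameters are hypothetical names chosen for this example. It only imitates the behaviour described above and is not how a browser exposes validation.

```javascript
// Hypothetical helper: imitates how an empty value skips the pattern
// check, so only the required setting can reject it.
function isFieldValid(value, pattern, required) {
  if (value === "") {
    // Nothing typed: the pattern is not consulted at all.
    return !required;
  }
  // Something typed: the whole value must match the pattern.
  return new RegExp("^(?:" + pattern + ")$").test(value);
}

// .+ asks for one or more characters, yet an empty field still passes
// unless the field is also marked as required.
const emptyNotRequired = isFieldValid("", ".+", false);
const emptyRequired = isFieldValid("", ".+", true);
const filledRequired = isFieldValid("x", ".+", true);

console.log(emptyNotRequired, emptyRequired, filledRequired);
// true false true
```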

REGULAR EXPRESSION INFORMATION

01 \ Escapes a special character. Eg. \. means a literal . rather than any character.

02 . Any single character (except a new line by default).

03 * Zero or more of the previous item (character, group or list). Eg. a* matches zero or more a characters.

04 + One or more of the previous item.

05 ? Zero or one of the previous item.

06 () A group. Eg. (ab)+ matches one or more repetitions of ab.

07 {n} Exactly n of the previous item.

08 {n,} n or more of the previous item.

09 {n,m} From n to m of the previous item.

10 [ ] Any one character in this list.

11 [^ ] Any one character not in this list.

12 | OR. Eg. (good|bad) matches good or bad.

13 - Range, used inside [ ]. Eg. [a-zA-Z0-9] matches one character from a-z, A-Z or 0-9.

14 \d A digit (0-9).

15 \D Not a digit.

16 \s A whitespace character (space, tab, new line).

17 \S Not a whitespace character.

18 \w Word character. (A-Z, a-z, 0-9, _)

19 \W Not a word character.

20 \t Tab.

21 \n New Line.
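
To try the symbols above, here is a small sketch that runs one sample pattern per symbol through test(). The sample inputs are invented for this example. Each pattern is wrapped in ^ (start of string) and $ (end of string) so that it has to match the whole input.

```javascript
// Each entry: [pattern, input, expected result of pattern.test(input)].
const checks = [
  [/^a\.b$/, "a.b", true],              // \. is a literal dot
  [/^a\.b$/, "axb", false],
  [/^a.b$/, "axb", true],               // . is any single character
  [/^ab*$/, "a", true],                 // * allows zero b characters
  [/^ab+$/, "a", false],                // + needs at least one b
  [/^colou?r$/, "color", true],         // ? makes the u optional
  [/^(ab){2}$/, "abab", true],          // {n} applied to a group
  [/^\d{2,}$/, "7", false],             // {n,} needs two or more digits
  [/^\d{2,4}$/, "2024", true],          // {n,m} allows two to four digits
  [/^[a-zA-Z0-9]+$/, "Abc123", true],   // ranges inside a list
  [/^[^0-9]+$/, "abc1", false],         // [^ ] rejects listed characters
  [/^(good|bad)$/, "bad", true],        // | chooses between alternatives
  [/^\w+\s\w+$/, "hello world", true],  // \w word characters, \s whitespace
  [/^\S+$/, "hello world", false],      // \S rejects the space
  [/^\D+$/, "abc", true],               // \D is anything but a digit
  [/^\W$/, "!", true],                  // \W is a non-word character
  [/^a\tb\nc$/, "a\tb\nc", true],       // \t tab and \n new line
];

// Collect any entry whose actual result differs from the expected one.
const failures = checks.filter(
  ([pattern, input, expected]) => pattern.test(input) !== expected
);

console.log(failures.length === 0 ? "all checks passed" : failures);
```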