Regular expression quick start guide

From TBwiki

Revision as of 17:59, 26 January 2010

Quick Reference Table

Regular Expression Pattern
    Explanation
    Example

Metacharacters  [\^$.|?*+()
    Special characters used in regex.
      • Must be escaped with a backslash "\" to be used as literal characters.

Literal characters
    All characters (except the metacharacters) match a single instance of themselves.
      • { and } are literal characters, unless they're part of a valid regular expression token (e.g. the {n} quantifier).
    Example: /a/ matches "a"

[characters]
    Character class or character set. A character class matches a single character out of all the possibilities offered by the class.
    Example: /[0-9]/ matches a single digit

[\d]
    Shorthand character class matching digits. Same as [0-9].
    Example: /[\d]/ matches a single digit

.
    The dot matches (almost) any single character.
    Example: /a.c/ matches both "a4c" and "ayc"

^
    Matches at the start of the string the regex pattern is applied to. Matches a position rather than a character.

$
    Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character.

{m,n}
    Matches at least “m” and at most “n” occurrences of the preceding character, character class or group.

*
    Matches zero or more occurrences of the preceding character, character class or group.

+
    Matches one or more occurrences of the preceding character, character class or group.

?
    Matches zero or one occurrence of the preceding character, character class or group.

()
    Parentheses are used for grouping, or to create a capturing group.

\0, \1, \2, ...
    Substitutes the value matched by the nth grouped sub-expression; used in remapped fields.
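
The entries in the table can be tried out in any regex engine. The block below is a minimal sketch using Python's standard re module, which is an assumption made for illustration: the engine behind this wiki's product may differ in details. The helpers whole and anywhere are hypothetical names, not part of any documented API.

```python
import re

def whole(pattern: str, text: str) -> bool:
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

def anywhere(pattern: str, text: str) -> bool:
    """True if the pattern matches somewhere inside the text."""
    return re.search(pattern, text) is not None

# Each tuple: (description, actual result, expected result)
checks = [
    ("literal a",               whole(r"a", "a"),         True),
    ("escaped + is a literal",  whole(r"1\+1", "1+1"),    True),
    ("unescaped + is a repeat", whole(r"1+1", "1+1"),     False),
    ("[0-9] matches one digit", whole(r"[0-9]", "7"),     True),
    ("shorthand digit class",   whole(r"[\d]", "7"),      True),
    ("dot matches a4c",         whole(r"a.c", "a4c"),     True),
    ("dot matches ayc",         whole(r"a.c", "ayc"),     True),
    ("^ only at the start",     anywhere(r"^bc", "abc"),  False),
    ("$ only at the end",       anywhere(r"bc$", "abc"),  True),
    ("{2,3} accepts aa",        whole(r"a{2,3}", "aa"),   True),
    ("{2,3} rejects aaaa",      whole(r"a{2,3}", "aaaa"), False),
    ("* allows zero b",         whole(r"ab*", "a"),       True),
    ("+ needs at least one b",  whole(r"ab+", "a"),       False),
    ("? makes b optional",      whole(r"ab?", "ab"),      True),
]

for description, actual, expected in checks:
    print(f"{description}: {actual}")

# Parentheses capture the text matched by each group.
groups = re.fullmatch(r"(\d{3})-(\d{4})", "555-1234").groups()
print(groups)  # ('555', '1234')
```

The pair of "1+1" checks shows why metacharacters need a backslash: without it, + is read as a quantifier and the literal plus sign in the text is never matched.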


Quick References

Text Patterns and Matches


Metacharacters

Literal Characters


Character Classes or Character Sets


Shorthand Character Classes


The Dot Matches (Almost) Any Character


Repetition


Optional


Anchors


Alternation


Grouping and Capturing Group


Examples

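Here are some examples, written as /pattern/replacement/. In the replacement, \1, \2, ... insert the text captured by the corresponding group.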

Add 2720 prefix:

/(\d+)/2720\1/

or

/([0-9]*)/2720\1/
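
The two prefix rules above behave the same on a digit string but differ when there are no digits to match. The following is a rough sketch in Python, assuming that a /pattern/replacement/ rule rewrites only the first match; the remap helper is hypothetical and the real remapping engine may apply rules differently.

```python
import re

def remap(pattern: str, replacement: str, value: str) -> str:
    """Apply one /pattern/replacement/ rule to value (hypothetical helper).

    Assumption: only the first match is rewritten, hence count=1.
    """
    return re.sub(pattern, replacement, value, count=1)

# Both rules add the prefix to a plain digit string.
with_plus = remap(r"(\d+)", r"2720\1", "5551234")
with_star = remap(r"([0-9]*)", r"2720\1", "5551234")
print(with_plus)  # 27205551234
print(with_star)  # 27205551234

# On an empty value the rules diverge:
# + needs at least one digit, so nothing matches and the value is unchanged;
# * accepts zero digits, so the empty match is replaced by the prefix alone.
empty_plus = remap(r"(\d+)", r"2720\1", "")
empty_star = remap(r"([0-9]*)", r"2720\1", "")
print(repr(empty_plus))  # ''
print(repr(empty_star))  # '2720'
```

Under these assumptions, the + form is the safer choice when an empty field should stay empty.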

Strip the first 4 digits:

/([0-9]{4})([0-9]*)/\2/

Strip # and the first 7 digits:

/([#])([0-9]{7})([0-9]*)/\3/
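
The two strip rules above keep only the last capturing group and discard the rest. Here is a small sketch in Python, again assuming first-match substitution; the remap helper and the sample numbers are made up for illustration and are not taken from any real configuration.

```python
import re

def remap(pattern: str, replacement: str, value: str) -> str:
    """Apply one /pattern/replacement/ rule to value (hypothetical helper).

    Assumption: only the first match is rewritten, hence count=1.
    """
    return re.sub(pattern, replacement, value, count=1)

# Strip the first 4 digits: group 1 is dropped, group 2 is kept.
strip4 = remap(r"([0-9]{4})([0-9]*)", r"\2", "27205551234")
print(strip4)  # 5551234

# Strip # and the first 7 digits: groups 1 and 2 are dropped, group 3 is kept.
strip_hash7 = remap(r"([#])([0-9]{7})([0-9]*)", r"\3", "#12345675551234")
print(strip_hash7)  # 5551234

# Fewer than 4 digits: {4} cannot match, so the value is left unchanged.
too_short = remap(r"([0-9]{4})([0-9]*)", r"\2", "123")
print(too_short)  # 123

# Without a ^ anchor the match may start anywhere in the value.
unanchored = remap(r"([0-9]{4})([0-9]*)", r"\2", "x27205551234")
anchored = remap(r"^([0-9]{4})([0-9]*)", r"\2", "x27205551234")
print(unanchored)  # x5551234
print(anchored)    # x27205551234
```

If a rule should only apply when the digits sit at the very start of the value, adding ^ to the pattern makes that explicit in engines that do not anchor implicitly.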

Web Online Tools

References

