Regular expression quick start guide

From TBwiki
Latest revision as of 10:04, 31 January 2014

Quick Reference Table

{| cellpadding="2" border="1"
|-
! width="250" align="left" | Regular Expression Pattern
! align="left" | Explanations
! align="left" | Examples
|-
| Metacharacters: [\^$.&#124;?*+()
|
Special characters used in regex.

*Must be escaped with a backslash "\" to be used as literal characters.
|
|-
| Literal characters
|
All characters (except the metacharacters) match a single instance of themselves.

*{ and } are literal characters, unless they are part of a valid regular expression token (e.g. the {n} quantifier).
| ''/a/'' matches "a"
|-
| [characters]
| Character class (character set). Matches a single character out of all the possibilities listed in the class.
| ''/[0-9]/'' matches a single digit
|-
| [\d]
| \d is the shorthand character class for digits. Same as [0-9].
| ''/[\d]/'' matches a single digit
|-
| .
| The dot matches any single character (in most regex flavors, any character except a line break).
| ''/a.c/'' matches both "a4c" and "ayc"
|-
| ^
| Matches at the start of the string the regex pattern is applied to. Matches a position rather than a character.
|
|-
| $
| Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character.
|
|-
| {m,n}
| Matches at least "m" and at most "n" occurrences of the preceding character, character class or group.
|
|-
| *
| Matches '''zero''' or '''more''' occurrences of the preceding character, character class or group.
|
|-
| +
| Matches '''one''' or '''more''' occurrences of the preceding character, character class or group.
|
|-
| ?
| Matches '''zero''' or '''one''' occurrence of the preceding character, character class or group.
|
|-
| ()
| Parentheses group part of a regular expression and capture the text it matches (capturing group).
|
|-
| \0, \1, \2, ...
| Substitutes the value matched by the nth grouped sub-expression; used in remapped fields.
|
|-
| (?!pattern)
| Not, as in "everything except this" (negative lookahead): matches only at positions where ''pattern'' does not match.
|
|}
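The entries in the table above can be tried out with a short script. The sketch below uses Python's re module purely as a stand-in engine (an assumption; the regex engine the patterns are ultimately used with may differ in details), and the helper name found is hypothetical.

```python
import re

def found(pattern, text):
    """Return True if pattern matches anywhere in text (hypothetical helper)."""
    return re.search(pattern, text) is not None

# Character class and shorthand: both match a single digit.
print(found(r"[0-9]", "a7"), found(r"[\d]", "a7"))          # True True
# Dot: any single character between "a" and "c".
print(found(r"a.c", "a4c"), found(r"a.c", "ayc"))           # True True
# Anchors: the whole string must be exactly three digits.
print(found(r"^\d{3}$", "123"), found(r"^\d{3}$", "1234"))  # True False
# Quantifiers: {m,n}, *, + and ?.
print(found(r"^\d{2,4}$", "12345"))                         # False
print(found(r"^\d*$", ""), found(r"^\d+$", ""))             # True False
print(found(r"^#?\d+$", "#42"), found(r"^#?\d+$", "42"))    # True True
# Escaped metacharacter: "\+" matches a literal plus sign.
print(found(r"^\+1", "+1514"))                              # True
# Negative lookahead: digits that do not start with "00".
print(found(r"^(?!00)\d+$", "0012"), found(r"^(?!00)\d+$", "5141234"))  # False True
```

Without the ^ and $ anchors, a pattern such as \d{3} also matches inside a longer number, which is why the examples in the next section anchor both ends.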


Examples

Here are some examples:

Add 2720 prefix:

  /^(\d+)$/2720\1/

or

  /^([0-9]*)$/2720\1/
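The two prefix expressions differ only in the quantifier. As a rough illustration, the sketch below applies both with Python's re.sub, treating the part between the first two slashes as the pattern and the part after it as the replacement. The function name add_prefix and the sample numbers are made up for this example, and Python is only an assumed stand-in for the actual engine.

```python
import re

def add_prefix(number, pattern=r"^(\d+)$"):
    """Prepend 2720 to a number made only of digits (hypothetical helper)."""
    # \1 in the replacement re-inserts the digits captured by group 1.
    return re.sub(pattern, r"2720\1", number)

print(add_prefix("5551234"))            # 27205551234
print(add_prefix("555-1234"))           # 555-1234 (no match, left unchanged)
print(add_prefix("", r"^([0-9]*)$"))    # 2720 (* also matches an empty number)
print(repr(add_prefix("")))             # '' (+ requires at least one digit)
```

The + form leaves an empty number alone, while the * form turns it into a bare prefix; which behavior is wanted depends on the use case.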

Strip the first 4 digits:

  /^([0-9]{4})([0-9]*)$/\2/
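Here the first group captures the four digits to drop and the second group captures the rest, so the replacement keeps only \2. A minimal sketch in Python (an assumed stand-in engine; strip_first_four and the sample numbers are hypothetical):

```python
import re

def strip_first_four(number):
    """Drop the first four digits of an all-digit number (hypothetical helper)."""
    # Group 1 = the four leading digits, group 2 = everything after them.
    return re.sub(r"^([0-9]{4})([0-9]*)$", r"\2", number)

print(strip_first_four("27205551234"))  # 5551234
print(strip_first_four("123"))          # 123 (fewer than 4 digits: no match)
print(repr(strip_first_four("1234")))   # '' (exactly 4 digits: nothing remains)
```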

Strip # and the first 7 digits:

  /^([#])([0-9]{7})([0-9]*)$/\3/
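With three groups it can help to look at what each one captures before writing the replacement. The following sketch prints the groups and then keeps only the third. It assumes Python's re module as a stand-in engine; the names PATTERN and strip_hash_and_seven and the sample number are illustrative only.

```python
import re

# Group 1 = "#", group 2 = the next seven digits, group 3 = the remaining digits.
PATTERN = re.compile(r"^([#])([0-9]{7})([0-9]*)$")

def strip_hash_and_seven(number):
    """Keep only what follows '#' and the first seven digits (hypothetical helper)."""
    return PATTERN.sub(r"\3", number)

match = PATTERN.match("#12345675551234")
print(match.groups())                           # ('#', '1234567', '5551234')
print(strip_hash_and_seven("#12345675551234"))  # 5551234
print(strip_hash_and_seven("12345675551234"))   # unchanged: no leading '#'
```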

Web Online Tools

  • Regex builder tool (with replace): www.gskinner.com/RegExr

    Tips for use:

    1. There is no need to enclose the regular expression in '/'.
    2. Replace: the expression must be split into two parts, with '\' replaced by '$' in the replacement. For example, for '/^([0-9]*)$/2720\1/', enter '^([0-9]*)$' on the first line and '2720$1' on the second line.

  • Ruby regular expression editor and tester: rubular.com
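The conversion described in the second tip (split the expression into pattern and replacement, then turn '\n' backreferences into '$n') can be sketched as a small helper. This is only an illustration under stated assumptions: the function name split_expression is hypothetical, and it assumes that neither the pattern nor the replacement contains a '/' character.

```python
import re

def split_expression(expression):
    """Split '/pattern/replacement/' into (pattern, replacement using $n)."""
    if not (expression.startswith("/") and expression.endswith("/")):
        raise ValueError("expected the form /pattern/replacement/")
    # Assumption: no '/' inside the pattern or the replacement.
    parts = expression[1:-1].split("/")
    if len(parts) != 2:
        raise ValueError("expected exactly one pattern and one replacement")
    pattern, replacement = parts
    # Rewrite backreferences such as \1 as $1 for tools that use the $ style.
    return pattern, re.sub(r"\\(\d+)", r"$\1", replacement)

print(split_expression(r"/^([0-9]*)$/2720\1/"))
# ('^([0-9]*)$', '2720$1')
```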
