ABBYY FineReader 10 User's Guide
Regular Expressions
The table below lists the regular expressions that can be used to create a new language.
Item name                  Conventional regular   Usage examples and explanations
                           expression symbol
Any Character              .                      c.t denotes "cat," "cot," etc.
Character from Group       []                     [b-d]ell denotes "bell," "cell," and "dell"
                                                  [ty]ell denotes "tell" and "yell"
Character not from Group   [^]                    [^y]ell denotes "dell," "cell," "tell," but forbids "yell"
                                                  [^n-s]ell denotes "bell," "cell," but forbids "nell," "oell,"
                                                  "pell," "qell," "rell," and "sell"
Or                         |                      c(a|u)t denotes "cat" and "cut"
0 or More Matches          *                      10* denotes numbers 1, 10, 100, 1000, etc.
1 or More Matches          +                      10+ allows numbers 10, 100, 1000, etc., but forbids 1
Letter or Digit            [0-9a-zA-Zа-яА-Я]      [0-9a-zA-Zа-яА-Я] allows any single letter or digit
                                                  [0-9a-zA-Zа-яА-Я]+ allows any word
Capital Latin Letter       [A-Z]
Small Latin Letter         [a-z]
Capital Cyrillic Letter    [А-Я]
Small Cyrillic Letter      [а-я]
Digit                      [0-9]
Space                      \s
                           @                      Reserved.
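The table entries can be tried out with a short script. The following is only a minimal sketch that assumes Python's `re` module as a stand-in; it is not the engine FineReader uses, and the helper name `check` and the sample words are illustrative.

```python
import re

# Pattern -> (strings that should match, strings that should not).
# Python's `re` is assumed purely for illustration; it is not ABBYY's engine.
EXAMPLES = {
    r"c.t": (["cat", "cot"], ["ct"]),
    r"[b-d]ell": (["bell", "cell", "dell"], ["tell"]),
    r"[ty]ell": (["tell", "yell"], ["bell"]),
    r"[^n-s]ell": (["bell", "cell"], ["nell", "sell"]),
    r"c(a|u)t": (["cat", "cut"], ["cot"]),
    r"10*": (["1", "10", "1000"], ["0"]),
    r"10+": (["10", "100"], ["1"]),
    r"[0-9a-zA-Zа-яА-Я]+": (["word42", "слово"], ["two words"]),
}


def check(pattern, text):
    """Return True if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


for pattern, (accepted, rejected) in EXAMPLES.items():
    assert all(check(pattern, s) for s in accepted)
    assert not any(check(pattern, s) for s in rejected)
    print(pattern, "accepts", accepted, "and rejects", rejected)
```

Whole-string matching (`re.fullmatch`) is used because a language definition describes complete words, not fragments found inside longer text.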
Note:
1. To use a regular expression symbol as a normal character, precede it with a backslash. For example, [t-v]x+ stands for tx, txx,
txxx, etc., ux, uxx, etc., but \[t-v\]x+ stands for [t-v]x, [t-v]xx, [t-v]xxx, etc.
2. To group regular expression elements, use parentheses. For example, (a|b)+|c stands for c or any combination like abbbaaabbb,
ababab, etc. (a word of any non-zero length in which there may be any number of a's and b's in any order), while a|b+|c stands
for a, c, and b, bb, bbb, etc.
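Both notes can be checked with a few assertions. This is a hedged sketch that again assumes Python's `re` module, whose escaping and grouping rules agree with the two notes for these particular patterns; the helper `matches` is a hypothetical name.

```python
import re


def matches(pattern, text):
    """Return True if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# Note 1: unescaped brackets define a character group ...
assert matches(r"[t-v]x+", "txx")
assert matches(r"[t-v]x+", "ux")
assert not matches(r"[t-v]x+", "[t-v]x")
# ... while escaped brackets are ordinary characters.
assert matches(r"\[t-v\]x+", "[t-v]xx")
assert not matches(r"\[t-v\]x+", "tx")

# Note 2: with parentheses, + applies to the whole (a|b) group.
assert matches(r"(a|b)+|c", "abbbaaabbb")
assert matches(r"(a|b)+|c", "c")
# Without parentheses, + applies only to b.
assert matches(r"a|b+|c", "bbb")
assert not matches(r"a|b+|c", "ab")
```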
Examples
You are recognizing a table with three columns: the first for the birth date, the second for the name, and the third for the e-mail
address. You can create new languages, Date and Address, and set regular expressions for them.
Regular expression for dates:
The number denoting a day may consist of one digit (1, 2, etc.) or two digits (02, 12), but it cannot be zero (00 or 0). The regular
expression for the day should then look like this: ((|0)[1-9])|([12][0-9])|(30)|(31).
The regular expression for the month should look like this: ((|0)[1-9])|(10)|(11)|(12).
The regular expression for the year, written with either four digits (19xx or 20xx) or two digits, should look like this:
((19)[0-9][0-9])|((20)[0-9][0-9])|([0-9][0-9]).
What is left is to combine all this together and separate the numbers by periods (like 1.03.1999). The period is a regular expression
symbol, so you must put a backslash (\) before it. Each of the three parts must also be enclosed in parentheses so that the "or"
symbols inside it do not apply across the periods. The regular expression for the full date should then look like this:
(((|0)[1-9])|([12][0-9])|(30)|(31))\.(((|0)[1-9])|(10)|(11)|(12))\.(((19)[0-9][0-9])|((20)[0-9][0-9])|([0-9][0-9]))
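The effect of grouping on the combined date expression can be seen in a short sketch. It assumes Python's `re` module rather than FineReader's own engine. The year pattern is written as 19xx, 20xx, or two digits, which is one reading of the intent. The names `DAY`, `MONTH`, `YEAR`, `DATE`, `UNGROUPED`, and `is_date` are illustrative.

```python
import re

# Building blocks for day, month and year (year form is an assumption).
DAY = r"((|0)[1-9])|([12][0-9])|(30)|(31)"
MONTH = r"((|0)[1-9])|(10)|(11)|(12)"
YEAR = r"((19)[0-9][0-9])|((20)[0-9][0-9])|([0-9][0-9])"

# Each part is wrapped in its own parentheses before joining with "\.",
# so the "|" alternatives stay inside their part.
DATE = re.compile(rf"({DAY})\.({MONTH})\.({YEAR})")

# Joining the parts without the extra parentheses lets "|" split the
# whole expression into unrelated alternatives.
UNGROUPED = re.compile(rf"{DAY}\.{MONTH}\.{YEAR}")


def is_date(text):
    """Return True if the whole string matches the grouped date pattern."""
    return DATE.fullmatch(text) is not None


assert is_date("1.03.1999")
assert is_date("31.12.99")
assert not is_date("00.03.1999")   # day cannot be zero
assert not is_date("1.13.1999")    # there is no month 13

# The ungrouped form accepts a lone day and rejects a real date.
assert UNGROUPED.fullmatch("7") is not None
assert UNGROUPED.fullmatch("1.03.1999") is None
```

The pattern checks only the shape of the date: a string such as 31.02.1999 still matches, because the day and month are validated independently.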
Regular expression for e-mail addresses:
[a-zA-Z0-9_\-\.]+\@[a-z0-9\.\-]+
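The e-mail expression can be exercised the same way. This is a minimal sketch that assumes Python's `re` module. Because @ is not a reserved symbol there, it is written without the backslash. The names `EMAIL` and `is_email` and the sample addresses are made up for illustration.

```python
import re

# Local part: Latin letters, digits, underscore, hyphen, period.
# Domain part: small Latin letters, digits, period, hyphen.
# "@" is unescaped here because Python's `re` does not reserve it.
EMAIL = re.compile(r"[a-zA-Z0-9_\-.]+@[a-z0-9.\-]+")


def is_email(text):
    """Return True if the whole string matches the e-mail pattern."""
    return EMAIL.fullmatch(text) is not None


assert is_email("john.smith@example.com")
assert is_email("John_Doe-1@mail.example.org")
assert not is_email("user@Example.com")        # capital letter in the domain
assert not is_email("no-at-sign.example.com")  # the @ separator is missing
```

The pattern is deliberately loose: it restricts which characters may appear on each side of the @ but does not check that the domain is well formed.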