Regular expressions are a kind of filter for finding text strings that match given conditions. The built-in regular expression tester lets you build rules quickly without going into all the subtleties of writing them by hand.
Extracting information from site pages;
Filtering data in lists and tables;
Searching for emails and/or links to confirm registration;
Searching for a specific fragment in text;
Searching for lines to delete from lists;
And many other uses.
To build regular expressions, you can use the Regular Expression Tester helper. It is available on the top panel of the program or in the menu Tools → Regexp tester.
You can work on multiple regexes at the same time in different tabs. The regular expression text is used as the tab name.
All expressions that you tested using the Test button are saved here.
This will contain the text of the regular expression. You can edit the text in this field.
When you change the fields and checkboxes in the Regular Expressions Wizard group, all manual edits to the expression text are lost!
After clicking, the expression from the Regular Expression Text field is applied to the Text for processing. The outcome appears in the Processing result field.
This text is searched for, but will not be included in the result of the expression.
This text is included in the result of the work.
Enabling/disabling multi-line search.
When this option is enabled, the result is the shortest substring that matches the expression.
Enter the text in which the search will be performed in this field.
Data in this field can be entered directly from the project variable: right-click => Set value from variable.
The ability to set a value from a variable was added in ZennoPoster 7.4.0.0.
The drop-down list will display the variables of the currently active project.
Controls whether line breaks, tabs, and some other characters are displayed as special characters.
Turned off
Turned on
The result of applying the regular expression to the text will be displayed here.
When the regular expression uses groups, the result for each group is shown here. An example of such expressions can be found in the description of the action Text processing => Regex => To Variable.
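The options described above (shortest match, text before and after the match, and groups) can be tried outside the tester as well. The following is a minimal sketch in Python, not the tester's own implementation; the sample strings are made up, and using lookbehind/lookahead to model the "before" and "after" fields is an assumption about how such fields are typically expressed.

```python
import re

text = '<b>first</b> and <b>second</b>'

# Greedy: .* grabs as much as possible, up to the LAST closing tag.
greedy = re.findall(r'<b>.*</b>', text)

# Lazy ("shortest match"): .*? stops at the FIRST closing tag.
lazy = re.findall(r'<b>.*?</b>', text)

# Text before/after is searched for but excluded from the result.
# Lookbehind and lookahead are one way to express that (assumption).
inner = re.findall(r'(?<=<b>).*?(?=</b>)', text)

# Groups: each parenthesized part is reported separately.
m = re.search(r'(\w+)@(\w+\.\w+)', 'contact: admin@example.com')
groups = m.groups()

print(greedy)  # ['<b>first</b> and <b>second</b>']
print(lazy)    # ['<b>first</b>', '<b>second</b>']
print(inner)   # ['first', 'second']
print(groups)  # ('admin', 'example.com')
```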
Let's take a specific and common task as an example: link parsing. Suppose we have the HTML of some DIV or the entire DOM of the page, and we need to extract all links from this code and save them to a list.
We insert the source code in which we will search for links into the field (you can quickly insert the code of the current active tab into the Tester using the Page Text Viewer window).
Specify the substring that usually comes before the link, namely the tag fragment <a href=".
Add a closing quote as the text that follows the link. Do not forget the "Shortest match" checkbox, because we need only the string between the nearest pair of quotes.
Press the "Test" button and the required list of links will appear in the "Processing Result" field (if there are matches). If something goes wrong, try changing your search terms.
We can copy the finished regular expression and use it in our template, for example in the action Text Processing -> Regex.
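The link-parsing walkthrough can be reproduced as a short script. This is a sketch under assumptions: the HTML snippet is invented, and the pattern is one plausible form of what the steps describe (text before the link, a lazy match, a closing quote after it), not necessarily the exact expression the tester generates.

```python
import re

# Hypothetical page fragment used only for illustration.
html = '<div><a href="https://example.com/one">One</a> <a href="/two">Two</a></div>'

# (?<=<a href=")  text that must come before the link (not included)
# .*?             shortest possible match
# (?=")           closing quote that must follow (not included)
link_pattern = r'(?<=<a href=").*?(?=")'

links = re.findall(link_pattern, html)
print(links)  # ['https://example.com/one', '/two']
```

Note that this simple pattern only finds links where href is the first attribute of the tag; real pages may need a looser "before" text.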
The regular expression searches for as many substrings as there are in the text. If you need to take a specific match number, use ranges.
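To illustrate the idea of taking a specific match out of all the matches found, here is a small Python sketch. Python list indexing and slicing stand in for the program's ranges here; this is an analogy, and the sample text is made up.

```python
import re

text = 'order 15, order 27, order 42, order 8'

# All matches, in the order they occur in the text.
all_numbers = re.findall(r'\d+', text)

first = all_numbers[0]      # first match only
last = all_numbers[-1]      # last match only
middle = all_numbers[1:3]   # second and third matches

print(all_numbers)  # ['15', '27', '42', '8']
print(first, last, middle)
```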
Most characters in a regex represent themselves, with the exception of the special characters [ ] \ / ^ $ . | ? * + ( ) { }, which can be escaped with \ (backslash) to represent themselves as text characters. That is, the simplest regular expression can be written like this: abc, which will match the string abc.
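The effect of escaping is easiest to see by comparing an escaped and an unescaped special character. A minimal Python sketch with an invented sample string follows; `re.escape` is a Python standard library helper, not a feature of the tester.

```python
import re

price_text = 'Total: 3.50 (USD), not 3x50'

# Unescaped dot matches ANY character, so "3x50" is found too.
unescaped = re.findall(r'3.50', price_text)

# Escaped dot matches only a literal ".".
escaped = re.findall(r'3\.50', price_text)

# re.escape adds the backslashes for you when the text is meant literally.
literal = re.escape('(USD)')
found = re.findall(literal, price_text)

print(unescaped)  # ['3.50', '3x50']
print(escaped)    # ['3.50']
print(found)      # ['(USD)']
```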
Special character | Meaning | Example | Matches |
---|---|---|---|
* | Zero or more repetitions | | |
. | Any single character except the newline character | | |
+ | One or more repetitions | | |
? | Zero or one repetition | | |
\| | OR operator | | |
() | Grouping | | |
[] | Character set: any one of the listed characters | | |
[^] | Any one character that is not in the listed set | | |
- | Character range (used inside square brackets) | | |
^ | Start of line | | |
$ | End of line | | aaa aa |
{} | Number of repetitions of the preceding element | | |
\ | Escaping of special characters | | |
\b | Word boundary | | aa |
\B | Not a word boundary | | a |
\s | Whitespace character | | |
\S | Non-whitespace character | | |
\d | Digit | | abc |
\D | Non-digit character | | |
\w | Word character: a letter, digit, or _ | | |
\W | Any character other than a letter, digit, or _ | | 123 |
\r | Carriage return | | |
\n | Line feed (newline) | | |
\t | Tab character | | |
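The special characters listed in the table can be checked one by one with short patterns. The sketch below uses Python and invented sample strings; the helper name `matches` is hypothetical, and other regex engines may differ in details.

```python
import re

def matches(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)

star = matches(r'ab*', 'a ab abbb')                   # ['a', 'ab', 'abbb']
plus = matches(r'ab+', 'a ab abbb')                   # ['ab', 'abbb']
any_char = matches(r'a.c', 'abc a-c ac')              # ['abc', 'a-c']
optional = matches(r'colou?r', 'color colour')        # ['color', 'colour']
either = matches(r'cat|dog', 'cat dog bird')          # ['cat', 'dog']
char_set = matches(r'gr[ae]y', 'gray grey groy')      # ['gray', 'grey']
negated = matches(r'[^0-9]+', 'ab12cd')               # ['ab', 'cd']
char_range = matches(r'[a-c]+', 'abcxyz')             # ['abc']
count = matches(r'a{2,3}', 'a aa aaaa')               # ['aa', 'aaa']
whole_word = matches(r'\bcat\b', 'cat concat cats')   # ['cat']
digits = matches(r'\d+', 'abc 123')                   # ['123']
first_word = matches(r'^\w+', 'hello world')          # ['hello']
last_word = matches(r'\w+$', 'hello world')           # ['world']
```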
Modifiers are in effect from the moment they occur until the end of the regular expression or the opposite modifier.
Modifier | Description |
---|---|
(?i) | Turns on case-insensitive matching. |
(?-i) | Turns off case-insensitive matching. |
(?s) | Turns on the mode in which the dot matches line break characters. |
(?-s) | Turns off the mode in which the dot matches line break characters. |
(?m) | Multi-line search: ^ and $ match after and before newlines. |
(?-m) | ^ and $ match only the beginning and end of the text. |
(?x) | Turns on the mode that ignores whitespace between parts of the regular expression and allows you to use # for comments. |
(?-x) | Turns off the mode that ignores whitespace and allows comments. |
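A short Python sketch of what each modifier changes is given below, with invented sample strings. One difference to be aware of: Python's `re` module expects global modifiers such as `(?i)` at the very start of the pattern and does not accept a standalone `(?-i)` in the middle, so the scoped form `(?i:...)` is used here as the closest equivalent of switching a modifier on for only part of an expression.

```python
import re

# (?i): case-insensitive matching.
ci = re.findall(r'(?i)hello', 'Hello HELLO hello')

# (?s): the dot also matches line breaks.
dot_default = re.findall(r'a.b', 'a\nb')
dot_all = re.findall(r'(?s)a.b', 'a\nb')

# (?m): ^ matches at the start of every line, not only the text.
text_start = re.findall(r'^\w+', 'one\ntwo\nthree')
line_starts = re.findall(r'(?m)^\w+', 'one\ntwo\nthree')

# (?x): whitespace in the pattern is ignored and # starts a comment.
verbose = re.findall(r'''(?x)
    \d{3}   # first block of digits
    -       # separator
    \d{4}   # second block of digits
''', 'call 555-1234')

# Scoped modifier: only "abc" is case-insensitive, "def" is not.
scoped = re.findall(r'(?i:abc)def', 'ABCdef ABCDEF abcdef')

print(ci, dot_default, dot_all, text_start, line_starts, verbose, scoped)
```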
Lookaround searches for a piece of text while "looking through" (but not including in the match) the surrounding text located before or after it. Negative lookaround is used less often; it checks that the specified text does not occur before or after the desired fragment.
Representation | Lookaround type | Example | Matches |
---|---|---|---|
(?=pattern) | Positive lookahead | | Louis XV, Louis XVI, |
(?!pattern) | Negative lookahead | | |
(?<=pattern) | Positive lookbehind | | Sergey |
(?<!pattern) | Negative lookbehind | | Sergey Ivanov, Igor |
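The four lookaround types can be compared on a single sample string. This is an illustrative Python sketch with invented data, not the examples from the original table. Python requires a fixed-width pattern inside a lookbehind, which the patterns below satisfy.

```python
import re

text = 'cost $100, weight 200kg, refund $300'

# Positive lookahead: digits that are followed by "kg".
before_kg = re.findall(r'\d+(?=kg)', text)

# Negative lookahead: digit runs NOT followed by another digit or "k".
not_before_kg = re.findall(r'\d+(?![\dk])', text)

# Positive lookbehind: digits that are preceded by "$".
after_dollar = re.findall(r'(?<=\$)\d+', text)

# Negative lookbehind: digit runs NOT preceded by a digit or "$".
not_after_dollar = re.findall(r'(?<![\d$])\d+', text)

print(before_kg)         # ['200']
print(not_before_kg)     # ['100', '300']
print(after_dollar)      # ['100', '300']
print(not_after_dollar)  # ['200']
```

In each case the surrounding text ("kg" or "$") is checked but never becomes part of the result.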
Examples of useful regexps for quickly solving the most common problems.
Email address:
(?i)[A-Z0-9._%+-]+@[A-Z0-9-]+.+.[A-Z]{2,4}
Phone number:
\+?(\d{1,3})?[- .]?\(?(?:\d{2,4})\)?[- .]?[\d-]{5,9}
IP address:
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
URL:
(https?:\/\/)?([\w\.]+)\.([a-z]{2,6}\.?)(\/[\w\.]*)*\/?
File name with extension at the end of a Windows path:
(?<=\\)[^\.\\]*(\.[^\.]+){1,}$
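As a quick check of two of the ready-made expressions, the sketch below applies the IP address pattern and the file name pattern in Python. The log line and the path are invented sample data, and results in other regex engines may differ.

```python
import re

# IP address pattern, copied from the list above.
IP_PATTERN = (r'(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}'
              r'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)')

# File-name pattern, copied from the list above.
FILE_PATTERN = r'(?<=\\)[^\.\\]*(\.[^\.]+){1,}$'

log_line = 'Connections from 192.168.0.1 and 10.0.0.255 were accepted'
ips = re.findall(IP_PATTERN, log_line)

path = r'C:\Users\docs\report.pdf'
file_match = re.search(FILE_PATTERN, path)
file_name = file_match.group(0) if file_match else None

print(ips)        # ['192.168.0.1', '10.0.0.255']
print(file_name)  # report.pdf
```

The IP pattern has no word boundaries, so it can also match part of a longer digit sequence; wrap it in \b ... \b if that matters for your data.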
If you do not know how to write a regular expression for your situation, ask our community for help on the forum in the Newbie Questions section or in the special topic Regular expressions for all occasions.