Regular Expression Tester (ZD)
This article is copied from the ZennoPoster reference because these actions are identical in both programs. Original: Regular Expression Tester
This article may contain screenshots, internal reference links, and other material that belong to ZennoPoster.
Please read the Terms of Use for Materials on ZennoLab
Description
Regular expressions are a kind of filter for finding text strings that match the required conditions. The built-in regular expression tester lets you quickly create rules without going into all the subtleties of writing them by hand.
Where are regular expressions used?
Extracting information from site pages
Filtering data in lists and tables
Searching for emails and/or links to confirm registration
Searching for a specific fragment in text
Searching for lines to delete in lists
And many other tasks
How to quickly compose a regular expression in ZennoPoster?
To compose them, you can use the Regular Expression Tester helper. It can be found on the top panel of the program or in the menu Tools → Regexp tester.
Regular Expression Tester Window
Tabs
You can work on multiple regexes at the same time in different tabs. The regular expression text is used as the tab name.
History
All expressions that you tested using the Test button are saved here.
Regular expression text
This field contains the text of the regular expression. You can edit the text in this field.
When you make changes to the fields and checkboxes from the Regular Expressions Wizard group, all the edits you made to the text of the expression manually will be lost!
Test button
After you click it, the expression from the Regular expression text field is applied to the Text for processing. The outcome is shown in the Processing result.
Before the search text there should always be, After the search text there should always be
This text is searched for, but is not included in the result of the expression.
The search text always starts with, The search text always ends with
This text is included in the result of the expression.
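The difference between the two pairs of fields can be sketched with Python's re module. This is only an illustration, not the expression the wizard actually generates: it assumes the "before/after" fields behave like lookbehind and lookahead, while the "starts with/ends with" fields behave like literal text inside the match. The sample string and variable names are hypothetical.

```python
import re

text = 'price: [100] usd'

# "Before/after the search text": the brackets are required
# but excluded from the result (lookbehind and lookahead).
excluded = re.findall(r'(?<=\[).*?(?=\])', text)

# "Starts with/ends with": the brackets are part of the match itself.
included = re.findall(r'\[.*?\]', text)

print(excluded)  # ['100']
print(included)  # ['[100]']
```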
Enable line wrap
Enables or disables multi-line search.
Shortest match
When this option is enabled, the result is the shortest substring that matches the composed expression.
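The effect of this option can be illustrated in Python. The sketch assumes that the checkbox corresponds to a lazy quantifier (.*? instead of .*); the sample text is hypothetical.

```python
import re

text = '<b>one</b> and <b>two</b>'

# Greedy quantifier: the longest possible match, from the first <b> to the last </b>.
greedy = re.findall(r'<b>.*</b>', text)

# Lazy quantifier: the shortest possible match, so each tag pair is found separately.
lazy = re.findall(r'<b>.*?</b>', text)

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']
```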
Text for processing
In this field, enter the text in which the search will be performed.
Data in this field can be entered directly from the project variable: right-click => Set value from variable.
The ability to set a value from a variable was added in ZennoPoster 7.4.0.0
The drop-down list will display the variables of the currently active project.
Show special characters
Controls whether line breaks, tabs (and some other characters) are displayed as special characters.
Turned off
Turned on
Processing result
Matches tab
The result of applying the regular expression to the text will be displayed here.
Groups tab
This tab shows the results when the regular expression uses groups. An example of such expressions can be found in the description of the action Text processing => Regex => To Variable.
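As a rough idea of what matches and groups look like side by side, here is a small Python sketch. The sample text and pattern are hypothetical, and the Tester may present the groups differently.

```python
import re

text = 'alice:42, bob:7'

# Group 1 captures the name, group 2 captures the number;
# group 0 is the whole match.
matches = [
    (m.group(0), m.group(1), m.group(2))
    for m in re.finditer(r'(\w+):(\d+)', text)
]

print(matches)
# [('alice:42', 'alice', '42'), ('bob:7', 'bob', '7')]
```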
Usage example
Let's take a specific and common task as an example: link parsing. Suppose we have the HTML of some DIV or the entire DOM of the page, and we need to parse all links from this code and save them to a list.
Paste the source code in which we will search for links into the Text for processing field (you can quickly insert the code of the current active tab into the Tester using the Page Text Viewer window).
Specify the substring that usually comes before the link, namely the tag a href=".
Add a quote as the text that closes the link string. Do not forget the "Shortest match" checkbox, because we need to collect only the string up to the nearest closing quote.
Press the "Test" button and the required list of links will appear in the "Processing result" field (if there are matches). If something goes wrong, try changing your search terms.
We can copy the ready-made regular expression and apply it to our template. For example, in the action Text Processing -> Regex.
The regular expression finds every matching substring in the text. If you need a specific match by number, use ranges.
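The same idea can be tried outside the program. The Python sketch below assumes that the wizard settings described above amount to a lookbehind for href=", a lazy match, and a lookahead for the closing quote. The HTML fragment is hypothetical, and the list index only imitates picking a match by number.

```python
import re

html = (
    '<div><a href="https://example.com/a">A</a> '
    '<a href="https://example.com/b">B</a></div>'
)

# Text before the link: href="   Text after the link: "   Shortest match: on.
links = re.findall(r'(?<=href=").*?(?=")', html)
print(links)  # ['https://example.com/a', 'https://example.com/b']

# Taking one specific match by its number (0-based in Python).
second = links[1]
print(second)  # https://example.com/b
```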
Symbols with special meanings
Most characters in a regex represent themselves, with the exception of the special characters [ ] \ / ^ $ . | ? * + ( ) { }, which can be escaped with \ (backslash) to represent themselves as text characters. That is, the simplest regular expression can be written like this: abc, which will match the string abc.
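A short Python sketch of why escaping matters. The sample text is hypothetical; re.escape is a standard helper in Python's re module and is not part of the Tester.

```python
import re

text = 'Total: 3.50 (approx), not 3x50'

# Unescaped, the dot matches any character, so "3x50" is found as well.
unescaped = re.findall(r'3.50', text)
print(unescaped)  # ['3.50', '3x50']

# Escaped, the dot matches only a literal dot.
escaped = re.findall(r'3\.50', text)
print(escaped)  # ['3.50']

# re.escape() adds the backslashes automatically.
literal = re.escape('(approx)')
print(literal)  # \(approx\)
print(re.findall(literal, text))  # ['(approx)']
```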
Special character | Meaning |
---|---|
* | 0 or more repetitions |
. | Any single character except the newline character |
+ | 1 or more repetitions |
? | 0 or 1 repetition |
| | OR operator |
() | Grouping |
[] | Set of characters, any one of which may be present in the text |
[^] | Any character that is not included in the specified set |
- | Character range (used inside square brackets) |
^ | Start of line |
$ | End of line |
{} | Number of repetitions of the preceding character |
\ | Escaping of special characters |
\b | Word boundary |
\B | Not a word boundary |
\s | Whitespace character |
\S | Non-whitespace character |
\d | Digit |
\D | Non-digit character |
\w | Alphanumeric character, including _ |
\W | Any character other than a letter, a digit, or _ |
\r | Carriage return |
\n | Line feed |
\t | Tab character |
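Several of the characters from the table can be tried in Python as follows. The patterns and sample strings are hypothetical illustrations chosen for this sketch; they are not the examples from the original reference.

```python
import re

# (pattern, sample text) pairs; the comment shows what re.findall returns.
samples = [
    (r'ab*', 'a ab abb'),             # ['a', 'ab', 'abb']
    (r'ab+', 'a ab abb'),             # ['ab', 'abb']
    (r'colou?r', 'color colour'),     # ['color', 'colour']
    (r'cat|dog', 'cat, dog, cow'),    # ['cat', 'dog']
    (r'[bc]at', 'bat cat rat'),       # ['bat', 'cat']
    (r'[^bc]at', 'bat cat rat'),      # ['rat']
    (r'[a-c]\d', 'a1 b2 d4'),         # ['a1', 'b2']
    (r'a{2}', 'a aa aaa'),            # ['aa', 'aa']
    (r'\bcat\b', 'cat concat cats'),  # ['cat']
    (r'\d+', 'abc123 x45'),           # ['123', '45']
    (r'\w+', 'user_1, id-2'),         # ['user_1', 'id', '2']
]

results = {pattern: re.findall(pattern, text) for pattern, text in samples}
for pattern, found in results.items():
    print(pattern, '->', found)
```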
Modifiers
A modifier is in effect from the point where it appears until the end of the regular expression or until the opposite modifier.
Modifier | Description |
---|---|
(?i) | Turns on case-insensitive mode |
(?-i) | Turns off case-insensitive mode |
(?s) | Turns on the mode in which the dot matches line break characters |
(?-s) | Turns off the mode in which the dot matches line break characters |
(?m) | Multi-line search: ^ and $ match after and before line breaks |
(?-m) | ^ and $ match only the beginning and end of the text |
(?x) | Turns on the mode that ignores whitespace between parts of the regular expression and allows you to use # for comments |
(?-x) | Turns off the mode that ignores whitespace between parts of the regular expression |
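The modifiers can be tried in Python with one caveat: recent versions of Python's re module accept (?i), (?s), (?m), and (?x) only at the very start of a pattern. Limiting a modifier to part of an expression is therefore approximated here with the scoped form (?i:...), which is an assumption of this sketch and not the syntax described above. All sample strings are hypothetical.

```python
import re

# (?i): case-insensitive matching.
ci = re.findall(r'(?i)zenno', 'Zenno ZENNO zenno')
print(ci)  # ['Zenno', 'ZENNO', 'zenno']

# (?s): the dot also matches line breaks.
dot_default = re.findall(r'a.b', 'a\nb')
dot_all = re.findall(r'(?s)a.b', 'a\nb')
print(dot_default, dot_all)  # [] ['a\nb']

# (?m): ^ and $ match at every line break, not only at the ends of the text.
single = re.findall(r'^\w+$', 'one\ntwo')
multi = re.findall(r'(?m)^\w+$', 'one\ntwo')
print(single, multi)  # [] ['one', 'two']

# (?x): whitespace in the pattern is ignored and # starts a comment.
verbose_pattern = r'''(?x)
    \d{3}   # three digits
    -       # a literal hyphen
    \d{2}   # two digits
'''
verbose = re.findall(verbose_pattern, 'code 123-45')
print(verbose)  # ['123-45']

# Scoped form: only the part inside (?i:...) ignores case.
scoped = re.findall(r'(?i:id)=X', 'ID=X id=x')
print(scoped)  # ['ID=X']
```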
Lookahead and lookbehind
These constructs search for a piece of text while checking (but not including in the match) the surrounding text located before or after it. Negative lookarounds are used less often; they make sure that the specified text does not occur before or after the desired fragment.
Construct | Type |
---|---|
(?=pattern) | Positive lookahead |
(?!pattern) | Negative lookahead |
(?<=pattern) | Positive lookbehind |
(?<!pattern) | Negative lookbehind |
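The four constructs can be compared on a small sample in Python. The texts and patterns below are hypothetical and are not the examples from the original reference. The \b anchors in the negative cases keep the engine from settling for part of a number.

```python
import re

prices = '100 USD, 200 EUR, 300 USD'

# Positive lookahead: numbers that are followed by " USD".
pos_ahead = re.findall(r'\d+(?= USD)', prices)
print(pos_ahead)  # ['100', '300']

# Negative lookahead: whole numbers that are NOT followed by " USD".
neg_ahead = re.findall(r'\b\d+\b(?! USD)', prices)
print(neg_ahead)  # ['200']

fields = 'id=15, qty=3, id=27'

# Positive lookbehind: numbers that are preceded by "id=".
pos_behind = re.findall(r'(?<=id=)\d+', fields)
print(pos_behind)  # ['15', '27']

# Negative lookbehind: numbers that are NOT preceded by "id=".
neg_behind = re.findall(r'(?<!id=)\b\d+', fields)
print(neg_behind)  # ['3']
```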
Collection of regular expressions
Examples of useful regexps for quickly solving the most common problems.
E-mail address
(?i)[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}
Phone number
\+?(\d{1,3})?[- .]?\(?(?:\d{2,4})\)?[- .]?[\d-]{5,9}
IP address
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
URL
(https?:\/\/)?([\w\.]+)\.([a-z]{2,6}\.?)(\/[\w\.]*)*\/?
Extracting file name and extension from path
(?<=\\)[^\.\\]*(\.[^\.]+){1,}$
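A quick way to check such expressions outside the Tester is a short Python script. In this sketch the e-mail and phone patterns are written with the dot, the plus sign, and the parentheses escaped with a backslash, which Python's re needs in order to treat those characters literally. The sample strings and the dictionary names are hypothetical.

```python
import re

patterns = {
    'email': r'(?i)[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}',
    'phone': r'\+?(\d{1,3})?[- .]?\(?(?:\d{2,4})\)?[- .]?[\d-]{5,9}',
    'ip': r'(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}'
          r'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)',
    'url': r'(https?:\/\/)?([\w\.]+)\.([a-z]{2,6}\.?)(\/[\w\.]*)*\/?',
    'file': r'(?<=\\)[^\.\\]*(\.[^\.]+){1,}$',
}

samples = {
    'email': 'Write to support@example.com today',
    'phone': 'Call +7 (495) 123-45-67 now',
    'ip': 'Server 192.168.0.1 is up',
    'url': 'https://example.com/docs/index.html',
    'file': r'C:\Users\demo\report.final.txt',
}

# Take the first match of each pattern in its sample text.
found = {
    name: re.search(patterns[name], samples[name]).group(0)
    for name in patterns
}

for name, value in found.items():
    print(name, '->', value)
```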
If you do not know how to write a regular expression for your situation, ask our community for help on the forum in the Newbie Questions section or in the special topic Regular expressions for all occasions.