Regular Expression Tester (ZD)

This article is copied from the ZennoPoster reference because these actions are identical in both programs. Original: Regular Expression Tester
This article may contain screenshots, internal reference links, and other material that belong to ZennoPoster.


Description

Regular expressions are a kind of filter for finding text strings that match the required conditions. The built-in regular expression tester lets you create rules quickly without going into all the subtleties of writing them by hand.


Where are regular expressions used?

  • Extracting information from site pages

  • Filtering data in lists and tables

  • Searching for emails and/or links to confirm registration

  • Searching for a specific fragment in the text

  • Searching for lines to delete in lists

  • And many other uses


How to quickly compose a regular expression in ZennoPoster?

To compose them, you can use the Regular Expression Tester helper. It can be found on the top panel of the program or in the menu Tools → Regexp tester.

Regular Expression Tester Window

Tabs

You can work on multiple regexes at the same time in different tabs. The regular expression text is used as the tab name.

History

All expressions that you tested using the Test button are saved here.

Regular expression text

This field contains the text of the regular expression. You can edit the text directly in this field.

When you change any field or checkbox in the Regular Expressions Wizard group, all the edits you made manually to the text of the expression will be lost!

Test button

After clicking, the expression from the Regular expression text field is applied to the Text for processing. The outcome is shown in the Processing result.

Before the search text there should always be, This comes after the search text

This text is searched for, but will not be included in the result of the expression.

The search text always starts with, After the search text there should always be

This text is included in the result of the work.
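
The difference between the two pairs of fields can be sketched with lookaround versus plain literal text. This is only an illustration in Python's re module; it assumes (not confirmed by this article) that the wizard builds lookbehind/lookahead constructs for the "before/after" fields, and the sample text is made up.

```python
import re

# Hypothetical sample text, used only for this illustration.
text = "<b>ZennoPoster</b> and <b>ZennoDroid</b>"

# Surrounding text is only checked (lookbehind / lookahead),
# so it is NOT part of the result.
excluded = re.findall(r"(?<=<b>).*?(?=</b>)", text)

# Surrounding text is part of the pattern itself,
# so it IS part of the result.
included = re.findall(r"<b>.*?</b>", text)

print(excluded)  # ['ZennoPoster', 'ZennoDroid']
print(included)  # ['<b>ZennoPoster</b>', '<b>ZennoDroid</b>']
```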

Enable line wrap

Enabling/disabling multi-line search.

Shortest match

When this option is enabled, the result is the shortest substring that matches the composed expression.
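
In regex terms, a shortest match usually corresponds to a lazy quantifier (.*? instead of .*). A minimal Python sketch of the difference, assuming the checkbox behaves like a lazy quantifier; the sample text is hypothetical.

```python
import re

# Hypothetical sample text with two quoted values.
text = 'href="first" and href="second"'

# Greedy: .* grabs as much as possible, up to the LAST quote.
greedy = re.findall(r'".*"', text)

# Lazy ("shortest match"): .*? stops at the NEAREST closing quote.
lazy = re.findall(r'".*?"', text)

print(greedy)  # ['"first" and href="second"']
print(lazy)    # ['"first"', '"second"']
```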

Text for processing

In this field, enter the text that the regular expression will be applied to.

Data in this field can be entered directly from the project variable: right-click => Set value from variable.

The ability to set a value from a variable was added in ZennoPoster 7.4.0.0

The drop-down list will display the variables of the currently active project.

Show special characters

Controls whether line breaks, tabs (and some other characters) are displayed as special characters.

Turned off

Turned on

Processing result

Matches tab

The result of applying the regular expression to the text will be displayed here.

Groups tab

This tab shows the results when the regular expression uses groups. An example of such expressions can be found in the description of the action Text processing => Regex => To Variable.
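
As a rough illustration of what groups return, here is a small Python sketch. The sample strings and patterns are made up for this example, and Python's re module is used only for demonstration; the tester's own engine may differ in details.

```python
import re

# Hypothetical sample text.
text = "login: alice, password: s3cret"

# Each pair of parentheses is a group that is captured separately.
match = re.search(r"login: (\w+), password: (\w+)", text)

print(match.group(0))  # whole match: 'login: alice, password: s3cret'
print(match.group(1))  # first group:  'alice'
print(match.group(2))  # second group: 's3cret'

# With several matches, findall returns one tuple of groups per match.
pairs = re.findall(r"(\w+)@(\w+\.\w+)", "bob@example.com, eve@test.org")
print(pairs)  # [('bob', 'example.com'), ('eve', 'test.org')]
```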


Usage example

Let's look at a specific and common task: link parsing. Say we have the HTML of some DIV or the entire DOM of the page, and we need to parse all links from this code and save them to a list.

  1. Insert the source code in which we will search for links into the Text for processing field (you can quickly insert the code of the current active tab into the Tester using the Page Text Viewer window).

  2. Specify the substring that usually comes before the link, namely the tag <a href=" .

  3. Add a quotation mark to close the link string. Do not forget the "Shortest match" checkbox, because we need to capture only the string between the opening quote and the nearest closing quote.

  4. Press the "Test" button, and the required list of links will appear in the "Processing result" field (if there are matches). If something goes wrong, try changing your search terms.

  5. Copy the finished regular expression and apply it in your template, for example in the action Text Processing -> Regex.

The regular expression finds every matching substring in the text. If you need a specific match by its number, use ranges.
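
The steps above can be sketched in Python. The HTML snippet and the exact pattern are assumptions made for this illustration (the wizard may generate a slightly different expression), and list slicing is only an analogy for the ranges setting of the action.

```python
import re

# Hypothetical HTML fragment with three links.
html = (
    '<div><a href="https://example.com/a">A</a> '
    '<a href="https://example.com/b">B</a> '
    '<a href="https://example.com/c">C</a></div>'
)

# Text before the link is checked but not returned (lookbehind),
# .*? is the "shortest match", and the closing quote is checked
# with a lookahead.
pattern = r'(?<=<a href=").*?(?=")'

links = re.findall(pattern, html)
print(links)
# ['https://example.com/a', 'https://example.com/b', 'https://example.com/c']

# Taking specific matches by number (similar in spirit to ranges).
second_link = links[1]
first_two = links[0:2]
print(second_link)  # 'https://example.com/b'
```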


Symbols with special meanings

Most characters in a regex represent themselves, with the exception of the special characters [ ] \ / ^ $ . | ? * + ( ) { }, which must be escaped with \ (backslash) to be treated as literal text. The simplest regular expression can therefore be written as abc, which matches the string abc.

Special character  Meaning                                          Example          Matches
*                  Repetitions 0 or more                            ab*c             ac, abc, abbc
.                  Any single character except a newline            a.c              aac, abc, acc
+                  Repetitions 1 or more                            ab+c             abc, abbc
?                  Repetitions 0 or 1                               ab?c             ac, abc
|                  OR operator                                      a|b|c            a, b, c
()                 Grouping                                         zennolab(com)+   zennolabcom, zennolabcomcom
[]                 Set: any one of the listed characters            zennoposter[57]  zennoposter5, zennoposter7
[^]                Negated set: any one character not in the list   [^0-9]           each of "a", "b", "c" and the space in "abc 123"
-                  Character range (used inside square brackets)    [3-7]            3, 4, 5, 6, 7
                                                                    [a-e]            a, b, c, d, e
^                  Start of line                                    ^a               the first "a" in "aaa aaa"
$                  End of line                                      a$               the last "a" in "aaa aaa"
{}                 Number of repetitions of the preceding element:  zen{2}oposter    zennoposter
                     {n} - exactly n times                          (abc){2,3}       abcabc, abcabcabc
                     {m,n} - from m to n times inclusive
                     {m,} - at least m times
                     {0,n} - no more than n times
\                  Escaping special characters                      a\.b\.c          a.b.c
\b                 Word boundary                                    a\b              the last "a" of each word in "aaa aaa"
                                                                    \ba              the first "a" of each word in "aaa aaa"
\B                 Not a word boundary                              \Ba\B            the middle "a" of each word in "aaa aaa"
\s                 Whitespace character                             aaa\s?bbb        aaa bbb, aaabbb
\S                 Non-whitespace character                         aaa\S+           "aaabc" in "aaabc cccc"
\d                 Digit                                            \d+              "123" in "abc 123 abc"
\D                 Non-digit character                              \D+              "abc " and " abc" in "abc 123 abc"
\w                 Alphanumeric character, including _              \w+              "abc" and "123" in "abc, 123"
\W                 Any character other than a letter, digit, or _   \W+              ", " in "abc, 123"
\r                 Carriage return
\n                 Line feed (newline)
\t                 Tab character
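
A few of the special characters above can be checked quickly in code. This is a small Python sketch with made-up sample strings; Python's re module is used only for illustration, and the tester's engine may differ in some details.

```python
import re

# Quantifier *: zero or more "b" between "a" and "c".
star = re.findall(r"ab*c", "abc abbc ac")

# Unescaped dot matches any character; escaped dot matches only ".".
any_char = re.findall(r"a.c", "aac abc acc a.c")
literal_dot = re.findall(r"a\.c", "aac abc acc a.c")

# Negated set: every character that is not a digit.
not_digits = re.findall(r"[^0-9]", "abc 123")

# Repetition count applied to a group.
repeated = re.findall(r"(?:abc){2,3}", "abcabc abcabcabc abc")

# Word boundaries: positions of the matched "a" in "aaa aaa".
word_start = [m.start() for m in re.finditer(r"\ba", "aaa aaa")]
word_end = [m.start() for m in re.finditer(r"a\b", "aaa aaa")]
inside = [m.start() for m in re.finditer(r"\Ba\B", "aaa aaa")]

print(star)         # ['abc', 'abbc', 'ac']
print(any_char)     # ['aac', 'abc', 'acc', 'a.c']
print(literal_dot)  # ['a.c']
print(not_digits)   # ['a', 'b', 'c', ' ']
print(repeated)     # ['abcabc', 'abcabcabc']
print(word_start, word_end, inside)  # [0, 4] [2, 6] [1, 5]
```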


Modifiers

Modifiers are in effect from the moment they occur until the end of the regular expression or the opposite modifier.

Modifier  Description
(?i)      Turns on case-insensitive matching
(?-i)     Turns off case-insensitive matching
(?s)      Turns on the mode in which the dot also matches line break characters
(?-s)     Turns off the mode in which the dot also matches line break characters
(?m)      Multi-line search: the ^ and $ characters match after and before newlines
(?-m)     The ^ and $ characters match only the beginning and end of the text
(?x)      Turns on the mode that ignores spaces between parts of the regular expression and allows # for comments
(?-x)     Turns off the mode that ignores spaces and allows # comments
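
The effect of each modifier can be sketched in Python with made-up sample strings. Note one difference that is specific to Python and not to the tester: Python's re module expects a global inline flag such as (?i) at the very start of the pattern and does not switch it off mid-pattern with (?-i); instead it offers scoped groups like (?i:...).

```python
import re

# (?i): case-insensitive matching.
case = re.findall(r"(?i)zenno", "Zenno ZENNO zenno")

# (?s): the dot also matches line breaks.
without_s = re.findall(r"<p>.*?</p>", "<p>one\ntwo</p>")
with_s = re.findall(r"(?s)<p>.*?</p>", "<p>one\ntwo</p>")

# (?m): ^ and $ match at the start and end of every line.
without_m = re.findall(r"^\w+", "first line\nsecond line")
with_m = re.findall(r"(?m)^\w+", "first line\nsecond line")

# (?x): whitespace in the pattern is ignored, # starts a comment.
verbose = re.compile(r"""(?x)
    \d{1,3}   # one to three digits
    -         # a literal hyphen
    \d{2}     # exactly two digits
""")
codes = verbose.findall("12-34 and 5-67")

# Python-specific scoped flag: only "zenno" is case-insensitive.
scoped = re.findall(r"(?i:zenno)Poster", "ZENNOPoster zennoposter")

print(case)               # ['Zenno', 'ZENNO', 'zenno']
print(without_s, with_s)  # [] ['<p>one\ntwo</p>']
print(without_m, with_m)  # ['first'] ['first', 'second']
print(codes)              # ['12-34', '5-67']
print(scoped)             # ['ZENNOPoster']
```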


Lookahead and lookbehind

A lookaround searches for a piece of text while checking the text located immediately before or after it, without including that surrounding text in the match. A negative lookaround is used less often and makes sure that the specified pattern, on the contrary, does not occur before or after the desired text fragment.

Representation  Lookaround type      Example             Matches
(?=pattern)     Positive lookahead   Louis (?=XVI)       "Louis " in Louis XVI and Louis XVIII
(?!pattern)     Negative lookahead   Louis (?!XVI)       "Louis " in Louis XV, Louis LXVII and Louis XXL
(?<=pattern)    Positive lookbehind  (?<=Sergey )Ivanov  "Ivanov" in Sergey Ivanov
(?<!pattern)    Negative lookbehind  (?<!Sergey )Ivanov  "Ivanov" in Igor Ivanov

The lookahead examples are applied to the text Louis XV, Louis XVI, Louis XVIII, Louis LXVII, Louis XXL; the lookbehind examples are applied to the text Sergey Ivanov, Igor Ivanov.
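
The four lookaround types can be tried on the same sample texts in Python. In this sketch \w+ is appended to the lookahead patterns only to make it visible which entries were matched; that addition is not part of the original examples.

```python
import re

kings = "Louis XV, Louis XVI, Louis XVIII, Louis LXVII, Louis XXL"
names = "Sergey Ivanov, Igor Ivanov"

# Positive lookahead: "Louis " only when followed by "XVI".
positive_ahead = re.findall(r"Louis (?=XVI)\w+", kings)

# Negative lookahead: "Louis " only when NOT followed by "XVI".
negative_ahead = re.findall(r"Louis (?!XVI)\w+", kings)

# Positive lookbehind: "Ivanov" only when preceded by "Sergey ".
positive_behind = [m.start() for m in re.finditer(r"(?<=Sergey )Ivanov", names)]

# Negative lookbehind: "Ivanov" only when NOT preceded by "Sergey ".
negative_behind = [m.start() for m in re.finditer(r"(?<!Sergey )Ivanov", names)]

print(positive_ahead)   # ['Louis XVI', 'Louis XVIII']
print(negative_ahead)   # ['Louis XV', 'Louis LXVII', 'Louis XXL']
print(positive_behind)  # [7]  -> the "Ivanov" after "Sergey "
print(negative_behind)  # [20] -> the "Ivanov" after "Igor "
```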

Collection of regular expressions

Examples of useful regexps for quickly solving the most common problems.

E-mail address

(?i)[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}

Phone number

\+?(\d{1,3})?[- .]?\(?(?:\d{2,4})\)?[- .]?[\d-]{5,9}

IP address

(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)

Url

(https?:\/\/)?([\w\.]+)\.([a-z]{2,6}\.?)(\/[\w\.]*)*\/?

Extracting file name and extension from path

(?<=\\)[^\.\\]*(\.[^\.]+){1,}$
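
Two of the expressions from this collection can be tried out as follows. This is a minimal Python sketch; the sample text and the Windows-style path are invented for the example, and the file name pattern is only checked here on a simple path without dots in folder names.

```python
import re

# IP address pattern from the collection above.
ip_pattern = (
    r"(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
)

# Hypothetical log line.
log = "Server 192.168.0.1 replied, 10.0.0.255 did not"
ips = re.findall(ip_pattern, log)
print(ips)  # ['192.168.0.1', '10.0.0.255']

# File name and extension pattern from the collection above.
file_pattern = r"(?<=\\)[^\.\\]*(\.[^\.]+){1,}$"

# Hypothetical Windows-style path.
path = r"C:\Users\me\report.final.txt"
file_match = re.search(file_pattern, path)
print(file_match.group(0))  # 'report.final.txt'
```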

If you do not know how to write a regular expression for your situation, ask our community for help on the forum in the Newbie Questions section or in the special topic Regular expressions for all occasions.