View the original article in Russian: Обработка текста

Description

This block performs the text manipulations that come up most often in practice. Processing parsed text, cleaning it of garbage, translating it into other languages: all this and much more can be done with the text processing "cube".

Where is text processing used?

How do I use the action?

The properties window consists mainly of three areas:

  1. The input string is text, a variable, or a combination of both.

  2. Actions on the string, properties and their settings.

  3. The output string (result) in a variable.

Place the cursor in the input string area, press Ctrl+Space, and select the constants and project variables you need from the drop-down list. For example, you can quickly insert the project proxy {-Project.Proxy-} or the URL of the active tab {-Page.Url-}.

All possible operations with this "cube":

Escape strings

Escapes special characters. This action replaces the characters * + ? | { [ ( ) ^ $ . # and the space character with their escaped forms. The technique is often used when building queries, and it makes the regular expression engine treat these characters literally rather than as commands or metacharacters.

Before: {"animal": "cat"}
After: \{"animal":\ "cat"}
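The escaping rule can be sketched in a few lines of Python. This is only an illustration, not ZennoPoster's implementation: the function name `escape_for_regex` is hypothetical, and adding the backslash itself to the escaped set is an assumption that goes beyond the list above.

```python
# Characters listed in the description, plus the backslash (an assumption).
SPECIAL_CHARS = set("\\*+?|{[()^$.# ")


def escape_for_regex(text):
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_for_regex('{"animal": "cat"}'))
```

Note that the closing brace is not in the list, so it is left untouched.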


Regex

Processes text with regular expressions. Regular expressions are convenient for parsing strings to find a substring that matches a given pattern. This action can capture not only the first match but also the entire group of matches, and save the values to variables, a list, or a table. Optionally, if nothing is found, the action ends with an error and execution continues along the red branch. In total, there are six ways to save the results of a regular expression:

  • only the first found value is saved;

  • all found matches are saved to the list;

  • a single value is saved: either the last match or a random one;

  • one or more values are saved to the list, selected by index (the ordinal position in the list of found values). Indexes can be listed with commas (4,5,9), given as a range with a hyphen (4-9), or both combined (4,5,9-11);

  • the same as the previous option, but without a list: the value at each index can be put into its own variable;

  • matches are saved to the table.
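The index notation from the list above (commas, hyphenated ranges, or both) can be illustrated with a short Python sketch. The helper names `parse_indexes` and `select_matches` are hypothetical, and zero-based numbering is an assumption; the page does not say where counting starts.

```python
import re


def parse_indexes(spec):
    """Turn a spec such as "4,5, 9-11" into [4, 5, 9, 10, 11]."""
    indexes = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-"))
            indexes.extend(range(start, end + 1))
        elif part:
            indexes.append(int(part))
    return indexes


def select_matches(pattern, text, spec):
    """Return the matches whose positions (zero-based, assumed) are in spec."""
    found = re.findall(pattern, text)
    return [found[i] for i in parse_indexes(spec) if i < len(found)]


print(select_matches(r"\d+", "a0 b1 c2 d3 e4 f5", "1,3-4"))
```

Indexes beyond the number of matches are silently skipped here; the real action may treat that case differently.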

To build regular expression patterns, ZennoPoster provides a convenient tool: the Regular Expression Constructor.

Let's look at a concrete example of parsing links with a regular expression composed in this constructor.

Suppose the task is to collect links to the profiles of active users of the ZennoLab forum. The steps are:

  1. With the “Getting the value” action, get the HTML code of the element that contains the links to the users who are online on the forum.

  2. Add the “Regex” action. Use the Regular Expression Constructor to compose the pattern for the action properties.

  3. In the action properties, add the “html” variable to the input and save the result to the “urls” list.

  4. After the action runs, the list contains unique ids, which can be used to generate the URLs of the user profiles.
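The same flow can be sketched outside ZennoPoster in Python. Everything specific here is hypothetical: the HTML fragment, the `/members/name.id/` link shape, the pattern, and the `example.com` host are invented for illustration and do not describe the real forum markup.

```python
import re

# Hypothetical HTML, standing in for the content of the "html" variable.
html = (
    '<a href="/members/alice.101/">alice</a>, '
    '<a href="/members/bob.202/">bob</a>, '
    '<a href="/members/alice.101/">alice</a>'
)

# Hypothetical pattern: capture the numeric id at the end of a profile link.
PROFILE_ID = re.compile(r'/members/[^/"]+\.(\d+)/')


def unique_ids(source):
    """Collect ids in order of appearance, dropping duplicates."""
    seen = []
    for user_id in PROFILE_ID.findall(source):
        if user_id not in seen:
            seen.append(user_id)
    return seen


# Build profile URLs from the ids (the host is a placeholder).
urls = [f"https://example.com/members/{uid}/" for uid in unique_ids(html)]
print(urls)
```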


Spintax

Randomizes text to make it unique. Spintax is a convenient way to vary texts with synonyms. It is a construction of curly braces and vertical bars that randomly substitutes one of several substrings into a string. In its simplest form, spintax looks like this: {variant1|variant2|variant3}. When the action runs, one of the three options is chosen at random and placed in the resulting variable. Spintax constructions can also be more complex and nested to several levels, so thousands of different variants can be produced from one text.

Spintax in ZennoPoster also supports an extended syntax:

  • {Red|White|Blue} - one of the values is included in the resulting text, for example: "White"

  • [Red|White|Blue] - the resulting text contains a permutation of the values, for example: "White Blue Red"

  • [+_+Red|White|Blue] - the resulting text contains a permutation of the values with a separator inserted between them, for example: "White_Red_Blue"

Nesting of templates is unlimited (for example: [+{_|-}+Red|White|Blue {1|2}] = "White-Blue 2-Red"). Special characters can be escaped: [+\++Red|\[White\]|Blue] - result "[White]+Red+Blue"
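A minimal Python sketch of the basic curly-brace form, including nesting, is shown below. It is not ZennoPoster's engine: the function name `spin` is hypothetical, and the square-bracket permutations and escaping described above are deliberately left out.

```python
import random
import re

# Matches a brace group that contains no other braces, i.e. an innermost one.
INNERMOST = re.compile(r"\{([^{}]*)\}")


def spin(text, rng=random):
    """Resolve {a|b|c} groups from the inside out until none remain."""
    while True:
        text, count = INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split("|")), text
        )
        if count == 0:
            return text


print(spin("The flag is {Red|White|Blue {1|2}}"))
```

Resolving the innermost groups first means a nested group is already plain text by the time its parent is split on the vertical bars.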


Split

Splits text by any separator character (delimiter). This processing turns the string into an array of strings. In effect, it is a simpler alternative to regular expressions for dividing a string by given characters.

Let's look at how split works using a very common task: getting a login and password from a string. Account credentials are usually stored as line-by-line lists in the format login:password, so here the delimiter is the colon character.

Insert the string, or a variable containing it, into the input field. In the properties, specify the separator :, and below it assign a separate variable to each element of the resulting array of substrings. After the line is processed, one variable holds the login and the other the password.

Another frequent use is parsing a proxy string with password authorization. A proxy string in ZennoPoster has the format login:password@host:port. To get the proxy IP, for example, you have to use the "cube" twice. First, split the proxy string into two substrings with the @ separator and put the second one into a temporary variable. Then split that substring with the : separator and take the first part, which is the proxy IP.
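Both splits described above can be sketched in Python. The function names and sample values are hypothetical; the logic follows the two-step procedure from the text and assumes the login and password contain neither ":" nor "@".

```python
def split_credentials(line):
    """Split "login:password" on the first colon."""
    login, password = line.split(":", 1)
    return login, password


def proxy_host(proxy):
    """Extract the host from "login:password@host:port" in two splits."""
    # Step 1: split on "@" and keep the second part ("host:port").
    _, host_port = proxy.split("@", 1)
    # Step 2: split on ":" and keep the first part (the host or IP).
    host, _port = host_port.split(":", 1)
    return host


print(split_credentials("alice:s3cret"))
print(proxy_host("user:pass@192.0.2.10:8080"))
```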


ToChar

Converts an integer value to a Unicode character.
Each Unicode character has its own numeric code, and this action converts a numeric value to the corresponding character. For example, the symbol ♛ has the numeric code 9819.
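For comparison, the same conversion in Python uses the built-in `chr`, with `ord` as its inverse. This only illustrates the code-point mapping, not the action itself.

```python
code = 9819
symbol = chr(code)        # code point -> character
print(symbol, hex(code))  # hex form of the same code point
print(ord(symbol))        # character -> code point
```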


ToLower

Changes letters to lowercase according to the selected property: all letters, only the first letter of the string, or the first letter of each word.

For example, suppose we are VERY ANNOYED BY TEXT WRITTEN IN CAPS LOCK. We pass the text through this macro and get normal spelling.


ToUpper

The reverse of the previous action. Changes letters to uppercase: all letters, only the first letter of the string, or the first letter of each word.

For example, if we need All Words in the Heading to Start With Capital Letters (a common convention on foreign sites), we use this “cube” in the “First letter in every word” mode.
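The three modes shared by ToLower and ToUpper can be sketched as one Python function. The function name and the mode names (`all`, `first`, `each_word`) are hypothetical labels for the options described above, and a "word" is assumed to be any run of non-space characters.

```python
import re


def change_case(text, mode="all", upper=True):
    """Change case of all letters, the first letter, or each word's first letter."""
    convert = str.upper if upper else str.lower
    if mode == "all":
        return convert(text)
    if mode == "first":
        return convert(text[:1]) + text[1:]
    if mode == "each_word":
        # Convert the first character that follows whitespace or the start.
        return re.sub(r"(?<!\S)\S", lambda m: convert(m.group()), text)
    raise ValueError(f"unknown mode: {mode}")


print(change_case("all words in the heading", "each_word"))
print(change_case("VERY ANNOYED", "all", upper=False))
```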


Trim

Trims extra characters from a string. It is mainly used to clean a string of the extra spaces, line breaks, and tabs that so often remain after parsing. You can trim from the beginning of the string, from the end, or from both sides at once. You can also trim your own characters; a frequent task, for example, is removing dots from titles.

Was: some text (spaces at the beginning and at the end)
Now: some text
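Python's `strip`, `lstrip`, and `rstrip` behave in a comparable way and can illustrate the options; the sample strings are made up.

```python
raw = "  \t some text \n"

both = raw.strip()             # trim whitespace on both sides
left = raw.lstrip()            # trim only the beginning
right = raw.rstrip()           # trim only the end
title = "Title...".rstrip(".")  # trim custom characters: trailing dots

print(repr(both), repr(left), repr(right), repr(title))
```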


UrlDecode

Decodes a URL-encoded string.

If characters such as spaces and punctuation marks are passed in an HTTP request as they are, the receiving side may misinterpret them. They are therefore sent in URL-encoded form, and this macro turns such a string back into readable text.

The effect is most obvious when decoding Cyrillic text:
Was: %D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%2C%20%D0%BC%D0%B8%D1%80%21
It became: Привет, мир! ("Hello, world!")
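The decoding can be reproduced with Python's standard library, shown here only as an illustration of percent-decoding with UTF-8.

```python
from urllib.parse import unquote

encoded = "%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%2C%20%D0%BC%D0%B8%D1%80%21"
decoded = unquote(encoded)  # percent-decoding, UTF-8 by default
print(decoded)
```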


UrlEncode

The opposite of the previous action: URL-encodes a string. Often used for HTTP requests.
It was: https://zennolab.com/
Now: https%3a%2f%2fzennolab.com%2f
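A comparable encoding in Python is sketched below. Python's `quote` emits uppercase hex digits, while the example above shows lowercase; both forms are equivalent percent-encodings. Passing `safe=""` is needed so that "/" is encoded too.

```python
from urllib.parse import quote

url = "https://zennolab.com/"
encoded_url = quote(url, safe="")  # safe="" also encodes "/" and ":"
print(encoded_url)
print(encoded_url.lower())         # lowercase hex, as in the example above
```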


Into a variable

This action simply saves everything you add to the input window (variables, text, symbols, project constants) into a separate variable.


To the list

This action splits the text into lines using the delimiter specified in the properties and writes them to the list.


To the table

Much the same as the previous action, but the data is saved to a table. In this case you need to specify separators not only for rows but also for columns. For example, for tables in Excel format, select your own character as the column separator and enter {-String.Tab-} in the field next to it.
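The difference between the two actions can be sketched in Python: one separator yields a list of rows, and a second separator yields a table of cells. The function names are hypothetical, and the tab column separator mirrors the Excel example above.

```python
def to_list(text, row_sep="\n"):
    """Split text into a list of rows."""
    return text.split(row_sep)


def to_table(text, row_sep="\n", col_sep="\t"):
    """Split text into rows, then each row into columns."""
    return [row.split(col_sep) for row in text.split(row_sep)]


data = "login\tpassword\nalice\ts3cret"
print(to_list(data))
print(to_table(data))
```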


Replacement

This action searches the string for a substring, replaces it with another, and saves the result to a variable. It sounds simple, but support for regular expressions greatly expands what the action can do. As in other actions, you can replace not only the first match but all matches, or only those at specified indexes (separated by commas or given as a range with a hyphen).
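Replacing only selected matches can be sketched with `re.sub` and a counting callback. The function name `replace_matches` is hypothetical, and zero-based match indexes are an assumption.

```python
import re


def replace_matches(text, pattern, replacement, indexes=None):
    """Replace all matches, or only those whose position is in indexes."""
    position = -1

    def substitute(match):
        nonlocal position
        position += 1
        if indexes is None or position in indexes:
            return replacement
        return match.group(0)  # leave this match unchanged

    return re.sub(pattern, substitute, text)


print(replace_matches("a1b22c333", r"\d+", "#"))
print(replace_matches("a1b22c333", r"\d+", "#", indexes={0, 2}))
```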


Translation

Translates strings from one language to another.

The translation action offers a large selection of translation services, so you can approach making texts unique flexibly and pick the service that produces the best result.

The following APIs are available:

In addition to choosing the API that performs the translation, you must specify the source language and the target language. Some examples: English - en, Spanish - es, German - de, Russian - ru (full list).
You can specify the language “auto”, and the system will then try to detect the source language itself, but the result is not guaranteed.

Additional parameters can significantly expand the capabilities of this "cube", but each API has its own set. For example, passing an API key can make the translator work more reliably.

API keys for services can be added in ZennoPoster settings.

Preparing JavaScript

Processes a string for correct use in JavaScript by escaping quotes, apostrophes, and other special characters. The macro prepares text so that it can be inserted as a string in a JavaScript or IF action. ProjectMaker has a JavaScript tester where you can check your code.

It was: <a href="https://zennolab.com/">
Now: <a href=\"https://zennolab.com/\">
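The kind of escaping involved can be sketched in Python. The exact set of characters the macro handles is not listed on this page, so the set below (backslash, both quote types, line breaks) is an assumption, and `prepare_for_js` is a hypothetical name.

```python
def prepare_for_js(text):
    """Escape characters that would break a JavaScript string literal."""
    replacements = [
        ("\\", "\\\\"),  # backslash first, so later escapes are not doubled
        ('"', '\\"'),
        ("'", "\\'"),
        ("\n", "\\n"),
        ("\r", "\\r"),
    ]
    for old, new in replacements:
        text = text.replace(old, new)
    return text


print(prepare_for_js('<a href="https://zennolab.com/">'))
```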


Substring

Takes a piece of text from a string, defined in the action properties by two indices: from one character to another. For example, given the string “from one character to another”, taking the substring that starts at the word “to” and runs to the end of the text gives “to another”.
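In Python the same operation is a slice. The sketch assumes zero-based indexes with an exclusive end, which may differ from how the action counts characters; the function name is hypothetical.

```python
def substring(text, start, end=None):
    """Return text from start up to (not including) end, or to the end of text."""
    return text[start:end]


sample = "from one character to another"
print(substring(sample, 19))    # from index 19 to the end
print(substring(sample, 5, 8))  # a piece between two indexes
```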


Transliteration

Sometimes Cyrillic text needs to be converted to Latin characters. This action does exactly that.
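A character-by-character transliteration can be sketched in Python as follows. The mapping table is one common Russian-to-Latin scheme chosen for illustration; it is an assumption and may not match the table the action uses.

```python
# One common Russian-to-Latin scheme (an assumption, not ZennoPoster's table).
TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def transliterate(text):
    """Replace Cyrillic letters with Latin ones, keeping other characters."""
    result = []
    for ch in text:
        lower = ch.lower()
        if lower in TABLE:
            latin = TABLE[lower]
            # Preserve capitalization of the original letter.
            result.append(latin.capitalize() if ch != lower else latin)
        else:
            result.append(ch)
    return "".join(result)


print(transliterate("Привет, мир!"))
```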
