Parse the page (Collect data from the page)


Description

The "Parse data" action extracts the information you need from a data source. A variety of conditions can be applied to the search, which makes it more flexible and therefore more accurate.

How to add an action to a project?

Through the context menu: Add Action → Tabs → Parse the Page

Or use smart search.


Usage example

A typical task that illustrates how the "Collect data from a page" action works:

- Collect all visible links on the current page that point to the current domain.

As a result, the internal links of the current page are placed in the project list. Applying the Remove duplicates function then leaves only unique links in the list.
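
The same task can be sketched outside ZennoPoster to show the logic involved. This is a minimal illustration using only the Python standard library, not the way the action is implemented: the class `LinkCollector`, the function `internal_links`, the sample HTML and the address `https://example.com/index.html` are all hypothetical. The sketch works on raw HTML, so it cannot tell whether a link is actually visible on the rendered page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, page_url):
    """Return unique links to the page's own domain, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    domain = urlparse(page_url).netloc
    seen = set()
    result = []
    for href in collector.hrefs:
        # Resolve relative links against the page address.
        absolute = urljoin(page_url, href)
        # Keep only same-domain links and drop duplicates.
        if urlparse(absolute).netloc == domain and absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Hypothetical page content and address, for illustration only.
SAMPLE_HTML = """
<a href="/about">About</a>
<a href="https://example.com/about">About again</a>
<a href="https://other.example.org/">External</a>
<a href="contact.html">Contact</a>
"""

links = internal_links(SAMPLE_HTML, "https://example.com/index.html")
print(links)
```

Here the duplicate check plays the role of the Remove duplicates function, and the domain comparison keeps only internal links.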


Detailed overview of the "Parse data" action properties window

Double-clicking the "Parse data" action in the project workspace with the left mouse button opens the "Action Properties" window, which is logically divided into 2 parts. Every data collection starts from a data source, from which the data for the subsequent collection of information is obtained.

Main data sources

  • Variable

  • Active tab (Current page of ZennoPoster browser)

Variable

After selecting the data source "Variable", the following list of parameters will appear:

  • Variable name - the project variable that contains the HTML code.

  • Selector type - the query language: XPath or CSS Selector.

  • Selector - a path, written in the chosen query language (XPath or CSS Selector), that identifies which element or elements of the web page to address.

  • Attribute - the attribute of the HTML tag whose value is to be collected.

  • Filter result - a checkbox. When it is checked, a condition can be applied to each object: Contains, Does not contain, or Regex (regular expression).

  • Range - a condition that selects a subset of the collected array of objects.

  • Save result - where to place the result once data collection is finished: a variable or a list.
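
How these parameters combine can be sketched as a small pipeline: select elements, read an attribute, filter, take a range, save to a list. The sketch below is an assumption-laden illustration, not ZennoPoster's implementation. The function `parse_data` and its argument names are hypothetical, the range is modelled as a Python `slice` (the action's own range syntax is not described here), and the standard-library `xml.etree.ElementTree` module is used, which supports only a subset of XPath and requires well-formed markup.

```python
import re
import xml.etree.ElementTree as ET


def parse_data(html, selector, attribute,
               contains=None, regex=None, item_range=None):
    """Hypothetical model of the parameters: selector -> attribute -> filter -> range."""
    # "Variable name": the HTML code held in a variable.
    root = ET.fromstring(html)

    # "Selector": an XPath expression (ElementTree's limited subset).
    elements = root.findall(selector)

    # "Attribute": read the requested attribute from each matched element.
    values = [el.get(attribute) for el in elements
              if el.get(attribute) is not None]

    # "Filter result": optional Contains / Regex conditions.
    if contains is not None:
        values = [v for v in values if contains in v]
    if regex is not None:
        pattern = re.compile(regex)
        values = [v for v in values if pattern.search(v)]

    # "Range": modelled here as a Python slice (an assumption).
    if item_range is not None:
        values = values[item_range]

    # "Save result": the caller stores the returned list.
    return values


# Hypothetical, well-formed sample markup.
PAGE = """<html><body>
<div class="item"><a href="/p/1">One</a></div>
<div class="item"><a href="/p/2">Two</a></div>
<div class="item"><a href="/blog/3">Three</a></div>
<div class="ad"><a href="/p/4">Ad</a></div>
</body></html>"""

result_list = parse_data(
    PAGE,
    selector=".//div[@class='item']/a",
    attribute="href",
    regex=r"^/p/\d+$",
    item_range=slice(0, 1),
)
print(result_list)
```

The order of the steps matters: the range is applied after the filter in this sketch, so it selects from the filtered values rather than from all matched elements. Whether the action applies them in the same order is not stated in this description.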

Active tab (current page)

After selecting the data source "Active tab", the same parameters appear as for the data source "Variable", with the following differences:

  • Data type - the source from which the object(s) to work with are taken: DOM or Html.

  • Visible items only - collect only those objects that are within the visible area of the current page.

  • Search in all frames - also search inside frames. A frame is an independent, embedded HTML document, which may or may not contain the necessary data.


Fast way to collect data

A quicker way to configure data collection is available in the context menu of the "Element tree" panel → item "Parse data". In the window that opens, you can set the search parameters and start collecting information immediately, in a few clicks and without detailed knowledge of the XPath or CSS Selector query languages.