Viewing page text
Description
With this tool, you can easily view the source code (Source), the DOM, and the displayed text of the page loaded in the browser.
The difference between the DOM and the page source code is described in the linked article, in its Description block.
If you are working with the Chrome engine, you can alternatively use the web developer tools.
What is it used for?
This tool is used when you need to better understand the structure of the page:
Searching for data in order to "snap" to an element to click/get/set a value
Parsing data that cannot be reached using the action designer
Quickly copying the source code, DOM, or text into the Regular Expression Tester
How to open the window?
The button that opens this window is located to the right of the browser's address bar.
How to work with the window?
When you click the icon, a window opens:
Content selection
Here you choose what to view: the DOM (default), the source code, or the visible text of the page (see the difference between Source and DOM).
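To make the difference between the source code view and the text view more concrete, here is a minimal Python sketch. It is only an approximation built on the standard html.parser module, not the way the tool itself produces the text view, and the SAMPLE_SOURCE markup is a made-up example. It shows that tags without text content, such as <meta>, are present in the source but absent from the visible text.

```python
from html.parser import HTMLParser

# Hypothetical page source, used only for illustration.
SAMPLE_SOURCE = (
    '<html><head><meta property="og:type" content="article">'
    '<style>p { color: red; }</style></head>'
    '<body><p>Hello, <b>forum</b>!</p></body></html>'
)


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped tag.
        if self._skip_depth == 0:
            self.parts.append(data)


def visible_text(source):
    """Return a rough approximation of the text a user would see."""
    parser = VisibleTextParser()
    parser.feed(source)
    parser.close()
    # Join the pieces and collapse runs of whitespace.
    return " ".join("".join(parser.parts).split())


print(visible_text(SAMPLE_SOURCE))
```

The <meta> tag and the style rules never reach the output, which is why data of this kind has to be looked up in the DOM or the source code rather than in the text view.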
Wrap by words
When this option is enabled, a line that is too long wraps onto the next line instead of being hidden beyond the window border. As an example, here is a screenshot of the same window with this option enabled:
Copy to regex tester
Clicking this button opens the regular expression tester and automatically copies the contents of the window into it.
Usage example
Let's say you need to parse <meta> tags with a property attribute from a topic page on the ZennoLab forum. You can't reach them through the action designer because these tags are not displayed on the page in any way. Our actions:
Go to the required page
Open the code view window (in this case either the DOM or the source code will do; the choice does not affect the final result) and look at the tags we need (there are several of them, but only one is shown here):
All tags have the same structure: they always start with <meta property= and end with >. The name of the property follows immediately after property, in quotes, and the content attribute holds the content.
Copy the content into the regular expression tester using the button of the same name. Based on the analysis from the previous step, create a regular expression:
(?<=<meta\ property=)"([a-z:]+)"\s+content="(.*?)"(?=>)
Using the Text processing action and its Regex option, we extract the values we need from the page code and save them to a table:
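The extraction step above can be sketched outside the program as well. The following Python snippet is only an illustration: it applies the same regular expression with Python's re module, which may differ in details from the regex engine used by the Text processing action, and SAMPLE_DOM is hypothetical page code, not the real forum markup.

```python
import re

# The pattern from the article, unchanged.
META_PATTERN = re.compile(
    r'(?<=<meta\ property=)"([a-z:]+)"\s+content="(.*?)"(?=>)'
)

# Hypothetical page code; the real forum page contains different values.
SAMPLE_DOM = """<head>
<meta property="og:title" content="Sample topic">
<meta property="og:type" content="article">
<meta name="viewport" content="width=device-width">
</head>"""


def extract_meta_rows(page_code):
    """Return one (property, content) row per matching <meta> tag."""
    # With two bracket groups, findall yields a tuple per match.
    return META_PATTERN.findall(page_code)


rows = extract_meta_rows(SAMPLE_DOM)
for prop, content in rows:
    print(prop, content, sep="\t")
```

Each tuple corresponds to one table row with two columns. The tag that uses name instead of property is skipped, because the lookbehind requires the text <meta property= immediately before the opening quote.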
Brief notes on the screenshot:
{-Page.Dom-} stores the DOM of the tab. For the source code, use {-Page.Source-}; for the text, use {-Page.Text-}. Other variables are listed in the variables window.
Why was column zero excluded? The regular expression uses bracket groups: (?<=<meta\ property=)"([a-z:]+)"\s+content="(.*?)"(?=>) contains two of them, ([a-z:]+) and (.*?). If you test it in the regular expression tester and open the Groups tab, you will see three groups even though we defined only two: the first group contains the full match text, and the groups we defined follow it. Because numbering starts from zero, the column to exclude is number 0, not 1.
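The group numbering can be checked with a small Python sketch. Python's re module numbers groups the same way (group 0 is the full match), but this is only an illustration and not the program's own regex engine; SAMPLE_TAG is a hypothetical tag.

```python
import re

# The pattern from the article, unchanged.
META_PATTERN = re.compile(
    r'(?<=<meta\ property=)"([a-z:]+)"\s+content="(.*?)"(?=>)'
)

# Hypothetical tag used only for illustration.
SAMPLE_TAG = '<meta property="og:type" content="article">'

match = META_PATTERN.search(SAMPLE_TAG)

# Group 0 is the full match; groups 1 and 2 are the two bracket groups.
all_columns = [match.group(i) for i in range(match.re.groups + 1)]

# Dropping column 0 leaves only the property name and its content.
table_row = all_columns[1:]

for index, value in enumerate(all_columns):
    print(index, value)
```

Column 0 holds the whole matched fragment, quotes and attribute name included, while the lookbehind and lookahead parts are not part of it. Excluding that column keeps only the two values that were actually requested.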