XPath
XPath is a flexible and powerful query language for selecting elements of an XML or (X)HTML document through its DOM. It is a W3C standard and is also used in XSLT transformations.
What is XPath used for in ZennoPoster?
For parsing sites (the Parse data action)
To find and interact with elements on a web page
Can be used in the Action Designer
With XPath you can implement a more versatile and robust data search than with regular expressions. The query language can significantly simplify parser logic and thereby speed up development.
Testing queries as they are composed
ZennoPoster has a built-in X\Json Path Tester with which you can test the composed expression.
You can also compose and test an XPath expression in the DevTools window: open DevTools, press Ctrl+F to bring up the search bar, and enter the XPath expression into it.
For example, to get the names of events on w3.org, we can use the following expression:
//*[@id="w3c_home_upcoming_events"]/ul/li//a
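The same expression can also be tried outside the browser. The sketch below uses Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath. The markup is a hypothetical, simplified stand-in for the w3.org page, not its real HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real w3.org page is larger and may differ.
PAGE = """
<html><body>
  <div id="w3c_home_upcoming_events">
    <ul>
      <li><p><a href="/events/1">Event One</a></p></li>
      <li><a href="/events/2">Event Two</a></li>
    </ul>
  </div>
  <div id="other">
    <ul><li><a href="/news/1">Not an event</a></li></ul>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# ElementTree paths are relative to an element, so the query starts with ".//"
# instead of "//". The rest is the expression from the text.
links = root.findall(".//*[@id='w3c_home_upcoming_events']/ul/li//a")
event_names = [a.text for a in links]
print(event_names)
```

Only the links inside the element with the matching `id` are returned, at any depth below each `li`.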
Basic syntax
Paths
Expression | Description |
---|---|
. | current context |
.// | recursive descent (zero or more levels from the current context) |
/html/body | absolute path |
a | relative path |
//* | all elements in the document |
li/*/a | links that are "grandchildren" for li |
//a|//button | links and buttons (union of two sets of nodes) |
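As a rough illustration of the path forms in the table, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It supports only a subset of XPath (no absolute paths on elements and no `|` operator), so the union is emulated in plain Python. The sample markup is made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; <html> is the context node for the queries below.
DOC = ET.fromstring("""
<html><body>
  <ul>
    <li><a href="#direct">direct child</a></li>
    <li><span><a href="#grand">grandchild</a></span></li>
  </ul>
  <button>Send</button>
</body></html>
""")

def texts(nodes):
    """Return the text of each element in a result list."""
    return [n.text for n in nodes]

# "body/ul/li": relative path, evaluated from the context node.
items = DOC.findall("body/ul/li")

# ".//a": recursive descent, every <a> at any depth below the context node.
all_links = texts(DOC.findall(".//a"))

# "li/*/a": only links that are grandchildren of an <li>.
ul = DOC.find("body/ul")
grandchild_links = texts(ul.findall("li/*/a"))

# ElementTree has no "|" operator, so //a|//button is emulated by
# concatenating two queries (the result is not sorted into document order).
links_and_buttons = texts(DOC.findall(".//a") + DOC.findall(".//button"))

print(len(items), all_links, grandchild_links, links_and_buttons)
```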
Relations
Expression | Description |
---|---|
a/i/parent::p | immediate parent <p> |
p/ancestor::* | all ancestors (parent, grandparent, and so on) |
p/following-sibling::* | all following siblings |
p/preceding-sibling::* | all preceding siblings |
p/following::* | all of the following elements except descendants |
p/preceding::* | all previous elements except ancestors |
p/descendant-or-self::* | context node and all its descendants |
p/ancestor-or-self::* | context node and all its ancestors |
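To make the axes more concrete, the sketch below emulates `ancestor::*`, `following-sibling::*` and `preceding-sibling::*` in plain Python. This is a stand-in technique, not an XPath engine: the standard `xml.etree.ElementTree` module does not implement these axes, so the helper functions (hypothetical names) walk a parent map instead. The sample markup is made up.

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<div><h1>Title</h1><p>one <i>x</i></p><p>two</p><span>end</span></div>"
)

# ElementTree elements do not store their parent, so build a child -> parent map.
PARENT = {child: parent for parent in DOC.iter() for child in parent}

def ancestors(node):
    """Emulates ancestor::* (nearest ancestor first)."""
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result

def following_siblings(node):
    """Emulates following-sibling::* for a non-root node."""
    siblings = list(PARENT[node])
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """Emulates preceding-sibling::* for a non-root node (document order)."""
    siblings = list(PARENT[node])
    return siblings[:siblings.index(node)]

first_p = DOC.find("p")
italic = first_p.find("i")

ancestor_tags = [n.tag for n in ancestors(italic)]
next_tags = [n.tag for n in following_siblings(first_p)]
prev_tags = [n.tag for n in preceding_siblings(first_p)]
print(ancestor_tags, next_tags, prev_tags)
```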
Getting nodes
Expression | Description |
---|---|
/div/text() | get text nodes |
/div/text()[1] | get the first text node |
Item position
Expression | Description |
---|---|
a[1] | first link |
a[last()] | last element |
a[2] | second link |
a[position() <= 3] | first 3 links |
ul[li[1]="OK"] | lists (<ul>) whose first <li> has the value "OK" |
tr[position() mod 2 = 1] | odd elements |
tr[position() mod 2 = 0] | even elements |
p/text()[2] | second text node |
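The positional predicates can be tried with a small sketch. Python's standard `xml.etree.ElementTree` understands `[1]`, `[2]` and `[last()]`, but not `position()` or `mod`, so those rows are emulated here with list slicing. The markup is invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample markup.
DOC = ET.fromstring("""
<div>
  <ul><li>OK</li><li>second</li></ul>
  <ul><li>no</li><li>OK</li></ul>
  <nav><a>1</a><a>2</a><a>3</a><a>4</a></nav>
</div>
""")
nav = DOC.find("nav")

# Supported directly: positions are 1-based, as in XPath.
first = nav.find("a[1]").text
second = nav.find("a[2]").text
last = nav.find("a[last()]").text

# Emulated with slicing: a[position() <= 3], and the "mod 2" filters
# (the table applies them to tr; links are used here for brevity).
links = nav.findall("a")
first_three = [a.text for a in links[:3]]
odd = [a.text for a in links[0::2]]    # position() mod 2 = 1
even = [a.text for a in links[1::2]]   # position() mod 2 = 0

# ul[li[1]="OK"]: nested predicates are not supported, so filter in Python.
ok_lists = [ul for ul in DOC.findall("ul") if ul.find("li[1]").text == "OK"]

print(first, second, last, first_three, odd, even, len(ok_lists))
```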
Attributes and Filters
Square brackets [] enclose a filter (predicate) that is applied to the selected elements.
Expression | Description |
---|---|
input[@type="text"] | <input> tag with type attribute equal to text |
input[@class='OK'] | <input> tag whose class attribute is OK |
p[not(@*)] | paragraphs without attributes |
*[@style] | all elements with style attribute |
a[. = "OK"] | links with the value "OK" |
a/@id | link identifiers |
a/@* | all link attributes |
a[@id and @rel] | links that have both id and rel attributes |
a[i or b] | links that contain an <i> or <b> element |
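A few of these filters can be sketched with Python's standard `xml.etree.ElementTree`. Its XPath subset covers `[@attr='value']`, `[@attr]`, `[.='text']` (Python 3.7+) and `[child]`. It has no `and`, `not()` or attribute steps such as `a/@id`, so those are emulated below. The form markup is made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample markup.
FORM = ET.fromstring("""
<form>
  <input type="text" class="OK" name="q"/>
  <input type="submit" style="color:red"/>
  <a id="home" rel="nofollow">OK</a>
  <a id="docs"><b>Docs</b></a>
  <p>plain</p>
  <p class="note">noted</p>
</form>
""")

text_inputs = FORM.findall("input[@type='text']")   # attribute equals a value
styled = FORM.findall("*[@style]")                  # any element with a style attribute
ok_links = FORM.findall("a[.='OK']")                # link whose text is "OK"
bold_links = FORM.findall("a[b]")                   # link containing a <b> child

# a/@id: ElementTree returns elements only, so attributes are read with get().
link_ids = [a.get("id") for a in FORM.findall("a")]

# a[@id and @rel]: "and" is not supported; chained predicates are equivalent here.
id_and_rel = FORM.findall("a[@id][@rel]")

# p[not(@*)]: not() is not supported; filter on the attribute dict instead.
bare_paragraphs = [p.text for p in FORM.findall("p") if not p.attrib]

print(link_ids, [a.get("id") for a in id_and_rel], bare_paragraphs)
```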
Functions
Basic XPath functions - http://www.w3.org/TR/xpath/#corelib
Function | Description | Example |
---|---|---|
name() | Returns the name of the element | [name()='a'] |
string(val) | Converts a value to a string, for example to get an attribute value | string(a[1]/@id) |
substring(val, start, length) | Returns length characters of val, starting at position start (counted from 1) | substring(@id, 1, 6) |
substring-before(val, to) | Returns the part of the string val before the first occurrence of the string to | substring-before('12-May-1998', '-') = '12' |
substring-after(val, from) | Returns the part of the string val after the first occurrence of the string from | substring-after('12-May-1998', '-') = 'May-1998' |
string-length() | Returns the number of characters in a string | [string-length(text()) > 5] |
count() | Returns the number of items | |
concat() | Takes two or more strings as input and returns their concatenation | |
normalize-space() | Strips leading and trailing whitespace and collapses inner whitespace runs to a single space (similar to Trim) | [normalize-space(text())='SEARCH'] |
starts-with() | Starts with | [starts-with(text(), 'SEARCH')] |
contains() | Contains | [contains(name(), 'SEARCH')] |
translate(val, from, to) | Replaces the characters of its first string argument, which are present in the second argument, with the corresponding characters of the third argument. | translate('bar', 'abc', 'ABC') = 'BAr' |
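The string functions are easy to misread, so here is a hedged Python sketch of what a few of them compute. These helpers are illustrative equivalents written for this page (the names are hypothetical), not part of any XPath library, and `substring` handles integer arguments only.

```python
import re

def substring(val: str, start: int, length: int) -> str:
    """substring(val, start, length): positions are 1-based."""
    return "".join(
        ch for pos, ch in enumerate(val, 1) if start <= pos < start + length
    )

def substring_before(val: str, marker: str) -> str:
    """Part of val before the first occurrence of marker, else ''."""
    if not marker:
        return ""
    head, sep, _ = val.partition(marker)
    return head if sep else ""

def substring_after(val: str, marker: str) -> str:
    """Part of val after the first occurrence of marker, else ''."""
    if not marker:
        return val
    _, sep, tail = val.partition(marker)
    return tail if sep else ""

def normalize_space(val: str) -> str:
    """Trim the ends and collapse runs of space, tab, CR, LF to one space."""
    return " ".join(re.findall(r"[^ \t\r\n]+", val))

def translate(val: str, src: str, dst: str) -> str:
    """Map each char of src to the char at the same index in dst;
    chars of src with no counterpart in dst are removed."""
    table = {}
    for i, ch in enumerate(src):
        if ord(ch) not in table:          # first occurrence in src wins
            table[ord(ch)] = dst[i] if i < len(dst) else None
    return val.translate(table)

print(substring("12345", 2, 3))
print(substring_before("12-May-1998", "-"), substring_after("12-May-1998", "-"))
print(normalize_space("  SEARCH \n here "))
print(translate("bar", "abc", "ABC"))
```

Note that the third argument of `substring` is a length, not an end position.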
Grouping
Expression | Description |
---|---|
(table/tbody/tr)[last()] | last <tr> row from all tables |
(//h1|//h2)[contains(text(), 'Text')] | a first or second level heading that contains "Text" |
a[//tr/@data-id=@data-id] | all links whose data-id attribute matches the same attribute for a table row |
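The effect of the parentheses is easiest to see side by side. In this sketch (Python's standard `xml.etree.ElementTree`, made-up markup), `tr[last()]` picks the last row of each table, while the grouped form `(table/tbody/tr)[last()]` picks one row overall. ElementTree has no parentheses, so the grouped form is emulated by indexing the combined result list.

```python
import xml.etree.ElementTree as ET

# Made-up markup with two tables of two rows each.
DOC = ET.fromstring("""
<div>
  <table><tbody><tr id="a1"/><tr id="a2"/></tbody></table>
  <table><tbody><tr id="b1"/><tr id="b2"/></tbody></table>
</div>
""")

# table/tbody/tr[last()]: the predicate applies per parent,
# so every table contributes its own last row.
last_per_table = [tr.get("id") for tr in DOC.findall("table/tbody/tr[last()]")]

# (table/tbody/tr)[last()]: the predicate applies to the whole grouped set.
# Emulated by collecting all rows first and taking the final one.
all_rows = DOC.findall("table/tbody/tr")
last_overall = all_rows[-1].get("id")

print(last_per_table, last_overall)
```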
Useful links