XPath

XPath is a flexible and powerful language for querying the elements of an XML or (X)HTML document, and it is also used in XSLT transformations. XPath is a W3C standard.

What is XPath in ZennoPoster for?

With XPath you can build a more versatile and robust data search algorithm than with regular expressions. The query language significantly simplifies parser logic and so speeds up parser development.

Testing queries as they are composed

ZennoPoster has a built-in X\Json Path Tester with which you can test the composed expression.

You can also compose and test an XPath expression in the DevTools window: open the DevTools window, press Ctrl+F to bring up the search bar, and enter the XPath expression into it.

For example, to get the names of events on w3.org, we can use the following expression:

//*[@id="w3c_home_upcoming_events"]/ul/li//a
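To try an expression like this outside ZennoPoster, here is a minimal Python sketch using the standard-library xml.etree.ElementTree module, which supports only a subset of XPath. The markup is a made-up stand-in for the page, not real w3.org content, and the path starts with .// because ElementTree evaluates paths relative to an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page markup (not real w3.org content).
PAGE = """
<html>
  <body>
    <div id="w3c_home_upcoming_events">
      <ul>
        <li><p><a href="/e/1">Event One</a></p></li>
        <li><p><a href="/e/2">Event Two</a></p></li>
      </ul>
    </div>
    <div id="other"><ul><li><a href="/x">Unrelated</a></li></ul></div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Same expression as above, written relative to the root element.
links = root.findall(".//*[@id='w3c_home_upcoming_events']/ul/li//a")
event_names = [a.text for a in links]
print(event_names)
```

Only links inside the element with the matching id are returned; the link in the other block is ignored.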

Basic syntax

Paths

Expression

Description

.

current context

.//

recursive descent (zero or more levels from the current context)

/html/body

absolute path

a

relative path

//*

all elements in the document

li/*/a

links that are "grandchildren" of li

//a|//button

links and buttons (union of two sets of nodes)
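The path forms in this table can be tried with a short Python sketch. It assumes the standard-library xml.etree.ElementTree module, which implements only part of XPath: it has no | operator, so the union is emulated here by filtering a document-order traversal. The sample markup is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample document.
DOC = """
<html>
  <body>
    <ul>
      <li><a>direct</a><p><a>nested</a></p></li>
    </ul>
    <button>Send</button>
  </body>
</html>
"""

root = ET.fromstring(DOC)  # root is the <html> element

# "." is the current context: here, the <html> element itself.
context = root.find(".")

# ".//a" is recursive descent: <a> elements at any depth below the context.
all_links = [a.text for a in root.findall(".//a")]

# "li/*/a" with <ul> as the context: <a> elements that are grandchildren of <li>.
ul = root.find("body/ul")
grandchild_links = [a.text for a in ul.findall("li/*/a")]

# Emulation of //a|//button: keep <a> and <button> elements in document order.
links_and_buttons = [e.tag for e in root.iter() if e.tag in ("a", "button")]

print(all_links, grandchild_links, links_and_buttons)
```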

Relations

Expression

Description

i/parent::p

immediate parent of <i>, if that parent is a <p>

p/ancestor::*

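all ancestors (parent, grandparent, and so on)

p/following-sibling::*

all following siblings

p/preceding-sibling::*

all preceding siblings

p/following::*

all elements after the context node in document order, except its descendants

p/preceding::*

all elements before the context node in document order, except its ancestors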

p/descendant-or-self::*

context node and all its descendants

p/ancestor-or-self::*

context node and all its ancestors
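To make the axes concrete, the sketch below reproduces a few of them in plain Python. It assumes the standard-library xml.etree.ElementTree module, which supports only the .. (parent) step and none of the named axes, so the other relations are emulated with a parent map. The document and the helper name ancestors are illustrative, not part of any library.

```python
import xml.etree.ElementTree as ET

# Invented sample document.
DOC = "<div><h1>t</h1><p><i>x</i></p><span>s</span><b>b</b></div>"
root = ET.fromstring(DOC)

# ElementTree elements do not store their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

p = root.find("p")
i = root.find("p/i")

# i/parent::* : ElementTree can express this step as "..".
parent_via_path = root.find("p/i/..")

def ancestors(node):
    """Emulate node/ancestor::* (nearest ancestor first)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found

# p/following-sibling::* and p/preceding-sibling::*
siblings = list(parent_of[p])
index = siblings.index(p)
following_siblings = [e.tag for e in siblings[index + 1:]]
preceding_siblings = [e.tag for e in siblings[:index]]

# p/descendant-or-self::*
descendant_or_self = [e.tag for e in p.iter()]

print([e.tag for e in ancestors(i)], following_siblings, preceding_siblings)
```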

Getting nodes

Expression

Description

/div/text()

get text nodes

/div/text()[1]

get the first text node
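As an illustration of what text() selects, here is a small Python sketch. It assumes the standard-library xml.etree.ElementTree module, which has no text() step: the text nodes that are direct children of an element are stored in its .text attribute and in the .tail attribute of each child. The helper text_nodes is a hypothetical name.

```python
import xml.etree.ElementTree as ET

div = ET.fromstring("<div>first<b>bold</b>second<i>it</i>third</div>")

def text_nodes(elem):
    """Direct child text nodes of elem in document order, like elem/text()."""
    nodes = [elem.text] + [child.tail for child in elem]
    return [t for t in nodes if t]

all_text = text_nodes(div)   # counterpart of div/text()
first_text = all_text[0]     # counterpart of div/text()[1]
print(all_text, first_text)
```

The text "bold" is not returned because it is a text node of <b>, not of <div>.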

Item position

Expression

Description

a[1]

first element

a[last()]

last element

a[2]

second link

a[position() <= 3]

first 3 links

ul[li[1]="OK"]

list (<ul>) whose first <li> item has the text "OK"

tr[position() mod 2 = 1]

odd elements

tr[position() mod 2 = 0]

even elements

p/text()[2]

second text node
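One detail worth seeing in practice is that a position predicate counts among the children of each parent, not across the whole result. A minimal Python sketch follows, assuming the standard-library xml.etree.ElementTree module (which supports [n] and [last()] but not position()) and an invented sample document.

```python
import xml.etree.ElementTree as ET

# Invented sample document: two paragraphs with a different number of links.
DOC = """
<div>
  <p><a>p1-first</a><a>p1-second</a><a>p1-last</a></p>
  <p><a>p2-only</a></p>
</div>
"""
root = ET.fromstring(DOC)

# a[1]: the first <a> inside each <p>.
first = [a.text for a in root.findall("p/a[1]")]

# a[last()]: the last <a> inside each <p>.
last = [a.text for a in root.findall("p/a[last()]")]

# a[2]: the second <a>; the second paragraph has none, so it contributes nothing.
second = [a.text for a in root.findall("p/a[2]")]

print(first, last, second)
```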

Attributes and Filters

Square brackets [] enclose a filter (predicate) that is applied to the selected items.

Expression

Description

input[@type="text"]

<input> tag with type attribute equal to text

input[@class='OK']

<input> tag whose class attribute is OK

p[not(@*)]

paragraphs without attributes

*[@style]

all elements with style attribute

a[. = "OK"]

links whose text is "OK"

a/@id

link identifiers

a/@*

all link attributes

  • a[@id and @rel]

  • a[@id][@rel]

links that have both an id and a rel attribute

a[i or b]

links that contain an <i> or <b> child element
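Several of these filters can be checked with a short Python sketch. It assumes the standard-library xml.etree.ElementTree module, whose XPath subset covers attribute tests, child tests, chained predicates and [.='text'] (Python 3.7 or later), but not and/or, not() or attribute steps such as a/@id. The form markup is invented.

```python
import xml.etree.ElementTree as ET

# Invented sample document.
DOC = """
<form>
  <input type="text" class="OK"/>
  <input type="checkbox"/>
  <p>plain</p>
  <p style="color:red">styled</p>
  <a id="x" rel="nofollow">OK</a>
  <a id="y"><i>italic</i></a>
  <a>bare</a>
</form>
"""
root = ET.fromstring(DOC)

# input[@type="text"]: attribute equal to a value.
text_inputs = root.findall("input[@type='text']")

# *[@style]: any element that has a style attribute.
styled = [e.tag for e in root.findall("*[@style]")]

# a[. = "OK"]: links whose complete text is "OK".
ok_links = [a.get("id") for a in root.findall("a[.='OK']")]

# a[@id][@rel]: chained predicates, both attributes must be present.
id_and_rel = [a.get("id") for a in root.findall("a[@id][@rel]")]

# a[i]: links with an <i> child element.
with_i = [a.get("id") for a in root.findall("a[i]")]

# a/@id is not supported here, so read the attribute from each matched element.
link_ids = [a.get("id") for a in root.findall("a[@id]")]

print(styled, ok_links, id_and_rel, with_i, link_ids)
```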

Functions

Core XPath functions: http://www.w3.org/TR/xpath/#corelib

Function

Description

Example

name()

Returns the name of the element

[name()='a']

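string(val)

Converts the argument to a string, for example to get an attribute value

string(a[1]/@id)

substring(val, from, length)

Returns the part of the string val that starts at position from (counting from 1) and is length characters long

substring(@id, 1, 6)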

substring-before(val, to)

Return the part of the string val before the string to

substring-before('12-May-1998', '-') = '12'

substring-after(val, from)

Return the part of the string val after the string from

substring-after('12-May-1998', '-') = 'May-1998'

string-length()

Returns the number of characters in a string

[string-length(text()) > 5]

count()

Returns the number of items

count(//a)

concat()

Takes two or more strings as input and returns the concatenation (string addition) of its arguments.

concat('a', 'b', 'c') = 'abc'

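normalize-space()

Similar to Trim: strips leading and trailing whitespace and also collapses each internal run of whitespace into a single space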

[normalize-space(text())='SEARCH']

starts-with()

Starts with

[starts-with(text(), 'SEARCH')]

contains()

Contains

[contains(name(), 'SEARCH')]

translate(val, from, to)

Replaces the characters of its first string argument, which are present in the second argument, with the corresponding characters of the third argument.

translate('bar', 'abc', 'ABC') = 'BAr'
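To show what the string functions return, the sketch below mirrors several of them with plain Python string operations. This is an illustration of the XPath 1.0 semantics, not an XPath engine: the helper names are hypothetical, and substring handles integer arguments only.

```python
import re

def substring(val: str, start: int, length: int) -> str:
    """Mimic substring(val, start, length); positions are counted from 1."""
    begin = max(start, 1) - 1
    end = max(start + length - 1, 0)
    return val[begin:end]

def substring_before(val: str, sep: str) -> str:
    """Mimic substring-before: text before the first sep, or '' if sep is absent."""
    idx = val.find(sep)
    return val[:idx] if idx >= 0 else ""

def substring_after(val: str, sep: str) -> str:
    """Mimic substring-after: text after the first sep, or '' if sep is absent."""
    idx = val.find(sep)
    return val[idx + len(sep):] if idx >= 0 else ""

def normalize_space(val: str) -> str:
    """Mimic normalize-space: trim and collapse runs of space, tab, CR, LF."""
    return " ".join(part for part in re.split("[ \t\r\n]+", val) if part)

def translate(val: str, src: str, dst: str) -> str:
    """Mimic translate: map src characters to dst; unmatched src characters are removed."""
    table = {}
    for pos, ch in enumerate(src):
        if ord(ch) not in table:  # the first occurrence in src wins
            table[ord(ch)] = dst[pos] if pos < len(dst) else None
    return val.translate(table)

print(substring_before("12-May-1998", "-"))
print(substring_after("12-May-1998", "-"))
print(translate("bar", "abc", "ABC"))
```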

Grouping

Expression

Description

(table/tbody/tr)[last()]

the last <tr> row among the rows of all tables (a single row in total, not one per table)

(//h1|//h2)[contains(text(), 'Text')]

a first or second level heading that contains "Text"

a[//tr/@data-id=@data-id]

all links whose data-id attribute matches the same attribute for a table row
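The effect of the parentheses can be seen by comparing the grouped and ungrouped forms. The Python sketch below assumes the standard-library xml.etree.ElementTree module, which does not support parenthesized expressions, so the grouped form is emulated by indexing the complete result list. The table markup is invented.

```python
import xml.etree.ElementTree as ET

# Invented sample document with two tables.
DOC = """
<div>
  <table><tbody><tr id="a1"/><tr id="a2"/></tbody></table>
  <table><tbody><tr id="b1"/><tr id="b2"/><tr id="b3"/></tbody></table>
</div>
"""
root = ET.fromstring(DOC)

# table/tbody/tr[last()]: the predicate is applied per parent, one row per table.
last_per_table = [tr.get("id") for tr in root.findall("table/tbody/tr[last()]")]

# (table/tbody/tr)[last()]: the predicate is applied to the whole grouped result.
all_rows = root.findall("table/tbody/tr")
last_overall = all_rows[-1].get("id")

print(last_per_table, last_overall)
```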
