XPath Selectors - Changedetection.io

XPath (XML Path Language) is a powerful query language for selecting nodes from XML and HTML documents. It offers more advanced capabilities than CSS selectors, including text node extraction, parent selection, and complex conditional logic.

How XPath Works

XPath uses path expressions to navigate through the hierarchical structure of XML/HTML documents. You can select elements, attributes, text nodes, and more using a syntax similar to file system paths.

Key Benefits:

Extract text nodes directly (without HTML tags)
Navigate to parent elements
Use advanced conditional logic
Perfect for RSS/XML feeds
Support for regex matching

XPath Versions in changedetection.io

changedetection.io supports two XPath implementations:

XPath 2.0/3.0 (Default)
XPath 1.0

XPath 2.0/3.0 (Default)

Prefix: xpath: (or no prefix for // syntax)
Engine: elementpath library
Features:

XPath 2.0 and 3.0 support
Better namespace handling
Automatic default namespace support for RSS/Atom feeds
Modern expression syntax

Example:

xpath://div[@class='price']/text()
//title/text()

XPath 1.0

Prefix: xpath1:
Engine: lxml library
Features:

XPath 1.0 standard
Requires local-name() for default namespaces
Compatible with many online XPath testers

Example:

xpath1://div[@class='price']/text()
xpath1://*[local-name()='title']/text()

Basic XPath Syntax

Selecting Elements

//div

Selects all <div> elements anywhere in the document.

/html/body/div

Selects <div> elements that are direct children of <body>.

Selecting by Attributes

//div[@class='price']

Selects divs with class="price".

//a[@href]

Selects all <a> elements that have an href attribute.
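
The selectors above can be tried offline with a minimal sketch. It uses Python's standard-library xml.etree.ElementTree, which implements only a small subset of XPath and is not the engine changedetection.io uses; the sample page is hypothetical. ElementTree paths are relative to the root element, so each expression starts with a dot.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page, kept well-formed so the XML parser accepts it.
PAGE = """
<html>
  <body>
    <div class="price">$10</div>
    <div class="note">Ships soon</div>
    <a href="/buy">Buy</a>
    <a>No link</a>
  </body>
</html>
"""

root = ET.fromstring(PAGE)  # root is the <html> element

all_divs = root.findall(".//div")                    # like //div
body_divs = root.findall("./body/div")               # like /html/body/div
price_divs = root.findall(".//div[@class='price']")  # like //div[@class='price']
links = root.findall(".//a[@href]")                  # like //a[@href]

print(len(all_divs), len(body_divs))
print([d.text for d in price_divs])
print([a.get("href") for a in links])
```

Only the <a> element that actually carries an href attribute is matched by the last expression.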

Extracting Text

//h1/text()

Extracts the text content of <h1> elements (without HTML tags).

//div[@id='content']//text()

Extracts all text within the content div.

Extracting Attributes

//img/@src

Extracts the src attribute from all images.

//meta[@property='og:price:amount']/@content

Extracts Open Graph price metadata.
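
To make the difference between /text(), //text() and /@attribute concrete, here is a small sketch using Python's standard-library ElementTree as a stand-in. ElementTree has no text() or attribute steps, so the equivalents are written with .text, .tail, itertext() and get(); the sample page is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page used only for this illustration.
PAGE = """
<html>
  <head><meta property="og:price:amount" content="1299.99"/></head>
  <body>
    <h1>Gaming <em>Laptop</em></h1>
    <div id="content"><p>Fast</p><p>Light</p></div>
    <img src="/a.png"/><img src="/b.png"/>
  </body>
</html>
"""
root = ET.fromstring(PAGE)

# Like //h1/text(): only text nodes directly under <h1>, not text inside <em>.
h1 = root.find(".//h1")
h1_text = [h1.text] + [child.tail for child in h1 if child.tail]

# Like //div[@id='content']//text(): text nodes at any depth.
content = root.find(".//div[@id='content']")
content_text = list(content.itertext())

# Like //img/@src and //meta[@property='og:price:amount']/@content.
img_srcs = [img.get("src") for img in root.findall(".//img")]
og_price = root.find(".//meta[@property='og:price:amount']").get("content")

print(h1_text, content_text, img_srcs, og_price)
```

Note that the direct text of the <h1> is only "Gaming ", because "Laptop" belongs to the nested <em>. Use //text() when nested text matters.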

Practical Examples

Monitor Product Price

<div class="product">
  <h2>Gaming Laptop</h2>
  <span class="price" data-value="1299.99">$1,299.99</span>
</div>
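
For this snippet, expressions such as //span[@class='price']/text() (displayed price) or //span[@class='price']/@data-value (machine-readable price) are plausible choices. The sketch below checks what each would yield, using Python's standard-library ElementTree as an approximation rather than changedetection.io's own engine.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<div class="product">
  <h2>Gaming Laptop</h2>
  <span class="price" data-value="1299.99">$1,299.99</span>
</div>
"""

product = ET.fromstring(SNIPPET)  # root is the <div class="product">
price_span = product.find("./span[@class='price']")

display_price = price_span.text                      # like .../text()
numeric_price = float(price_span.get("data-value"))  # like .../@data-value

print(display_price, numeric_price)
```

Watching the data-value attribute avoids false change alerts caused by formatting differences such as currency symbols or thousands separators.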

Extract Multiple Fields

//div[@class='product']/h2/text()
//span[@class='price']/text()
//div[@class='stock']/text()

Put each XPath expression on its own line; each one extracts a different field.

Monitor RSS Feed Items

XPath 2.0/3.0 (automatic default namespace handling):

//item/title/text()
//item/description/text()

Works directly with RSS feeds, with no manual namespace handling.

XPath 1.0 (requires local-name() for elements in a default namespace):

xpath1://*[local-name()='item']/*[local-name()='title']/text()
xpath1://*[local-name()='item']/*[local-name()='description']/text()

Advanced Techniques

Conditional Selection

Contains Text

//div[contains(text(), 'In Stock')]

Selects divs containing “In Stock” text.

//h2[contains(@class, 'product')]/text()

Selects h2 elements where class contains “product”.

Logical Operators

//div[@class='product' and @data-available='true']

Selects divs matching BOTH conditions.

//span[@class='sale' or @class='discount']

Selects spans matching EITHER condition.

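//div[contains(@class, 'item') and not(contains(@class, 'hidden'))]

Selects divs whose class contains "item" but does NOT contain "hidden". (Testing @class='item' together with not(@class='hidden') would be redundant: an attribute equal to 'item' can never equal 'hidden'.)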
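
The logic of and, or and not() can be illustrated with a short sketch. Python's standard-library ElementTree does not support these XPath operators, so AND is written as chained predicates and OR/NOT as plain Python filters; the sample markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration only.
DOC = """
<body>
  <div class="product" data-available="true">Laptop</div>
  <div class="product" data-available="false">Phone</div>
  <span class="sale">-10%</span>
  <span class="discount">-5%</span>
  <span class="regular">Full price</span>
</body>
"""
root = ET.fromstring(DOC)

# AND: chained predicates must all hold,
# like //div[@class='product' and @data-available='true'].
both = root.findall(".//div[@class='product'][@data-available='true']")

# OR: like //span[@class='sale' or @class='discount'].
either = [s for s in root.iter("span") if s.get("class") in ("sale", "discount")]

# NOT: like //div[not(@data-available='true')].
negated = [d for d in root.iter("div") if d.get("data-available") != "true"]

print([e.text for e in both], [e.text for e in either], [e.text for e in negated])
```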

Position-based Selection

//li[1]

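Selects every <li> that is the first <li> child of its parent. Use (//li)[1] to select only the first <li> in the whole document.

//li[last()]

Selects every <li> that is the last <li> child of its parent.

//li[position() > 2]

Selects all <li> elements after the second one within each parent.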
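
Positions are counted per parent, which is easy to miss when a page has several lists. The sketch below shows this with two hypothetical lists, using Python's standard-library ElementTree as a stand-in. It supports [1] and [last()] but not position() comparisons, so that case is emulated with a slice.

```python
import xml.etree.ElementTree as ET

# Two hypothetical lists, to show that positions are counted per parent.
DOC = """
<body>
  <ul><li>a1</li><li>a2</li><li>a3</li></ul>
  <ul><li>b1</li><li>b2</li></ul>
</body>
"""
root = ET.fromstring(DOC)

first_items = [li.text for li in root.findall(".//li[1]")]      # like //li[1]
last_items = [li.text for li in root.findall(".//li[last()]")]  # like //li[last()]

# Emulates //li[position() > 2]: skip the first two <li> of each list.
after_second = [li.text for ul in root.findall(".//ul") for li in ul.findall("li")[2:]]

# Emulates (//li)[1]: the first <li> in document order.
very_first = root.find(".//li").text

print(first_items, last_items, after_second, very_first)
```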

Parent and Sibling Navigation

//span[@class='price']/parent::div

Selects the parent div of the price span.

//h2[@class='title']/following-sibling::p[1]

Selects the first paragraph following the title.

//span[@class='label']/preceding-sibling::input

Selects input elements that are preceding siblings of the label span (same parent, earlier in the document).
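
A rough sketch of parent and sibling navigation follows, using Python's standard-library ElementTree as a stand-in. It supports the abbreviated parent step (..) but has no sibling axes, so the following-sibling lookup is emulated by scanning the parent's children. The markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration only.
DOC = """
<body>
  <div id="offer"><span class="price">$5</span></div>
  <section>
    <h2 class="title">News</h2>
    <p>First paragraph</p>
    <p>Second paragraph</p>
  </section>
</body>
"""
root = ET.fromstring(DOC)

# Like //span[@class='price']/parent::div (".." is the abbreviated parent step).
parents = root.findall(".//span[@class='price']/..")
parent_ids = [p.get("id") for p in parents]

# Emulates //h2[@class='title']/following-sibling::p[1]:
# find the title's parent, then take the first <p> that comes after the <h2>.
section = root.find(".//h2[@class='title']/..")
children = list(section)
title_index = next(i for i, el in enumerate(children) if el.tag == "h2")
first_p_after = next(el.text for el in children[title_index + 1:] if el.tag == "p")

print(parent_ids, first_p_after)
```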

Regular Expression Matching

changedetection.io supports EXSLT regex functions:

//div[re:match(text(), 'Price: \$\d+\.\d{2}')]

Matches divs with text matching the price pattern.

//a[re:test(@href, 'product/\d+')]

Selects links where href matches the pattern.
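
Before putting a pattern into a filter, it can help to check what it matches. The sketch below tests the two patterns above with Python's re module, which is only a stand-in for the EXSLT functions: the regex dialects may differ in details, and unanchored "search" semantics are assumed here. The sample strings are hypothetical.

```python
import re

price_pattern = re.compile(r"Price: \$\d+\.\d{2}")
href_pattern = re.compile(r"product/\d+")

# Hypothetical sample values.
texts = ["Price: $19.99", "Price: $20", "Price: 19.99"]
hrefs = ["/product/42", "/product/new", "/shop/product/7/reviews"]

# search() succeeds if the pattern occurs anywhere in the string.
price_hits = [t for t in texts if price_pattern.search(t)]
href_hits = [h for h in hrefs if href_pattern.search(h)]

print(price_hits)
print(href_hits)
```

The price pattern requires a literal dollar sign and exactly two digits after the decimal point, so "Price: $20" and "Price: 19.99" are both rejected.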

Working with Namespaces

RSS/Atom Feeds

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>My Feed</title>
    <item>
      <title>First Item</title>
      <description>Item description</description>
    </item>
  </channel>
</rss>
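
The RSS 2.0 sample above declares no default namespace, but Atom feeds do, and that is where plain element names stop matching in namespace-unaware queries. The sketch below reproduces the effect with a hypothetical Atom feed and Python's standard-library ElementTree (the {*} wildcard needs Python 3.8 or later). It is meant as an analogy to the local-name() workaround, not a description of changedetection.io's internals.

```python
import xml.etree.ElementTree as ET

# Hypothetical Atom feed with a default namespace.
ATOM = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>My Feed</title>
  <entry><title>First Item</title></entry>
</feed>
"""
root = ET.fromstring(ATOM)

# Plain names do not match: every element is namespace-qualified.
plain = root.findall(".//entry/title")

# Namespace wildcard, similar in spirit to //*[local-name()='entry'].
wildcard = root.findall(".//{*}entry/{*}title")

# Explicit prefix mapping; the prefix "atom" is an arbitrary local choice.
ns = {"atom": "http://www.w3.org/2005/Atom"}
prefixed = root.findall(".//atom:entry/atom:title", ns)

print(len(plain), [t.text for t in wildcard], [t.text for t in prefixed])
```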

Handling CDATA Sections

Some RSS feeds wrap content in CDATA:

<description><![CDATA[<p>HTML content here</p>]]></description>

changedetection.io automatically processes CDATA sections. The XPath //description/text() will extract the content.
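
As a general illustration of why text() works here: an XML parser exposes a CDATA section as ordinary text, with the markup inside it left unparsed. The sketch uses Python's standard-library ElementTree and the snippet above wrapped in a hypothetical <item>; how changedetection.io processes CDATA internally is not shown.

```python
import xml.etree.ElementTree as ET

item = ET.fromstring(
    "<item><description><![CDATA[<p>HTML content here</p>]]></description></item>"
)
description = item.find("description")

cdata_text = description.text     # the CDATA payload as a plain string
child_count = len(description)    # no child elements: the <p> was not parsed

print(cdata_text, child_count)
```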

Combining Include and Remove Filters

//article[@class='main']

This extracts the main article while removing ads, sidebars, and scripts.

Remove filters must use the xpath: or xpath1: prefix explicitly.
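
The include-then-remove idea can be sketched as follows. The page and the three remove expressions are hypothetical, chosen only to illustrate the technique, and Python's standard-library ElementTree stands in for the real engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical page for illustration only.
PAGE = """
<body>
  <article class="main">
    <h1>Title</h1>
    <div class="ad">Buy now</div>
    <p>Body text</p>
    <aside class="sidebar">Related</aside>
    <script>track()</script>
  </article>
  <footer>Footer</footer>
</body>
"""
root = ET.fromstring(PAGE)

# Include step: like //article[@class='main'].
article = root.find(".//article[@class='main']")

# Remove step: hypothetical expressions for ads, sidebars and scripts.
REMOVE_PATHS = [".//div[@class='ad']", ".//aside[@class='sidebar']", ".//script"]

# ElementTree elements do not know their parent, so build a lookup first.
parent_of = {child: parent for parent in article.iter() for child in parent}
for path in REMOVE_PATHS:
    for unwanted in article.findall(path):
        parent_of[unwanted].remove(unwanted)

kept_text = [t.strip() for t in article.itertext() if t.strip()]
print(kept_text)
```

The footer never appears because it lies outside the included element; only content inside the article needs a remove filter.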

Testing XPath Expressions

Using Browser DevTools

Open DevTools Console (F12)
Use $x() to test XPath:
```
$x('//div[@class="price"]/text()')
```
Verify the results

Online XPath Testers

XPath Tester
Code Beautify XPath
Use sample HTML/XML to test expressions

Online testers typically implement XPath 1.0. Use the xpath1: prefix in changedetection.io so an expression behaves the same way there.

Common Patterns

Pattern: Extract Plain Text

//div[@id='content']//text()

Use case: Get all text without HTML tags.

Pattern: Monitor Table Data

//table[@class='pricing']//tr[2]/td[3]/text()

Use case: Extract specific table cell (row 2, column 3).
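
Row and column indexes are 1-based, and a header row counts as row 1. The sketch below checks this on a hypothetical pricing table, using Python's standard-library ElementTree as a stand-in for the XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical pricing table; the header row is tr[1].
PAGE = """
<body>
  <table class="pricing">
    <tr><th>Plan</th><th>Users</th><th>Price</th></tr>
    <tr><td>Basic</td><td>5</td><td>$9</td></tr>
    <tr><td>Pro</td><td>50</td><td>$49</td></tr>
  </table>
</body>
"""
root = ET.fromstring(PAGE)

# Like //table[@class='pricing']//tr[2]/td[3]/text()
cell = root.find(".//table[@class='pricing']//tr[2]/td[3]").text
print(cell)
```

Here tr[2] is the first data row ("Basic"), so the expression yields that row's third cell.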

Pattern: Get Meta Description

//meta[@name='description']/@content

Use case: Extract page meta description.

Pattern: Track Stock Status

//div[@class='availability' and contains(text(), 'In Stock')]

Use case: Check if “In Stock” appears.

Pattern: Extract Link URLs

//a[@class='download']/@href

Use case: Get download link URLs.

Common Pitfalls

Pitfall #1: Forgetting text()

//div[@class='price']

Returns the entire element with HTML tags.

Better:

//div[@class='price']/text()

Extracts just the text content.

Pitfall #2: Case Sensitivity

XPath is case-sensitive!

//Div[@Class='Price']  # Wrong
//div[@class='price']  # Correct

Pitfall #3: Namespace Issues with RSS

If your XPath returns nothing from an RSS feed:

Problem: Using XPath 1.0 without local-name()

xpath1://item/title/text()  # May fail

Solution 1: Use default XPath (2.0/3.0)

//item/title/text()  # Automatic namespace handling

Solution 2: Use local-name() with XPath 1.0

xpath1://*[local-name()='item']/*[local-name()='title']/text()

When to Use XPath

Good for:

Monitoring RSS/Atom feeds
Extracting text without HTML tags
Complex conditional filtering
Navigating to parent elements
XML documents
When you need regex matching
Extracting specific attributes

Not ideal for:

JSON APIs (use JSON filtering instead)
When CSS selectors are sufficient (simpler syntax)
When you want visual selector support

XPath vs CSS Selectors

| Feature | XPath | CSS Selectors |
| --- | --- | --- |
| Text extraction | `//div/text()` | Not possible |
| Parent selection | `//span/parent::div` | Not possible |
| Attribute extraction | `//@href` | Not directly |
| Conditional logic | `[contains(@class, 'x')]` | Limited |
| Visual selector | ❌ Not available | ✅ Available |
| RSS/XML feeds | ✅ Excellent | ❌ Not suitable |
| Learning curve | Steeper | Easier |

Real-World Examples

Example: Monitor Product Reviews

//div[@itemprop='aggregateRating']//span[@itemprop='ratingValue']/text()

Extracts structured rating data from product pages.

Example: Track News Headlines from RSS

//item/title/text()
//item/pubDate/text()

Monitors RSS feed items for new headlines and dates.

Example: Extract JSON-LD Price

//script[@type='application/ld+json']/text()

Extracts JSON-LD structured data, which can then be filtered with JSON filters.
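
The two-stage idea (XPath first, JSON second) can be sketched as follows. The page and its JSON-LD payload are hypothetical; Python's standard-library ElementTree and json modules stand in for the XPath and JSON filter steps.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical page with an embedded JSON-LD block.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@type": "Product", "name": "Gaming Laptop",
 "offers": {"price": "1299.99", "priceCurrency": "USD"}}
</script>
</head></html>
"""
root = ET.fromstring(PAGE)

# Stage 1: like //script[@type='application/ld+json']/text()
raw = root.find(".//script[@type='application/ld+json']").text

# Stage 2: parse the extracted text as JSON and pick the field of interest.
data = json.loads(raw)
price = data["offers"]["price"]
print(price)
```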

Example: Monitor Table Changes

//table[@class='data']//tr[position() > 1]/td[2]/text()

Extracts second column from all data rows (skipping header).

Example: Get All Links in a Section

//section[@id='downloads']//a/@href

Extracts all download link URLs from a specific section.

Debugging Tips

Returns Empty Results

Check if you’re using the right XPath version
For RSS/XML, try switching between xpath: and xpath1:
Use local-name() for namespaced elements
Verify element exists (check browser’s element inspector)
Test in browser console: $x('your-xpath-here')

Returns Unexpected Content

Add /text() to extract only text content
Use [1] or [last()] to get specific positions
Add more specific conditions with [@attribute='value']
Check if you need // (any level) vs / (direct child)

Related Topics

CSS Selectors - Simpler alternative for HTML content
JSON Filtering - Extract data from JSON responses
RSS Monitoring - Specific guide for RSS feeds

