@dfkaye
Last active January 12, 2024 00:43
XPath in JavaScript, notes on document.evaluate() et al.

10 October 2021

Placeholder for XPath in JavaScript post.

  • document.evaluate is not supported in Internet Explorer.
  • document.createExpression is not supported in Internet Explorer.

See https://stackoverflow.com/questions/6729688/is-xpath-supported-in-internet-explorer for XPath support in Internet Explorer, and then forget about it. Internet Explorer is deprecated.

Some content copied and modified from Mozilla Developer Network documentation. See https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript

See also the XPath Cheat Sheet at https://devhints.io/xpath.

Some ask on social media why browser-supported XPath is almost unknown among front-end web developers. The answer may involve the number of steps to take and problems to avoid when creating and evaluating an XPath expression, beginning with...

  1. Namespace resolver.

First we need the correct context node or document object:

var contextNode = document.ownerDocument == null
  ? document.documentElement
  : document.ownerDocument.documentElement;

The context node serves as the root. Now we can create the namespace resolver based on that root.

var nsResolver = document.createNSResolver(contextNode);
  2. Create an XPath expression.
var expression = document.createExpression(`//input[@type="text"]`);
  3. Call evaluate on the expression, passing the context node to query into.
var result = expression.evaluate(document.body);
  4. Iterate the result set.
var node;
while (node = result.iterateNext()) {
  console.log( node );
}
  5. Specify an ordered result.
var result = expression.evaluate(document.body, XPathResult.ORDERED_NODE_ITERATOR_TYPE);
console.log( result.resultType );
// 5
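
The five steps above can be collected into a single function. This is only a minimal sketch and not part of the original notes: the name `queryAll` and the `doc` parameter are assumptions introduced here so the function can run against any object that implements `createNSResolver` and `createExpression`, and the numeric constant stands in for `XPathResult.ORDERED_NODE_ITERATOR_TYPE`.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_ITERATOR_TYPE, spelled out so the
// function does not depend on a global XPathResult object.
const ORDERED_NODE_ITERATOR_TYPE = 5;

// Hypothetical helper: compile an expression, evaluate it, drain the iterator.
function queryAll(doc, xpath, contextNode = doc) {
  // Step 1: namespace resolver based on the document's root element.
  const nsResolver = doc.createNSResolver(doc.documentElement);

  // Step 2: compile the XPath expression.
  const expression = doc.createExpression(xpath, nsResolver);

  // Steps 3 and 5: evaluate against the context node, asking for an ordered iterator.
  const result = expression.evaluate(contextNode, ORDERED_NODE_ITERATOR_TYPE);

  // Step 4: iterate the result set into a plain array.
  const nodes = [];
  let node;
  while ((node = result.iterateNext())) {
    nodes.push(node);
  }
  return nodes;
}

// Usage in a browser; skipped where no global document exists.
if (typeof document !== "undefined") {
  console.log(queryAll(document, `//input[@type="text"]`, document.body));
}
```

Copying the nodes into an array up front also sidesteps a known hazard of iterator results: they become invalid if the document is modified while iterating.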

Where things fail

Type mismatches

If the XPath expression returns a value rather than a collection, and we specify an iterator on the result type, then evaluation throws an error.

document.evaluate("count(//body)", document, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE)

// Uncaught TypeError: Document.evaluate: Result type mismatch

An XPathResult exposes typed value fields for non-node results, namely booleanValue, numberValue, and stringValue.

Ideally types should match. An XPathResult containing a number will expose a numberValue field containing the computed number value.

var result = document.evaluate("count(//body)", document);

// result.numberValue: 1

Type definitions are strictly enforced, and this exposes a serious blunder in the design of the XPathResult object: accessing a value property whose type does not match the result type throws an error instead of returning a converted or empty value.

For the result containing a number, accessing the booleanValue or stringValue property throws.

result.stringValue
// TypeError: XPathResult.stringValue getter: Result is not a string

However, type coercion is allowed by setting the result type parameter.

We can coerce the count of elements to a boolean:

var result = document.evaluate("count(//body)", document, null, XPathResult.BOOLEAN_TYPE);

// result.booleanValue: true

Or to a string:

var result = document.evaluate("count(//body)", document, null, XPathResult.STRING_TYPE);

// result.stringValue: "1"
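
Because the mismatched getters throw, one way to read a value safely is to branch on `resultType` first. The following is a sketch only: `primitiveValue` is a hypothetical name, not a DOM API, and the numeric constants mirror `XPathResult.NUMBER_TYPE`, `XPathResult.STRING_TYPE`, and `XPathResult.BOOLEAN_TYPE`.

```javascript
// XPathResult.resultType values, spelled out so the helper does not depend on
// a global XPathResult object.
const NUMBER_TYPE = 1;
const STRING_TYPE = 2;
const BOOLEAN_TYPE = 3;

// Hypothetical helper: return the primitive value held by an XPathResult,
// touching only the getter that matches its resultType.
// Returns undefined for node-set results.
function primitiveValue(result) {
  switch (result.resultType) {
    case NUMBER_TYPE:
      return result.numberValue;
    case STRING_TYPE:
      return result.stringValue;
    case BOOLEAN_TYPE:
      return result.booleanValue;
    default:
      return undefined;
  }
}

// Usage in a browser; skipped where no global document exists.
if (typeof document !== "undefined") {
  console.log(primitiveValue(document.evaluate("count(//body)", document)));
  console.log(primitiveValue(document.evaluate("boolean(//body)", document)));
}
```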

Context

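With the document as the context node, both /body and ./body look for a body element that is a direct child of the document. The document's only element child is html, so both return null. //body searches every descendant and finds the element.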
document.evaluate("/body", document).iterateNext()
// null

document.evaluate("./body", document).iterateNext()
// null

document.evaluate("//body", document).iterateNext()
// <body data-global="important" dir="rtl">

Missing attributes in markup

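HTML tags may omit optional attributes that have default values, such as <input name="..."> which defaults to a text type node. An XPath expression that tests @type does not match such an element, because the attribute is absent from the markup.

document.evaluate(`//input[@type="text"]`, document).iterateNext()
// null (when no input declares type="text" explicitly)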

This is also true of CSS selectors, by the way:

var inputs = document.querySelector(`[type="text"]`);
// null

Boolean operators to the rescue

Use the or operator and the not() function to find text inputs with and without the type attribute:

// 11 January 2024
// find input elements of type text or without type attribute.

var expression = document.createExpression(`//input[@type="text" or not(@type)]`);
var query = expression.evaluate(document.body, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE);
var result = [];

for (let i = 0, {snapshotLength} = query; i < snapshotLength; i++) {
  result.push(query.snapshotItem(i));
}

console.warn(...result);
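
For comparison, the same test can be approximated with a CSS selector list. This is a sketch, not from the original notes: `TEXT_INPUT_SELECTOR` and `findTextInputs` are hypothetical names, and `root` is assumed to be any node that supports `querySelectorAll`.

```javascript
// Hypothetical constant: CSS counterpart of //input[@type="text" or not(@type)].
const TEXT_INPUT_SELECTOR = 'input[type="text"], input:not([type])';

// Hypothetical helper: return matching inputs under root as a plain array.
function findTextInputs(root) {
  return Array.from(root.querySelectorAll(TEXT_INPUT_SELECTOR));
}

// Usage in a browser; skipped where no global document exists.
if (typeof document !== "undefined") {
  console.warn(...findTextInputs(document.body));
}
```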

Console API

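At time of writing, the Firefox and Chrome devtools consoles provide a $x() function that accepts an XPath string and returns an array of matching elements in source order, so we can use forEach to iterate over the elements. It is a console utility and is not available to scripts running in the page.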

The following returns an array of all <meta> elements in the document, and prints each element's name.

$x("//meta").forEach(function(elm) {
  console.log(elm.name)
});

XPath helper function

Added 17 August 2022

From @webreflection's "Whatever happened to XPath?" post. See https://webreflection.medium.com/what-happened-to-xpath-1409aa3dbd57

"Here a cross browser, and cross env function, that replicates what devtools $x(...) helper brings in:"

// basic XPath helper that works with JSDOM too
function X(Path, root = document) {
  const flag = XPathResult.ORDERED_NODE_SNAPSHOT_TYPE;
  const query = document.evaluate(Path, root, null, flag, null);
  const result = [];

  for (let i = 0, {snapshotLength} = query; i < snapshotLength; i++)
    result.push(query.snapshotItem(i));

  return result;
}
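
The helper above always calls evaluate on the global document. When root belongs to a different document, for example one produced by DOMParser or one inside an iframe, it may be safer to evaluate on the document that owns root. The following variant is a sketch under that assumption; the name `xpathIn` is hypothetical and does not come from the quoted post.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, spelled out so the
// function does not depend on a global XPathResult object.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical variant of X(): evaluates on the document that owns root.
function xpathIn(root, path) {
  // A Document node has a null ownerDocument, so fall back to root itself.
  const doc = root.ownerDocument || root;
  const query = doc.evaluate(path, root, null, ORDERED_NODE_SNAPSHOT_TYPE, null);

  // Copy the snapshot into a plain array.
  return Array.from({ length: query.snapshotLength }, (_, i) => query.snapshotItem(i));
}

// Usage in a browser; skipped where no global document exists.
if (typeof document !== "undefined") {
  console.log(xpathIn(document, "//meta"));
}
```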

Sorting in XSLT

See also Sorting in XSLT, https://www.xml.com/pub/a/2002/07/03/transform.html
