@illicitonion
Created March 22, 2012 23:10
W3C Browser Automation Spec - Rendering text draft 1
<section>
<h2>Rendering Text</h2>
<p>All WebDriver implementations must support getting the readable[1] text of a WebElement, with excess whitespace compressed. The expected return value is roughly what a text-only browser such as Lynx would display. The algorithm for determining this text is as follows:</p>
Initially, lines = [];
1: For each child of node, at time of execution, in order:
  Get the effective CSS white-space and text-transform styles, and then, if child is...
    an ignored node [2]:
      Do nothing.
    a text node [http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#text]:
      Let text = the nodeValue property of the node
      Remove any zero-width spaces (\u200b), form feeds (\f), or vertical tabs (\v) from text
      Canonicalise any recognised single newline sequence in text to a single newline (greedily matching (\r\n|\r|\n) -> a single \n)
      If the parent's effective CSS white-space style is 'normal' or 'nowrap':
        Replace each newline (\n) with a single space character [6]
      If the parent's effective CSS white-space style is 'pre' or 'pre-wrap':
        Replace each horizontal whitespace character [7] with a non-breaking space character (\xa0)
      Otherwise (any other value, including 'normal' and 'nowrap'):
        Replace each sequence of horizontal whitespace characters [7] except non-breaking spaces (\xa0) with a single space character
      Apply the parent's effective CSS text-transform style as per the CSS2 specification [http://www.w3.org/TR/CSS2/text.html#propdef-text-transform]
      If last(lines) ends with a space character and text starts with a space character:
        Trim the first character of text
      Append text to last(lines) in-place
    an element [http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#element] which is not ignored:
      If element is a:
        BR element: Push '' to lines and stop processing this element
        Block-level [5] element: If last(lines) is not '', push '' to lines.
      And then:
        Recurse depth-first to step 1 with node set to the current element
        If element is a TD element, or the effective CSS display style is 'table-cell', and last(lines) is not '', and last(lines) does not end with whitespace:
          Append a single space character to last(lines) [Note: Most innerText implementations append a \t here]
        If element is a block-level element: push '' to lines
2: For each line in lines:
  Trim any leading and trailing whitespace excluding non-breaking space characters [4].
3: Let s = lines.join('\n')
4: Trim any leading and trailing whitespace excluding non-breaking space characters [4] from s.
5: Replace any non-breaking spaces (\xa0) with spaces (\x20) in s.
6: Return s.
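<p>The text-node branch of step 1 and steps 2 to 6 operate purely on strings, so they can be sketched without a DOM. The following is a minimal, non-normative sketch: the function names (normalizeText, appendText, finalize) are hypothetical, the text-transform step is left out, and treating an empty lines array as if it held one empty string is an assumption, since the draft does not say what last(lines) is in that case.</p>

```javascript
// Non-normative sketch; all names here are illustrative, not part of the draft.
const HORIZONTAL_WS = /[\x20\t\u2028\u2029]/g;       // footnote 7, one character
const HORIZONTAL_WS_RUN = /[\x20\t\u2028\u2029]+/g;  // footnote 7, a whole run
const TRIM_WS = /^[^\S\xa0]+|[^\S\xa0]+$/g;          // footnote 4 at either end

// Text-node branch: normalise nodeValue using the parent's white-space style.
// The text-transform step is intentionally omitted from this sketch.
function normalizeText(nodeValue, whiteSpace) {
  let text = nodeValue.replace(/[\u200b\f\v]/g, ''); // drop ZWSP, \f and \v
  text = text.replace(/\r\n|\r|\n/g, '\n');          // canonicalise newlines
  if (whiteSpace === 'normal' || whiteSpace === 'nowrap') {
    text = text.replace(/\n/g, ' ');
  }
  if (whiteSpace === 'pre' || whiteSpace === 'pre-wrap') {
    text = text.replace(HORIZONTAL_WS, '\xa0');      // preserve every space
  } else {
    text = text.replace(HORIZONTAL_WS_RUN, ' ');     // collapse each run
  }
  return text;
}

// Append text to last(lines), dropping a doubled space at the join.
function appendText(lines, text) {
  if (lines.length === 0) lines.push('');            // assumption, see lead-in
  const last = lines[lines.length - 1];
  if (last.endsWith(' ') && text.startsWith(' ')) text = text.slice(1);
  lines[lines.length - 1] = last + text;
}

// Steps 2 to 6: trim each line, join, trim again, then turn \xa0 into spaces.
function finalize(lines) {
  const trimmed = lines.map((line) => line.replace(TRIM_WS, ''));
  const s = trimmed.join('\n').replace(TRIM_WS, '');
  return s.replace(/\xa0/g, ' ');
}

const lines = [];
appendText(lines, normalizeText('Hello \r\n', 'normal'));
appendText(lines, normalizeText('  world\u200b', 'normal'));
console.log(JSON.stringify(finalize(lines)));        // "Hello world"
```

<p>Converting preserved spaces to \xa0 first and back to \x20 only in step 5 is what lets the trimming in steps 2 and 4 leave 'pre' and 'pre-wrap' whitespace intact.</p>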
<p>1: readable: If the element were entirely displayed in the viewport, text is readable if it is discernible to a user and has sufficient contrast to be distinguishable from the background.</p>
<p>2: an ignored node is one which is not considered to be displayed, according to WebElement#isDisplayed</p>
<p>3: whitespace is defined by the ECMAScript regular expression [^\S]</p>
<p>4: whitespace excluding non-breaking space characters is defined by the ECMAScript regular expression [^\S\xa0]</p>
<p>5: A block-level element is one which is not a table cell, and whose effective CSS display style is not in ['inline', 'inline-block', 'inline-table', 'none', 'table-cell', 'table-column', 'table-column-group']</p>
<p>6: A space character is the character \x20</p>
<p>7: Horizontal whitespace characters are defined by the ECMAScript regular expression [\x20\t\u2028\u2029]</p>
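<p>The definitions in notes 3, 4, 5 and 7 can be written down as small predicates. This is a non-normative sketch: the constant and function names are hypothetical, the tag name and display value are passed in as plain strings so the code runs outside a browser (in a page they might come from element.tagName and getComputedStyle(element).display), and reading "table cell" as a TD element or a 'table-cell' display is an assumption taken from the wording of step 1.</p>

```javascript
// Non-normative sketch; all names here are illustrative, not part of the draft.
const WHITESPACE = /[^\S]/;                            // note 3, same set as \s
const WHITESPACE_NO_NBSP = /[^\S\xa0]/;                // note 4
const HORIZONTAL_WHITESPACE = /[\x20\t\u2028\u2029]/;  // note 7

// Display values that note 5 excludes from "block-level".
const NON_BLOCK_DISPLAYS = new Set([
  'inline', 'inline-block', 'inline-table', 'none',
  'table-cell', 'table-column', 'table-column-group',
]);

// Note 5: not a table cell, and display not in the excluded list.
// Assumption: "table cell" means a TD element or display 'table-cell'.
function isBlockLevel(tagName, display) {
  const isTableCell =
    tagName.toUpperCase() === 'TD' || display === 'table-cell';
  return !isTableCell && !NON_BLOCK_DISPLAYS.has(display);
}

console.log(WHITESPACE.test('\xa0'));            // true
console.log(WHITESPACE_NO_NBSP.test('\xa0'));    // false
console.log(HORIZONTAL_WHITESPACE.test('\n'));   // false
console.log(' \xa0a '.trim() === 'a');           // true
console.log(isBlockLevel('DIV', 'block'));       // true
console.log(isBlockLevel('SPAN', 'inline'));     // false
console.log(isBlockLevel('TD', 'block'));        // false
```

<p>The fourth line shows why steps 2 and 4 need the class from note 4: the built-in trim() also strips \xa0, which would discard whitespace preserved from 'pre' and 'pre-wrap' content.</p>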
</section>