Created November 30, 2012 11:25
While testing a system, we found some "interesting" HTML encoding behaviour. For a given value of "interesting", obviously.
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Escaping fun!</title>
</head>
<body>
<!-- Renders as 'abcd' -->
<p>abcd<efg</p>
<!-- Renders as 'abcd<&efg' -->
<p>abcd<&efg</p>
</body>
</html>
The first paragraph renders as "abcd". Presumably the browser treats everything after the less-than sign as part of an HTML tag. Put an ampersand immediately after the less-than sign, though, and the whole paragraph renders to the page. I suspect that's because the browser knows a tag name cannot start with an ampersand, or because it sees "&efg" as an invalid entity, so it outputs the text as-is.
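If the aim is for both paragraphs to show their text literally, escaping the special characters should avoid the guesswork. This is a minimal sketch, assuming the intended output is the raw text 'abcd<efg' and 'abcd<&efg'. The title and comments are illustrative and not part of the original file.

```html
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Escaping fun, escaped</title>
</head>
<body>
<!-- &lt; is the character reference for '<', so this renders as 'abcd<efg' -->
<p>abcd&lt;efg</p>
<!-- &amp; is the character reference for '&', so this renders as 'abcd<&efg' -->
<p>abcd&lt;&amp;efg</p>
</body>
</html>
```

With both characters escaped, the result no longer depends on how the parser recovers from a stray less-than sign or a bare ampersand.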