webstrand/8631fec9cfa0270e75c11ef9c176db5d
Last active April 15, 2023 23:53
Match HTML entities including idiosyncratic entities missing a trailing semicolon
&(?:
    \w+;|
    (?:AE|ae|sz)lig|
    [Aa]ring|
    [AaEeIiOoUu](?:grave|circ)|
    [AaEeIiOoUuy]?(?:acute|uml)|Yacute|
    [AaNnOo]tilde|
    [AEeIiOoUu]irc|
    [Cc]?cedil|
    [lr]aquo|
    [Oo]slash|
    [GL]T|
    [gl]t|
    [dr]eg|
    frac(?:12|14|34)|
    ord[fm]|
    sup[123]|
    AMP|
    COPY|
    ETH|
    QUOT|
    REG|
    THORN|
    amp|
    brvbar|
    cent|
    copy|
    curren|
    divide|
    eth|
    iexcl|
    iquest|
    macr|
    micro|
    middot|
    nbsp|
    not|
    para|
    plusmn|
    pound|
    quot|
    sect|
    shy|
    thorn|
    times|
    yen
)
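
A minimal usage sketch in Python, assuming the pattern is meant to be compiled in a verbose/extended mode where whitespace and line breaks are ignored (the gist does not say which regex flavor it targets). The names `ENTITY`, `find_entities`, `decode_entities`, and `missing_legacy_names` are hypothetical helpers introduced only for this example. Decoding is delegated to the standard library's `html.unescape`, and the coverage check relies on `html.entities.html5`, which lists the semicolon-less legacy names as separate keys.

```python
import re
from html import unescape
from html.entities import html5

# Copy of the pattern above. re.VERBOSE makes Python ignore the
# whitespace and line breaks used for layout.
ENTITY = re.compile(r"""
&(?:
    \w+;|
    (?:AE|ae|sz)lig|
    [Aa]ring|
    [AaEeIiOoUu](?:grave|circ)|
    [AaEeIiOoUuy]?(?:acute|uml)|Yacute|
    [AaNnOo]tilde|
    [AEeIiOoUu]irc|
    [Cc]?cedil|
    [lr]aquo|
    [Oo]slash|
    [GL]T|
    [gl]t|
    [dr]eg|
    frac(?:12|14|34)|
    ord[fm]|
    sup[123]|
    AMP|COPY|ETH|QUOT|REG|THORN|
    amp|brvbar|cent|copy|curren|divide|eth|iexcl|iquest|
    macr|micro|middot|nbsp|not|para|plusmn|pound|quot|
    sect|shy|thorn|times|yen
)
""", re.VERBOSE)


def find_entities(text: str) -> list[str]:
    """Return every substring of `text` that the pattern matches."""
    return [m.group(0) for m in ENTITY.finditer(text)]


def decode_entities(text: str) -> str:
    """Replace each matched entity with html.unescape's decoding of it."""
    return ENTITY.sub(lambda m: unescape(m.group(0)), text)


def missing_legacy_names() -> list[str]:
    """Legacy names (html5 keys without ';') that the pattern fails to match."""
    legacy = [name for name in html5 if not name.endswith(";")]
    return sorted(n for n in legacy if not ENTITY.fullmatch("&" + n))


if __name__ == "__main__":
    sample = "Fish &amp; chips &copy 2023 &lt;b&gt; caf&eacute"
    print(find_entities(sample))
    print(decode_entities(sample))
    print(missing_legacy_names())
```

Two observations from running this kind of check: the `\w+;` branch does not cover numeric references such as `&#169;`, because `#` is not a word character, and the `[AEeIiOoUu]irc` branch also accepts names such as `&Eirc`, which is not an HTML entity. The `circ` names are already handled by the `(?:grave|circ)` line, so that branch looks redundant.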