Last active
September 15, 2019 04:19
UTF-8 HTML5-compatible Tidy output
<?php
/**
 * Repair HTML with Tidy and return an HTML5-compatible document.
 * Entries passed in $config take precedence over the defaults below.
 */
function tidy_html5($html, array $config = [], $encoding = 'utf8') {
    // Array union (+=) only adds defaults for keys the caller did not set.
    $config += [
        'doctype' => '<!DOCTYPE html>',
        'drop-empty-elements' => 0,
        'new-blocklevel-tags' => 'article aside audio bdi canvas details dialog figcaption figure footer header hgroup main menu menuitem nav section source summary template track video',
        'new-empty-tags' => 'command embed keygen source track wbr',
        'new-inline-tags' => 'audio command datalist embed keygen mark menuitem meter output progress source time video wbr',
        'tidy-mark' => 0,
    ];
    $tidy = tidy_parse_string($html, $config, $encoding); // doctype not inserted
    tidy_clean_repair($tidy); // doctype inserted
    return $tidy; // tidy object; echoing it prints the repaired markup
}
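The function above relies on PHP's array union operator to merge caller options with its defaults. A minimal sketch of that precedence rule follows, using a shortened, hypothetical defaults array rather than the full one from the gist:

```php
<?php
// Hypothetical, shortened defaults (the gist's real list is longer).
$defaults = ['tidy-mark' => 0, 'drop-empty-elements' => 0];

// Caller options: one key overlaps with the defaults, one is new.
$config = ['drop-empty-elements' => 1, 'indent' => 2];

// Array union keeps the left-hand value for any key present on both sides
// and appends the keys that exist only on the right-hand side.
$config += $defaults;

var_export($config);
```

So a caller can override any default, such as `drop-empty-elements`, simply by passing it in `$config`.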
<?php
$html = '</z><p><a href="#">Link</a></p><p><img src="logo.png"/>Seçond para</p><i class="fa"></i><p></p><wbr>';
echo tidy_html5($html, ['indent' => 2, 'indent-spaces' => 4]);
/*
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<p><a href="#">Link</a></p>
<p><img src="logo.png">Seçond para</p><i class="fa"></i>
<p></p><wbr>
</body>
</html>
*/
Use `LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED` as the option in `DOMDocument::loadHTML()` to load the HTML without the `<!DOCTYPE>` and `<html>` tags so that `tidy_html5()` can insert them.
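
A rough sketch of that workflow, under some assumptions: the fragment, the `rel` attribute edit, and the variable names are hypothetical, and the fragment is wrapped in a single root `<div>` because libxml can nest sibling top-level elements when `LIBXML_HTML_NOIMPLIED` is set. The final call only runs if `tidy_html5()` from the gist has been included.

```php
<?php
// Hypothetical ASCII-only fragment with a single root element.
$fragment = '<div><p><a href="#">Link</a></p></div>';

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
// Load without adding a default doctype or implied <html>/<body> wrappers.
$dom->loadHTML($fragment, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
libxml_clear_errors();

// Example manipulation (illustrative only): mark the link as nofollow.
$dom->getElementsByTagName('a')->item(0)->setAttribute('rel', 'nofollow');

// Serialize the bare fragment: no <!DOCTYPE>, no <html> wrapper.
$bare = $dom->saveHTML();

// Let tidy_html5() add the doctype and document skeleton, if it is available.
if (function_exists('tidy_html5')) {
    echo tidy_html5($bare);
}
```

Keeping DOM manipulation and Tidy repair as separate steps means the doctype and `<html>` skeleton are produced in one place only, by `tidy_html5()`.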