Last active
September 19, 2024 10:19
Shortest (useful) HTML5 Document
<!-- http://www.brucelawson.co.uk/2010/a-minimal-html5-document/ -->
<!doctype html>
<html lang=en>
<head>
<meta charset=utf-8>
<title>blah</title>
</head>
<body>
<p>I'm the content</p>
</body>
</html>
Why does everyone include <meta charset="utf-8">?
UTF-8 is the only valid encoding for HTML5 documents. That means if you have <!DOCTYPE html> at the top of an HTML file, the charset is implied.
If we're talking about a minimal HTML document, it's certainly not required, but leaving it out doesn't really seem like a good idea either.
Now imagine serving files with different encodings off the same web server.
Encoding is inalienable from the document itself. You can't usefully change the encoding without changing the text.
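To make that concrete, here is a small Python sketch (not from the gist; the sample word is just an illustration) showing that the same bytes turn into different text depending on which encoding the reader assumes:

```python
# The same bytes, read under two different encodings, give different text.
text = "café"
utf8_bytes = text.encode("utf-8")             # b'caf\xc3\xa9', 5 bytes for 4 characters

as_utf8 = utf8_bytes.decode("utf-8")          # what the author intended
as_latin1 = utf8_bytes.decode("iso-8859-1")   # what a wrongly guessing reader sees

print(as_utf8)    # café
print(as_latin1)  # cafÃ©
```

Nothing in the bytes themselves says which reading is the right one, which is why a declaration has to travel with the document or with the response.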
Modern authoring tools, IDEs, and editors will surely assume UTF-8 by default. But imagine something like Leftpad: a package deep inside the dependency graph of your complex tool and module chain, outside your awareness, that breaks everything at the higher levels just because you had no charset declaration. Maybe it was pulled in by another dependency you don't even know you have, written before the whole world was UTF-8, or it defaults to ISO-8859-1 for some other reason. (The parallel to the Leftpad disaster is being killed by something you didn't even know about.) It's trivial to add a charset declaration before trouble arises; it can be very time-consuming to find out that a missing charset declaration was the cause. Note that such a dev-tool dependency can hit you even if you never need a single non-ASCII character.
Small companies (maybe your clients) often have no control over the outdated server space they bought years ago from someone who bought it from someone else, and so on. You may have to reverse-engineer which HTTP headers are sent, and you cannot change them, so it's quite nice that a charset declaration in your code keeps you covered when the server doesn't declare one. A server may bite you only years later, when a new requirement arises, e.g. a French version, and out of the blue there are strange replacement characters just because of quotation marks that are not in your implicit default charset.
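If you do end up having to find out what a server and a page actually declare, a rough Python sketch like the following may help. The function names (`header_charset`, `meta_charset`, `declared_charset`) are hypothetical, and the regex is a simplification for illustration, not the encoding prescan that browsers implement. It looks only at the first 1024 bytes, since that is where the HTML standard requires the meta declaration to sit, and it checks the HTTP header first because browsers give the header priority over the meta tag.

```python
import re
from email.message import Message
from typing import Optional

# Simplified pattern (an assumption for this sketch): matches both
# <meta charset=utf-8> and the older http-equiv form with "charset=" in content.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE
)

def header_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None when absent

def meta_charset(html: bytes) -> Optional[str]:
    """Return the charset named in a meta tag within the first 1024 bytes."""
    match = META_CHARSET.search(html[:1024])
    return match.group(1).decode("ascii").lower() if match else None

def declared_charset(content_type: str, html: bytes) -> Optional[str]:
    """Header first, then meta; None means the reader is left guessing."""
    return header_charset(content_type) or meta_charset(html)

page = b"<!doctype html><html lang=en><head><meta charset=utf-8><title>blah</title>"
print(declared_charset("text/html", page))                      # utf-8
print(declared_charset("text/html; charset=ISO-8859-1", page))  # iso-8859-1
print(declared_charset("text/html", b"<p>I'm the content</p>")) # None
```

The last case is the one the comment above warns about: no header charset and no meta tag, so whatever reads the file has to fall back on its own default.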
Anyway, you will always have a stronger case with stakeholders if things break even though you adhered to the standards.