<!-- http://www.brucelawson.co.uk/2010/a-minimal-html5-document/ -->
<!doctype html>
<html lang=en>
<head>
<meta charset=utf-8>
<title>blah</title>
</head>
<body>
<p>I'm the content</p>
</body>
</html>
What about <nav>, <header>, and <footer>?
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>The HTML5 Herald</title>
<meta name="description" content="The HTML5 Herald">
<meta name="author" content="SitePoint">
<link rel="stylesheet" href="css/styles.css?v=1.0">
<!--[if lt IE 9]>
<script src="https://cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv.js"></script>
<![endif]-->
</head>
<body>
<script src="js/scripts.js"></script>
<header></header>
<nav></nav>
<main></main>
<footer></footer>
</body>
</html>
The shortest valid HTML5 that passed http://validator.w3.org:
<!doctype html><meta charset=utf-8><title>shortest html5</title>
To avoid the warning "Consider adding a lang attribute to the html start tag to declare the language of this document.":
<!doctype html><html lang="en"><meta charset=utf-8><title>shortest html5</title>
Why does everyone include <meta charset=utf-8>?
According to validator.w3.org, this is not required - and probably should be defined by a Content-Type header anyway?
@mindplay-dk With "Validate by Direct Input", the validator.w3.org page itself is served with the HTTP header "content-type: text/html; charset=utf-8", so you don't need to declare a charset in the code you type in. Arguably that is a design flaw in the validator.
Try "Validate by File Upload" with a local file without "meta charset" and get "Error: The character encoding was not declared".
Background: https://www.w3.org/TR/html5-diff/#character-encoding: "For the HTML syntax, Web developers are required to declare the character encoding." The encoding can be declared in the HTTP header, with a BOM, or with "meta charset". As a web developer, "meta charset" is often the only one of these you control.
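A minimal sketch of two of those options side by side. The file carries its own "meta charset", and the comment shows a header a server might send with it. The header value is an assumption for illustration; the point is only that both declare the same encoding.

```html
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<!-- Assumed HTTP response header, sent by the server and not part of this file:
     Content-Type: text/html; charset=utf-8 -->
<title>encoding declared in the file</title>
</head>
<body>
<!-- A few non-ASCII characters to make a wrong encoding visible. -->
<p>Non-ASCII test: é ü “quotes”</p>
</body>
</html>
```

If the page is saved as UTF-8 and opened straight from disk, no header is involved, and the "meta charset" line is the declaration that remains.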
Okay, but are we authoring for flat files on somebody's local filesystem, or for uploading to a webserver?
Declaring encoding in the file is dodgy, in my opinion:
For one, the potential conflict with Content-Type creates a wonderful opportunity for confusion - if they don't match, the Content-Type happens to win, which would be very confusing for someone trying to debug an encoding issue... "why the heck doesn't my tag work - it worked locally!"
For another, declaring the charset in this manner doesn't even work if the tag doesn't appear in the first 1024 bytes (1 KB) of the document... which it very probably will, but, you know... just one of those things that could send somebody down a deep rabbit hole when, for no apparent reason, the page goes bonkers because you added 1 KB of JavaScript before that tag.
I mean, if we're talking about a minimal HTML document, it's certainly not required - but doesn't really seem like a good idea either way.
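For what it's worth, the 1 KB pitfall can be sidestepped by ordering. A sketch, assuming a hypothetical inline script of arbitrary size: with the declaration as the first thing inside <head>, nothing added later can push it past the limit.

```html
<!doctype html>
<html lang="en">
<head>
<!-- Declaration first, so it always sits inside the first 1024 bytes. -->
<meta charset="utf-8">
<title>charset first</title>
<!-- Anything bulky goes after the declaration. -->
<script>
/* hypothetical inline script; imagine 1 KB or more of code here */
</script>
</head>
<body>
<p>I'm the content</p>
</body>
</html>
```

This does nothing about a conflicting Content-Type header; it only covers the position problem.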
Modern authoring tools, IDEs, and editors will surely assume UTF-8 by default. But imagine something like Leftpad: something deep inside the dependency graph of your complex tool and module chain, outside your awareness, that breaks everything at the higher levels just because you had no charset declaration. Maybe it was pulled in by another dependency you don't even know you have, and it dates from before the whole world was UTF-8, or for some other reason defaults to ISO-8859-1. (The parallel to the Leftpad disaster is being killed by something you didn't even know about.) It's trivial to add a charset declaration before trouble arises; it may be very time-consuming to find out that a missing charset declaration was the cause. Note that such a dev tool dependency can hit you even if you never need a single non-ASCII character.
Often small companies (maybe your clients) have no control over the outdated server space they bought years ago from someone who bought it from someone else, and so on. You may have to reverse-engineer which HTTP headers are sent, and you cannot change them, so it's quite nice that you are always safe if you have a charset declaration in your code. You may get hit on a server only years later, when a new requirement arises, e.g. a French version, and out of the blue there are strange replacement characters just because of quotation marks that are not in your implicit default charset.
Anyway, you will always have better arguments for stakeholders if things break even though you adhered to standards.
Why does everyone include <meta charset="utf-8">?
UTF-8 is the only valid encoding for HTML5 documents. That means if you have <!DOCTYPE html> at the top of an HTML file, the charset is implied.
if we're talking about a minimal HTML document, it's certainly not required - but doesn't really seem like a good idea either way.
Now imagine serving files with different encodings off the same web server.
Encoding is inalienable from the document itself. You can't usefully change the encoding without changing the text.
Interesting: https://google.github.io/styleguide/htmlcssguide.html#Optional_Tags
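The linked section is about optional tags. A small sketch of what leaving them out can look like (my own example, not copied from the guide): the <head> and <body> tags, the closing </p>, and the closing </html> are all omitted, and the parser infers them. The <html> start tag is kept only to carry the lang attribute, as in the validator-friendly one-liner earlier in the thread.

```html
<!doctype html>
<html lang="en">
<meta charset="utf-8">
<title>optional tags omitted</title>
<p>I'm the content
```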