@yohgaki
Last active June 8, 2017 01:53
hash_hkdf() manual improvement
Index: en/reference/hash/functions/hash-hkdf.xml
===================================================================
--- en/reference/hash/functions/hash-hkdf.xml (revision 342317)
+++ en/reference/hash/functions/hash-hkdf.xml (working copy)
@@ -3,7 +3,7 @@
<refentry xml:id="function.hash-hkdf" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
<refnamediv>
<refname>hash_hkdf</refname>
- <refpurpose>Generate a HKDF key derivation of a supplied key input</refpurpose>
+ <refpurpose>Derive a secure new key from an existing key using HKDF</refpurpose>
</refnamediv>
<refsect1 role="description">
&reftitle.description;
@@ -16,6 +16,43 @@
<methodparam choice="opt"><type>string</type><parameter>salt</parameter><initializer>''</initializer></methodparam>
</methodsynopsis>
+ <para>
+ RFC 5869 defines HKDF (HMAC-based Key Derivation Function), which is
+ a general-purpose KDF. HKDF can be useful for many PHP applications
+ that require temporary keys for authentication, authorization, and
+ encryption, e.g. CSRF tokens, pre-signed URIs, passwords for
+ password-protected URIs, and so on.
+ </para>
+ <para>
+ <function>hash_hkdf</function> can derive a strong (secure) key from
+ a weak key. However, it is designed to compute hash values
+ efficiently. Therefore, it is weaker against brute-force attacks
+ than <function>password_hash</function> or
+ <function>hash_pbkdf2</function>, which are KDFs designed for
+ storing passwords. Unless the weak key (password) and a strong secret
+ salt are stored in another secure system, users should use
+ <function>password_hash</function> or
+ <function>hash_pbkdf2</function> for storing weak
+ key (password) hashes.
+ </para>
+ <note>
+ <para>
+ When <parameter>info</parameter> and <parameter>length</parameter>
+ are not required by your program, the more efficient
+ <function>hash_hmac</function> can be used instead.
+ </para>
+ <para>
+ String concatenation with hashes involves security
+ risks. <literal>hash('sha3-512', 'my_secret'
+ . 'some_salt')</literal> is less secure than
+ <literal>hash_hmac('sha3-512', 'my_secret',
+ 'some_salt')</literal>. For the same reason,
+ <literal>hash_hmac('sha3-512', 'my_secret', 'some_salt'
+ . 'some_info')</literal> is less secure than
+ <literal>bin2hex(hash_hkdf('sha3-512', 'my_secret', 0,
+ 'some_info', 'some_salt'))</literal>.
+ </para>
+ </note>
</refsect1>
<refsect1 role="parameters">
&reftitle.parameters;
@@ -25,7 +62,7 @@
<term><parameter>algo</parameter></term>
<listitem>
<para>
- Name of selected hashing algorithm (i.e. "sha256", "sha512", "haval160,4", etc..)
+ Name of selected hashing algorithm (e.g. "sha3-256", "sha3-512", "sha256", "sha512", "haval160,4", etc.)
See <function>hash_algos</function> for a list of supported algorithms.
<note>
<para>
@@ -39,7 +76,7 @@
<term><parameter>ikm</parameter></term>
<listitem>
<para>
- Input keying material (raw binary). Cannot be empty.
+ Input keying material. Cannot be empty.
</para>
</listitem>
</varlistentry>
@@ -60,7 +97,10 @@
<term><parameter>info</parameter></term>
<listitem>
<para>
- Application/context-specific info string.
+ Application/context-specific info string. Info is intended for
+ public information such as a user ID, protocol version, etc.
+ Info does not affect input key security. Users may specify
+ whatever information is required.
</para>
</listitem>
</varlistentry>
@@ -68,11 +108,46 @@
<term><parameter>salt</parameter></term>
<listitem>
<para>
- Salt to use during derivation.
+ Salt to use during derivation. The optimal salt size equals the
+ output size of the hash function used, e.g. 32 bytes for SHA3-256.
</para>
<para>
- While optional, adding random salt significantly improves the strength of HKDF.
+ Unlike <parameter>info</parameter>, which is not supposed to be
+ a security-related parameter, <parameter>salt</parameter> is
+ directly related to the security of <parameter>ikm</parameter> and
+ the derived key. If <parameter>salt</parameter> is omitted, a string
+ of null bytes as long as the chosen hash function's output is used.
+ Therefore, an empty salt adds no protection for the input key material.
</para>
+ <para>
+ While optional, adding a random salt significantly improves the
+ strength of HKDF. The salt can be either secret or
+ non-secret. It is used as a "pre-shared key" in many use cases.
+ A strong value is preferred, e.g. one from <function>random_bytes</function>.
+ The optimal salt size is the output size of the hash algorithm used.
+ </para>
+ <warning>
+ <para>
+ Although salt is the last optional parameter, it is the
+ most important parameter for key security. An omitted salt
+ indicates inappropriate design in most cases. Users must
+ set an appropriate salt value whenever possible. Omit the salt
+ only when it cannot be used.
+ </para>
+ <para>
+ When the input key is weak, a strong salt is mandatory and must
+ be kept secret; otherwise the input key cannot be kept secure.
+ When the input key is strong, a low-entropy salt is acceptable.
+ However, providing a strong salt is the best practice for the
+ best possible key security. A strong salt is strongly recommended
+ for long-lived input keys.
+ </para>
+ <para>
+ Users must not be able to control the salt, i.e. a user
+ must not be able to set the salt value and obtain the derived key.
+ A user-controlled salt allows attackers to analyze the input key.
+ </para>
+ </warning>
</listitem>
</varlistentry>
</variablelist>
@@ -101,6 +176,100 @@
&reftitle.examples;
<para>
<example>
+ <title>URI-specific CSRF token with expiration using <function>hash_hkdf</function></title>
+ <programlisting role="php">
+<![CDATA[
+<?php
+define('CSRF_TOKEN_EXPIRE', 180); // Token counter rotation interval in seconds
+define('CSRF_TOKENS', 5); // The last 5 CSRF tokens are valid
+
+/**************************************
+ * Implementation note
+ *
+ * It uses a "counter" for CSRF expiration management.
+ * The "counter" has very low entropy, but the input key is strong and
+ * CSRF_TOKEN_SEED is a short-term key, so this should be acceptable.
+ *
+ * This CSRF token implementation has pros and cons.
+ *
+ * Pros
+ * - A CSRF token is valid only for a specific URI.
+ * - No database is required for URI-specific CSRF tokens.
+ * - Only the CSRF token is required, i.e. no timestamp parameter.
+ * - While the user is active, a CSRF token is valid for up to CSRF_TOKEN_EXPIRE * CSRF_TOKENS sec.
+ * - Even after the user has been idle for a long time, the CSRF token is still valid.
+ * - A CSRF token will expire eventually.
+ * - All active CSRF tokens can be invalidated by unset($_SESSION['CSRF_TOKEN_SEED']).
+ * It is recommended to reset CSRF tokens at least on login/logout events.
+ * It may be a good idea to invalidate all older CSRF tokens after a long idle time.
+ *
+ * Cons
+ * - A token has no precise expiration time; the counter advances only when a token is generated.
+ *
+ * Precise CSRF token expiration is easy: just add a timestamp parameter
+ * as "info" and check it.
+ **************************************/
+
+session_start();
+if (empty($_SESSION['CSRF_TOKEN_SEED'])) {
+ $_SESSION['CSRF_TOKEN_SEED'] = random_bytes(32);
+ $_SESSION['CSRF_TOKEN_SALT'] = random_bytes(32);
+ $_SESSION['CSRF_TOKEN_COUNT'] = 1;
+ $_SESSION['CSRF_TOKEN_EXPIRE'] = time();
+}
+
+
+function csrf_get_token($uri) {
+ // Check expiration
+ if ($_SESSION['CSRF_TOKEN_EXPIRE'] + CSRF_TOKEN_EXPIRE < time()) {
+ $_SESSION['CSRF_TOKEN_COUNT']++;
+ $_SESSION['CSRF_TOKEN_EXPIRE'] = time();
+ }
+ // Roughly equivalent (NOT exactly the same) value using hash_hmac()
+ // return hash_hmac('sha3-256', hash_hmac('sha3-256', $_SESSION['CSRF_TOKEN_SEED'], $_SESSION['CSRF_TOKEN_SALT'] . $_SESSION['CSRF_TOKEN_COUNT']), $uri);
+ return bin2hex(hash_hkdf('sha3-256', $_SESSION['CSRF_TOKEN_SEED'], 0, $uri, $_SESSION['CSRF_TOKEN_SALT'] . $_SESSION['CSRF_TOKEN_COUNT']));
+}
+
+function csrf_validate_token($csrf_token, $uri) {
+ for ($i = 0; $i < CSRF_TOKENS; $i++) {
+ // Roughly equivalent (NOT exactly the same) value using hash_hmac()
+ // $token = hash_hmac('sha3-256', hash_hmac('sha3-256', $_SESSION['CSRF_TOKEN_SEED'], $_SESSION['CSRF_TOKEN_SALT'] . ($_SESSION['CSRF_TOKEN_COUNT'] - $i)), $uri);
+ $token = bin2hex(hash_hkdf('sha3-256', $_SESSION['CSRF_TOKEN_SEED'], 0, $uri, $_SESSION['CSRF_TOKEN_SALT'] . ($_SESSION['CSRF_TOKEN_COUNT'] - $i)));
+ if (hash_equals($token, $csrf_token)) { // known string first, user-supplied string second
+ return TRUE;
+ }
+ }
+ return FALSE;
+}
+
+
+//// Generating CSRF token ////
+// $uri is the target URI path the browser POSTs form data to; it must match $_SERVER['REQUEST_URI'] at validation
+$uri = '/some_form/';
+$csrf_token = csrf_get_token($uri);
+// Embed $csrf_token in your form
+
+//// Validating CSRF token ////
+$csrf_token = $_POST['csrf_token'] ?? '';
+if (!csrf_validate_token($csrf_token, $_SERVER['REQUEST_URI'])) {
+ // Invalid CSRF token
+ throw new Exception('CSRF token validation error');
+}
+// valid request
+?>
+]]>
+ </programlisting>
+ <para>
+ A common CSRF token implementation uses the same token value for a
+ whole session and all URIs. The CSRF token in this example expires and
+ is specific to a URI, i.e. a CSRF token for http://example.com/form_A/
+ is not valid for http://example.com/form_B/. Since the token value is
+ computed, no database is required.
+ </para>
+ </example>
+ </para>
+ <para>
+ <example>
<title><function>hash_hkdf</function> example</title>
<programlisting role="php">
<![CDATA[
@@ -124,6 +293,57 @@
</para>
</example>
</para>
+ <para>
+ <example>
+ <title><function>hash_hkdf</function> bad example</title>
+ <para>
+ Users must not simply extend the length of input key material. HKDF
+ does not add entropy automatically. Therefore, a weak key
+ remains weak unless a strong salt is supplied. The following is a bad
+ example.
+ </para>
+ <programlisting role="php">
+<![CDATA[
+<?php
+$inputKey = get_my_aes128_key(); // AES 128 bit key
+
+// Derive an AES-256 key from an AES-128 key
+$encryptionKey = hash_hkdf('sha256', $inputKey, 32, 'aes-256-encryption');
+// Users should not do this. $encryptionKey has only 128 bits of
+// entropy while it should have 256 bits.
+// To derive a strong AES-256 key, a strong enough salt is required.
+
+// Good example
+// $salt = random_bytes(32); // Keep this somewhere. A secret salt is preferred if possible.
+// $encryptionKey = hash_hkdf('sha256', $inputKey, 32, 'aes-256-encryption', $salt);
+?>
+]]>
+ </programlisting>
+ <para>
+ HKDF does not automatically secure the input key or the derived key.
+ Users are responsible for key security and must supply proper
+ salt values.
+ </para>
+ <programlisting role="php">
+<![CDATA[
+<?php
+// Derive a key from a user-entered password
+$derivedKey = hash_hkdf('sha3-256', $password);
+// Users should not do this. The above is almost the same as
+$derivedKey = hash_hmac('sha3-256', $password, '');
+$derivedKey = hash('sha3-256', $password);
+// Attackers can perform a simple brute-force attack.
+
+// Good example
+// A user-entered password is usually extremely weak. A secret salt
+// is mandatory.
+// $salt = random_bytes(32); // Keep this somewhere, and keep it secret
+// $derivedKey = hash_hkdf('sha3-256', $password, 0, '', $salt);
+?>
+]]>
+ </programlisting>
+ </example>
+ </para>
</refsect1>
<refsect1 role="seealso">
@@ -130,6 +350,7 @@
&reftitle.seealso;
<para>
<simplelist>
+ <member><function>hash_hmac</function></member>
<member><function>hash_pbkdf2</function></member>
<member><link xlink:href="&url.rfc;5869">RFC 5869</link></member>
<member><link xlink:href="&url.git.hub;narfbg/hash_hkdf_compat">userland implementation</link></member>