@lampeh
Created October 16, 2012 00:13
Le perroquet: push updated data directly into varnish

Varnish key/value store

From "Smart Pre-Fetching: Varnish @ Yakaz.com by Pierre-Gilles Mialon, Yakaz.com"

Use varnish as a memcached-like key/value store with an HTTP interface. Push data updates directly into the cache instead of buffering them outside varnish and pulling them through a backend.

  • no backend! the content is echoed by another varnish thread through vcl_error()
  • the content lives only in the varnish cache until it expires or the cache is cleared
  • gzip compression is handled by varnish. add "FC-Content-Encoding: gzip" to the request for client-side compression
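
A minimal sketch of the client-side compression mentioned in the last point, assuming GNU gzip and base64 are installed. The helper name fc_encode_gzip is hypothetical and not part of the original scripts; it only prepares the value for the Forcecontent header. The request would additionally carry "FC-Content-Encoding: gzip", and since the decoded payload is binary, the VCL would need one of the binary-safe synthetic variants described under PoC Limits.

```shell
#!/bin/bash
# fc_encode_gzip (hypothetical helper): read stdin, gzip it, and emit a
# single base64 line suitable for the Forcecontent request header.
# -n omits the file name and timestamp, so equal input gives equal output.
fc_encode_gzip() {
        gzip -9 -n -c | base64 -w0
}

# Example: prepare a compressed value and show its encoded length.
value="$(printf 'Updated at %s\n' "$(date)" | fc_encode_gzip)"
echo "encoded length: ${#value} bytes"
```

For very short values the gzip header overhead can make the encoded result longer than the plain base64 form, so compressing on the client is mainly worthwhile for larger text payloads.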

PoC Limits:

  • default usable data size: ~5kB before base64-encoding, after optional client-side gzip
  • default total request header limit: 8kB (run-time parameter http_req_hdr_len). exceeding the limit results in "413 Request Entity Too Large"
  • default total request size limit: 32kB (run-time parameter http_req_size). exceeding the limit results in a connection reset: "11 SessionClose c blast"
  • default session workspace limit: 64kB (run-time parameter sess_workspace). should be several times larger than the base64 data
  • binary data requires either digest.synthetic_base64_decode (a frankenfunction built from base64_decode and null.synth) or libvmod-null and a FC-Data-Length header
  • the request must not be "pass"ed in VCL or varnish won't cache the response
  • cache misses without Forcecontent header will be sent to the default backend unless handled elsewhere in your VCL
  • request body is not passed to the backend through a "miss", therefore special request headers must transport the base64-encoded content
  • request header size is always limited. future VCL body access could make it work with even larger objects regardless of http_req_(size|hdr_len).
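
The size limits above can be checked before sending a request. The following is a rough sketch, assuming the 8kB http_req_hdr_len default applies to the header that carries the payload; the function names fc_b64_len and fc_fits and the variable FC_HDR_LIMIT are hypothetical. Base64 turns every 3 raw bytes into 4 characters, so 5120 raw bytes become 6828 characters.

```shell
#!/bin/bash
# Assumed limit, taken from the list above (http_req_hdr_len default).
FC_HDR_LIMIT=8192

# fc_b64_len BYTES: length of the base64 encoding of BYTES raw bytes,
# including padding: 4 * ceil(BYTES / 3).
fc_b64_len() {
        echo $(( ( ($1 + 2) / 3 ) * 4 ))
}

# fc_fits BYTES: succeed if "Forcecontent: <base64>" stays below the limit.
fc_fits() {
        local name="Forcecontent: "
        [ $(( ${#name} + $(fc_b64_len "$1") )) -lt "$FC_HDR_LIMIT" ]
}

fc_b64_len 5120                       # prints 6828
fc_fits 5120 && echo "5120 bytes fit"
fc_fits 6144 || echo "6144 bytes are too large"
```

The remaining headers (FC-Auth, FC-TS, FC-Content-Type and whatever the client adds) also count toward http_req_size, which is one reason to stay well below the computed maximum.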

Setup

VCL

import std;
import digest;

## point to local varnish instance
backend localvarnish {
        .host = "127.0.0.1";
        .port = "80";
}

sub vcl_recv {
        if (req.http.Forcecontent) {
                ## check signature. use hmac_sha256 instead of hash_sha256 if possible
                ## TODO: make FC-Content-Type and FC-Cache-Control optional, include in signature if present
                if (req.http.FC-Auth && req.http.FC-TS && req.http.FC-Content-Type &&
                        (now - std.duration(req.http.FC-TS + "s", 0s)) < 300s &&
                        req.http.FC-Auth == digest.hash_sha256(req.http.FC-TS + req.http.Forcecontent + "s3cretf00" + req.http.FC-Content-Type)) {
                                if (req.http.FC-Echo == "1") {
                                        ## echo content
                                        error 623;
                                } else {
                                        ## invoke varnish parrot
                                        set req.http.FC-Echo = "1";
                                        set req.hash_ignore_busy = true;
                                        set req.hash_always_miss = true;
                                        set req.backend = localvarnish;
                                        return(lookup);
                                }
                } else {
                        error 403 "Unauthorized";
                }
        }
}

sub vcl_fetch {
        if (req.http.Forcecontent) {
                ## gzip the response before caching it
                ## TODO: maybe blacklist (image|video)/ instead and/or
                ## compress only small objects?
                if (beresp.http.Content-Type ~ "^(text|application)/") {
                        set beresp.do_gzip = true;
                }
        }
}

sub vcl_error {
        ## the varnish parrot
        ## echo content passed in Forcecontent
        if (obj.status == 623) {
                set obj.status = 200;
                set obj.response = "Ok";

                ## set response Content-Type
                if (req.http.FC-Content-Type) {
                        set obj.http.Content-Type = req.http.FC-Content-Type;
                } else {
                        set obj.http.Content-Type = "application/octet-stream";
                }

                ## set response Content-Encoding
                if (req.http.FC-Content-Encoding) {
                        set obj.http.Content-Encoding = req.http.FC-Content-Encoding;
                }

                ## set response Cache-Control
                if (req.http.FC-Cache-Control) {
                        set obj.http.Cache-Control = req.http.FC-Cache-Control;
                } else {
                        ## v-maxage needs VCL support to set beresp.ttl
                        set obj.http.Cache-Control = "v-maxage=4294967295, max-age=300";
                }

                if (req.http.FC-Base64) {
                        ## synthetic() expects a null-terminated string!
                        synthetic("" + digest.base64_decode(req.http.Forcecontent));
                        ## for binary data, use synthetic_base64_decode:
                        #digest.synthetic_base64_decode("" + req.http.Forcecontent);
                        ## or use libvmod-null and pass the decoded length in a request header
                        #null.synth(digest.base64_decode(req.http.Forcecontent), req.http.FC-Data-Length);
                } else {
                        synthetic("" + req.http.Forcecontent);
                }

                return(deliver);
        }
}

Update key/value content

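#!/bin/bash
## usage: fc.sh <url> <content-type> < data

#set -x

key="$1"
valuetype="$2"
## read the value from stdin, base64-encoded on a single line
value="$(base64 -w0)"

ts="$(date "+%s")"

## signature must match the VCL: sha256(ts + value + secret + content-type)
auth="$(printf '%s' "${ts}${value}s3cretf00${valuetype}" | sha256sum | awk '{ print $1 }')"

## -I sends a HEAD request; the response body is not needed
curl -H "FC-Auth: ${auth}" \
     -H "FC-TS: ${ts}" \
     -H "FC-Base64: 1" \
     -H "FC-Content-Type: ${valuetype}" \
     -H "Forcecontent: ${value}" \
     -I "${key}"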
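
The VCL comment recommends hmac_sha256 over a plain hash. A possible client side for that variant is sketched below, assuming openssl is installed. The function name fc_sign_hmac is hypothetical, and the message layout (timestamp, value and content type concatenated, with the secret used only as the HMAC key) is an assumption that the original VCL does not define. The VCL check would have to be changed to compute the same HMAC with libvmod-digest, and the exact output format of that vmod's HMAC functions (for example a possible "0x" prefix) should be verified before comparing strings.

```shell
#!/bin/bash
# fc_sign_hmac SECRET TS VALUE TYPE (hypothetical helper)
# Prints the hex HMAC-SHA256 of TS+VALUE+TYPE, keyed with SECRET.
# The message layout is an assumption; the server must use the same one.
fc_sign_hmac() {
        local secret="$1" ts="$2" value="$3" valuetype="$4"
        # openssl prints "<label>= <hex>"; keep only the last field.
        printf '%s' "${ts}${value}${valuetype}" \
                | openssl dgst -sha256 -hmac "${secret}" \
                | awk '{ print $NF }'
}

# Example: build the FC-Auth header for a small text value.
ts="$(date +%s)"
value="$(printf 'hello' | base64 -w0)"
echo "FC-Auth: $(fc_sign_hmac 's3cretf00' "${ts}" "${value}" 'text/plain')"
```

Unlike the concatenated hash, an HMAC keeps the secret out of the hashed message itself, which avoids length-extension style weaknesses of plain sha256(secret + data) constructions.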

Usage

Update key

## text works well
echo "Updated at `date`" | ./fc.sh http://parrot.example.com/foo text/plain

## even binary content
convert bigpicture.jpg -resize 32x32 jpg:- | ./fc.sh http://parrot.example.com/bar image/jpeg

Fetch cached value

curl -i http://parrot.example.com/foo

pmialon commented Jan 4, 2013

Hi,

I have made some modifications in my fork: now, when an object is updated in the cache, we ensure that the old version is removed from the cache storage within the next 4 minutes.
