Skip to content

Instantly share code, notes, and snippets.

@peterjaap
Last active April 3, 2025 19:34
Show Gist options
  • Save peterjaap/006169c5d95eeffde3a1cc062de1b514 to your computer and use it in GitHub Desktop.
Save peterjaap/006169c5d95eeffde3a1cc062de1b514 to your computer and use it in GitHub Desktop.
Updated Magento 2 Varnish 6 VCL, in cooperation with Varnish Software, see Xkey soft purge version here; https://gist.github.com/peterjaap/7f7bf11aa7d089792e8fcc2fb34760fa
# A number of these changes come form the following PR's; , combines changes in https://github.com/magento/magento2/pull/29360, https://github.com/magento/magento2/pull/28944 and https://github.com/magento/magento2/pull/28894, https://github.com/magento/magento2/pull/35228, https://github.com/magento/magento2/pull/36524, https://github.com/magento/magento2/pull/34323
# VCL version 5.0 is not supported so it should be 4.0 even though actually used Varnish version is 6
# See the Xkey version here: https://gist.github.com/peterjaap/7f7bf11aa7d089792e8fcc2fb34760fa
vcl 4.1;
import std;
# The minimal Varnish version is 6.0
# For SSL offloading, pass the following header in your proxy server or load balancer: '/* {{ ssl_offloaded_header }} */: https'
backend default {
.host = "/* {{ host }} */";
.port = "/* {{ port }} */";
.first_byte_timeout = 600s;
# .probe = {
# .url = "/health_check.php";
# .timeout = 2s;
# .interval = 5s;
# .window = 10;
# .threshold = 5;
# }
}
acl purge {
/* {{ ips }} */
}
sub vcl_recv {
# Add support for Prismic preview functionality
if (req.http.Cookie ~ "io.prismic.preview") {
return (pass);
}
# Remove empty query string parameters
# e.g.: www.example.com/index.html?
if (req.url ~ "\?$") {
set req.url = regsub(req.url, "\?$", "");
}
# Remove port number from host header
set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
# Sorts query string parameters alphabetically for cache normalization purposes
set req.url = std.querysort(req.url);
# Remove the proxy header to mitigate the httpoxy vulnerability
# See https://httpoxy.org/
unset req.http.proxy;
# Add X-Forwarded-Proto header when using https
if (!req.http.X-Forwarded-Proto && (std.port(server.ip) == 443 || std.port(server.ip) == 8443)) {
set req.http.X-Forwarded-Proto = "https";
}
# Reduce grace to the configured setting if the backend is healthy
# In case of an unhealthy backend, the original grace is used
if (std.healthy(req.backend_hint)) {
set req.grace = /* {{ grace_period }} */s;
}
# Purge logic to remove objects from the cache
# Tailored to Magento's cache invalidation mechanism
# The X-Magento-Tags-Pattern value is matched to the tags in the X-Magento-Tags header
# If X-Magento-Tags-Pattern is not set, a URL-based purge is executed
if (req.method == "PURGE") {
if (client.ip !~ purge) {
return (synth(405, "Method not allowed"));
}
if (!req.http.X-Magento-Tags-Pattern) {
return (purge);
}
ban("obj.http.X-Magento-Tags ~ " + req.http.X-Magento-Tags-Pattern);
return (synth(200, "Purged"));
}
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "PUT" &&
req.method != "POST" &&
req.method != "PATCH" &&
req.method != "TRACE" &&
req.method != "OPTIONS" &&
req.method != "DELETE") {
return (pipe);
}
# We only deal with GET and HEAD by default
if (req.method != "GET" && req.method != "HEAD") {
return (pass);
}
# Bypass health check requests
if (req.url ~ "^/(pub/)?(health_check.php)$") {
return (pass);
}
# Collapse multiple cookie headers into one
std.collect(req.http.Cookie, ";");
# Remove all marketing get parameters to minimize the cache objects
if (req.url ~ "(\?|&)(_branch_match_id|srsltid|_bta_c|_bta_tid|_ga|_gl|_ke|_kx|campid|cof|customid|cx|dclid|dm_i|ef_id|epik|fbclid|gad_source|gbraid|gclid|gclsrc|gdffi|gdfms|gdftrk|hsa_acc|hsa_ad|hsa_cam|hsa_grp|hsa_kw|hsa_mt|hsa_net|hsa_src|hsa_tgt|hsa_ver|ie|igshid|irclickid|matomo_campaign|matomo_cid|matomo_content|matomo_group|matomo_keyword|matomo_medium|matomo_placement|matomo_source|mc_cid|mc_eid|mkcid|mkevt|mkrid|mkwid|msclkid|mtm_campaign|mtm_cid|mtm_content|mtm_group|mtm_keyword|mtm_medium|mtm_placement|mtm_source|nb_klid|ndclid|origin|pcrid|piwik_campaign|piwik_keyword|piwik_kwd|pk_campaign|pk_keyword|pk_kwd|redirect_log_mongo_id|redirect_mongo_id|rtid|sb_referer_host|ScCid|si|siteurl|s_kwcid|sms_click|sms_source|sms_uph|toolid|trk_contact|trk_module|trk_msg|trk_sid|ttclid|twclid|utm_campaign|utm_content|utm_creative_format|utm_id|utm_marketing_tactic|utm_medium|utm_source|utm_source_platform|utm_term|wbraid|yclid|zanpid|mc_[a-z]+|utm_[a-z]+|_bta_[a-z]+)=") {
set req.url = regsuball(req.url, "(_branch_match_id|srsltid|_bta_c|_bta_tid|_ga|_gl|_ke|_kx|campid|cof|customid|cx|dclid|dm_i|ef_id|epik|fbclid|gad_source|gbraid|gclid|gclsrc|gdffi|gdfms|gdftrk|hsa_acc|hsa_ad|hsa_cam|hsa_grp|hsa_kw|hsa_mt|hsa_net|hsa_src|hsa_tgt|hsa_ver|ie|igshid|irclickid|matomo_campaign|matomo_cid|matomo_content|matomo_group|matomo_keyword|matomo_medium|matomo_placement|matomo_source|mc_cid|mc_eid|mkcid|mkevt|mkrid|mkwid|msclkid|mtm_campaign|mtm_cid|mtm_content|mtm_group|mtm_keyword|mtm_medium|mtm_placement|mtm_source|nb_klid|ndclid|origin|pcrid|piwik_campaign|piwik_keyword|piwik_kwd|pk_campaign|pk_keyword|pk_kwd|redirect_log_mongo_id|redirect_mongo_id|rtid|sb_referer_host|ScCid|si|siteurl|s_kwcid|sms_click|sms_source|sms_uph|toolid|trk_contact|trk_module|trk_msg|trk_sid|ttclid|twclid|utm_campaign|utm_content|utm_creative_format|utm_id|utm_marketing_tactic|utm_medium|utm_source|utm_source_platform|utm_term|wbraid|yclid|zanpid|mc_[a-z]+|utm_[a-z]+|_bta_[a-z]+)=[-_A-z0-9+(){}%.]+&?", "");
set req.url = regsub(req.url, "[?|&]+$", "");
}
# Static files caching
if (req.url ~ "^/(pub/)?(media|static)/") {
# Static files should not be cached by default
return (pass);
# But if you use a few locales and don't use CDN you can enable caching static files by commenting previous line (#return (pass);) and uncommenting next 3 lines
#unset req.http.Https;
#unset req.http./* {{ ssl_offloaded_header }} */;
#unset req.http.Cookie;
}
# Don't cache the authenticated GraphQL requests
if (req.url ~ "/graphql" && req.http.Authorization ~ "^Bearer") {
return (pass);
}
return (hash);
}
sub vcl_hash {
if (req.url !~ "/graphql" && req.http.cookie ~ "X-Magento-Vary=") {
hash_data(regsub(req.http.cookie, "^.*?X-Magento-Vary=([^;]+);*.*$", "\1"));
}
# To make sure http users don't see ssl warning
hash_data(req.http./* {{ ssl_offloaded_header }} */);
/* {{ design_exceptions_code }} */
if (req.url ~ "/graphql") {
if (req.http.X-Magento-Cache-Id) {
hash_data(req.http.X-Magento-Cache-Id);
} else {
# if no X-Magento-Cache-Id (which already contains Store & Currency) is not set, use the HTTP headers
hash_data(req.http.Store);
hash_data(req.http.Content-Currency);
}
}
}
sub vcl_backend_response {
# Serve stale content for three days after object expiration
# Perform asynchronous revalidation while stale content is served
set beresp.grace = 3d;
# All text-based content can be parsed as ESI
if (beresp.http.content-type ~ "text") {
set beresp.do_esi = true;
}
# Allow GZIP compression on all JavaScript files and all text-based content
if (bereq.url ~ "\.js$" || beresp.http.content-type ~ "text") {
set beresp.do_gzip = true;
}
# Add debug headers
if (beresp.http.X-Magento-Debug) {
set beresp.http.X-Magento-Cache-Control = beresp.http.Cache-Control;
}
# Only cache HTTP 200 and HTTP 404 responses
if (beresp.status != 200 && beresp.status != 404) {
set beresp.ttl = 120s;
set beresp.uncacheable = true;
return (deliver);
}
# Don't cache if the request cache ID doesn't match the response cache ID for graphql requests
if (bereq.url ~ "/graphql" && bereq.http.X-Magento-Cache-Id && bereq.http.X-Magento-Cache-Id != beresp.http.X-Magento-Cache-Id) {
set beresp.ttl = 120s;
set beresp.uncacheable = true;
return (deliver);
}
# Remove the Set-Cookie header for cacheable content
# Only for HTTP GET & HTTP HEAD requests
if (beresp.ttl > 0s && (bereq.method == "GET" || bereq.method == "HEAD")) {
unset beresp.http.Set-Cookie;
}
}
sub vcl_deliver {
if (obj.uncacheable) {
set resp.http.X-Magento-Cache-Debug = "UNCACHEABLE";
} else if (obj.hits) {
set resp.http.X-Magento-Cache-Debug = "HIT";
set resp.http.Grace = req.http.grace;
} else {
set resp.http.X-Magento-Cache-Debug = "MISS";
}
# Not letting browser to cache non-static files.
if (resp.http.Cache-Control !~ "private" && req.url !~ "^/(pub/)?(media|static)/") {
set resp.http.Pragma = "no-cache";
set resp.http.Expires = "-1";
set resp.http.Cache-Control = "no-store, no-cache, must-revalidate, max-age=0";
}
if (!resp.http.X-Magento-Debug) {
unset resp.http.Age;
}
unset resp.http.X-Magento-Debug;
unset resp.http.X-Magento-Tags;
unset resp.http.X-Powered-By;
unset resp.http.Server;
unset resp.http.X-Varnish;
unset resp.http.Via;
unset resp.http.Link;
}
@boldbart
Copy link

boldbart commented Jan 7, 2025

Hello @peterjaap we had an issue where the _gl marketing param was not removed. This is due to a asterisk immediately after a dot, like:

_gl=1*c8cxdj*_up*MQ..*_ga*NjM2NzIxOTMuMTczNjI1MDMxMw..*_ga_Q8J1SS3NWQ*MTczNjI1MDMxMi4xLjAuMTczNjI1MDMxMi4wLjAuMTc0OTczMDA1MA..

The fix is to add an asterisk on line 104 near the end.

Before:

[-_A-z0-9+(){}%.]

After:

[-_A-z0-9+(){}%.*]

Or for easy copy/pasting:

set req.url = regsuball(req.url, "(_branch_match_id|srsltid|_bta_c|_bta_tid|_ga|_gl|_ke|_kx|campid|cof|customid|cx|dclid|dm_i|ef_id|epik|fbclid|gad_source|gbraid|gclid|gclsrc|gdffi|gdfms|gdftrk|hsa_acc|hsa_ad|hsa_cam|hsa_grp|hsa_kw|hsa_mt|hsa_net|hsa_src|hsa_tgt|hsa_ver|ie|igshid|irclickid|matomo_campaign|matomo_cid|matomo_content|matomo_group|matomo_keyword|matomo_medium|matomo_placement|matomo_source|mc_cid|mc_eid|mkcid|mkevt|mkrid|mkwid|msclkid|mtm_campaign|mtm_cid|mtm_content|mtm_group|mtm_keyword|mtm_medium|mtm_placement|mtm_source|nb_klid|ndclid|origin|pcrid|piwik_campaign|piwik_keyword|piwik_kwd|pk_campaign|pk_keyword|pk_kwd|redirect_log_mongo_id|redirect_mongo_id|rtid|sb_referer_host|ScCid|si|siteurl|s_kwcid|sms_click|sms_source|sms_uph|toolid|trk_contact|trk_module|trk_msg|trk_sid|ttclid|twclid|utm_campaign|utm_content|utm_creative_format|utm_id|utm_marketing_tactic|utm_medium|utm_source|utm_source_platform|utm_term|wbraid|yclid|zanpid|mc_[a-z]+|utm_[a-z]+|_bta_[a-z]+)=[-_A-z0-9+(){}%.*]+&?", "");

@albertobraschi
Copy link

@ThijsFeryn
Copy link

how about brotli?

https://gitlab.com/uplex/varnish/libvfp-brotli

@albertobraschi it's tricky to rely on features that aren't bundled in the Varnish core. This Brotli VFP filter + VMOD needs to be installed separately.

VCL cannot load modules conditionally, and it would trigger VCL compilation errors for users who don't have the Brotli module/filter installed on their system. This would cause more problems than it would solve, unfortunately.

Feel free to include this logic into your own VCL template though.

@albertobraschi
Copy link

how about brotli?
https://gitlab.com/uplex/varnish/libvfp-brotli

@albertobraschi it's tricky to rely on features that aren't bundled in the Varnish core. This Brotli VFP filter + VMOD needs to be installed separately.

VCL cannot load modules conditionally, and it would trigger VCL compilation errors for users who don't have the Brotli module/filter installed on their system. This would cause more problems than it would solve, unfortunately.

Feel free to include this logic into your own VCL template though.

exactly!
this module, brotli, from the link is used
it works very well!
however, you need to install varnish from source
the performance improves absurdly!!!
i use for magento kkkk

@ThijsFeryn
Copy link

ThijsFeryn commented Jan 9, 2025

@peterjaap based on the feed back from @boldbart & @erikhansen, I came up with the following updated VCL snippet to strip marketing parameters:

    # Remove tracking query string parameters used by analytics tools
    if (req.url ~ "(\?|&)(_branch_match_id|_bta_[a-z]+|_bta_c|_bta_tid|_ga|_gl|_ke|_kx|campid|cof|customid|cx|dclid|dm_i|ef_id|epik|fbclid|gad_source|gbraid|gclid|gclsrc|gdffi|gdfms|gdftrk|hsa_acc|hsa_ad|hsa_cam|hsa_grp|hsa_kw|hsa_mt|hsa_net|hsa_src|hsa_tgt|hsa_ver|ie|igshid|irclickid|matomo_campaign|matomo_cid|matomo_content|matomo_group|matomo_keyword|matomo_medium|matomo_placement|matomo_source|mc_[a-z]+|mc_cid|mc_eid|mkcid|mkevt|mkrid|mkwid|msclkid|mtm_campaign|mtm_cid|mtm_content|mtm_group|mtm_keyword|mtm_medium|mtm_placement|mtm_source|nb_klid|ndclid|origin|pcrid|piwik_campaign|piwik_keyword|piwik_kwd|pk_campaign|pk_keyword|pk_kwd|redirect_log_mongo_id|redirect_mongo_id|rtid|s_kwcid|sb_referer_host|sccid|si|siteurl|sms_click|sms_source|sms_uph|srsltid|toolid|trk_contact|trk_module|trk_msg|trk_sid|ttclid|twclid|utm_[a-z]+|utm_campaign|utm_content|utm_creative_format|utm_id|utm_marketing_tactic|utm_medium|utm_source|utm_source_platform|utm_term|vmcid|wbraid|yclid|zanpid)=") {
        set req.url = regsuball(req.url, "(_branch_match_id|_bta_[a-z]+|_bta_c|_bta_tid|_ga|_gl|_ke|_kx|campid|cof|customid|cx|dclid|dm_i|ef_id|epik|fbclid|gad_source|gbraid|gclid|gclsrc|gdffi|gdfms|gdftrk|hsa_acc|hsa_ad|hsa_cam|hsa_grp|hsa_kw|hsa_mt|hsa_net|hsa_src|hsa_tgt|hsa_ver|ie|igshid|irclickid|matomo_campaign|matomo_cid|matomo_content|matomo_group|matomo_keyword|matomo_medium|matomo_placement|matomo_source|mc_[a-z]+|mc_cid|mc_eid|mkcid|mkevt|mkrid|mkwid|msclkid|mtm_campaign|mtm_cid|mtm_content|mtm_group|mtm_keyword|mtm_medium|mtm_placement|mtm_source|nb_klid|ndclid|origin|pcrid|piwik_campaign|piwik_keyword|piwik_kwd|pk_campaign|pk_keyword|pk_kwd|redirect_log_mongo_id|redirect_mongo_id|rtid|s_kwcid|sb_referer_host|sccid|si|siteurl|sms_click|sms_source|sms_uph|srsltid|toolid|trk_contact|trk_module|trk_msg|trk_sid|ttclid|twclid|utm_[a-z]+|utm_campaign|utm_content|utm_creative_format|utm_id|utm_marketing_tactic|utm_medium|utm_source|utm_source_platform|utm_term|vmcid|wbraid|yclid|zanpid)=[-_A-z0-9+(){}%.*]+&?", "");
        set req.url = regsub(req.url, "[?|&]+$", "");
    }

I also sorted the values alphabetically to improve readability.

@boldbart @erikhansen please test it and let me know if there's issues with this snippet.

@erikhansen
Copy link

@ThijsFeryn First, I updated my comment above to reflect the correct query parameter of srsltid (thanks to @mpchadwick catching my mistake). Can you updated your comment above to change stsltid to srsltid so my mistake doesn't persist?

Second, I compared your suggested VCL string to the CSV that @mpchadwick maintains and found these differences:

Parameters only in your suggested VCL string (@mpchadwick I don't have time to research and submit these as a PR to your CSV, but next time you update that CSV, you might want to look at these):

  • ie
  • nb_klid
  • origin
  • siteurl
  • zanpid

Parameters only in the CSV (@ThijsFeryn I would recommend adding this to your comment above):

@ThijsFeryn
Copy link

@ThijsFeryn First, I updated my comment above to reflect the correct query parameter of srsltid (thanks to @mpchadwick catching my mistake). Can you updated your comment above to change stsltid to srsltid so my mistake doesn't persist?

Second, I compared your suggested VCL string to the CSV that @mpchadwick maintains and found these differences:

Parameters only in your suggested VCL string (@mpchadwick I don't have time to research and submit these as a PR to your CSV, but next time you update that CSV, you might want to look at these):

  • ie
  • nb_klid
  • origin
  • siteurl
  • zanpid

Parameters only in the CSV (@ThijsFeryn I would recommend adding this to your comment above):

@erikhansen I removed stsltid and added vmcid.

I'll keep the other ones you mentioned (ie, nb_klid, origin, siteurl & zanpid), but I'll consider removing them if it turns out they're not useful.

FYI: I publish the optimized VCL file on our developer portal. The Magento tutorial lives here.

We also maintain a VCL template for Magento that leverages some of our commercial features. You can find the relevant part of the VCL file here.

@ThijsFeryn
Copy link

@peterjaap I also suggest we remove the following snippet from the VCL file:

if (req.http.X-Pool) {
    ban("obj.http.X-Pool ~ " + req.http.X-Pool);
}

Apparently it's not related to Magento at all, and is only used by Capistrano.

Do you agree?

@peterjaap
Copy link
Author

@ThijsFeryn agreed!

@ThijsFeryn
Copy link

ThijsFeryn commented Jan 15, 2025

@peterjaap This would be the updated purge logic, now that X-Pool is stripped off:

    # Purge logic to remove objects from the cache
    # Tailored to Magento's cache invalidation mechanism
    # The X-Magento-Tags-Pattern value is matched to the tags in the X-Magento-Tags header
    # If X-Magento-Tags-Pattern is not set, a URL-based purge is executed
    if (req.method == "PURGE") {
        if (client.ip !~ purge) {
            return (synth(405, "Method not allowed"));
        }
        if (!req.http.X-Magento-Tags-Pattern) {
            return (purge);
        }
        ban("obj.http.X-Magento-Tags ~ " + req.http.X-Magento-Tags-Pattern);
        return (synth(200, "Purged"));
    }

Can you update your gist accordingly?

@ThijsFeryn
Copy link

@peterjaap std.collect(req.http.Cookie); should be changed to std.collect(req.http.Cookie, ";");.

Can you please adjust the gist?

@peterjaap
Copy link
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment