Last active: July 15, 2023 02:45
haproxy configuration for using with prerender.io
# Change YOUR_TOKEN to your prerender token
# Change http://example.com (server_name) to your website url
frontend my-frontend
    mode http
    bind :80

    # prerender.io
    acl user-agent-bot hdr_sub(User-Agent) -i baiduspider twitterbot facebookexternalhit rogerbot linkedinbot embedly showyoubot outbrain pinterest slackbot vkShare W3C_Validator
    acl url-asset path_end js css xml less png jpg jpeg gif pdf doc txt ico rss zip mp3 rar exe wmv doc avi ppt mpg mpeg tif wav mov psd ai xls mp4 m4a swf dat dmg iso flv m4v torrent ttf woff
    acl url-escaped-fragment url_sub _escaped_fragment_

    use_backend prerender if user-agent-bot !url-asset
    use_backend prerender if url-escaped-fragment !url-asset

backend prerender
    mode http
    timeout server 20s
    server prerender service.prerender.io:443 check ssl verify none
    http-request set-header X-Prerender-Token YOUR_TOKEN
    reqrep ^([^\ ]*)\ /(.*)$ \1\ /http://example.com/\2
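For readers unfamiliar with `reqrep`: it rewrites the request line before the request is forwarded, here prefixing the path with your site URL so Prerender knows which page to render. A sketch of the transformation, using the `example.com` placeholder from the config (note that `reqrep` was removed in HAProxy 2.1; the 2.1+ config later in the thread uses `http-request set-uri` instead):

```haproxy
# Incoming request line from a crawler:
#   GET /some/page HTTP/1.1
# Request line forwarded to service.prerender.io after the reqrep:
#   GET /http://example.com/some/page HTTP/1.1
```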
Saved me lots of time! Thanks! 👍
Since Google now supports dynamic rendering, the googlebot user agent should be added to the user-agent-bot acl:
acl user-agent-bot hdr_sub(User-Agent) -i googlebot baiduspider twitterbot facebookexternalhit rogerbot linkedinbot embedly showyoubot outbrain pinterest slackbot vkShare W3C_Validator
For those looking at migrating to HAProxy 2.1 and above, the configuration below is what we're using in production. It contains a few crucial additions, such as the resolvers dns section, to keep HAProxy happy. Cheers!
# Important: Prerender's IP address changes regularly. This ensures HAProxy always resolves the new IP.
resolvers dns
    parse-resolv-conf
    hold valid 10s

frontend my-frontend
    mode http
    bind *:80

    # Detect bot crawlers looking for pre-rendered pages
    acl user-agent-bot hdr_sub(User-Agent) -i googlebot bingbot baiduspider twitterbot facebookexternalhit rogerbot linkedinbot embedly showyoubot outbrain pinterest slackbot whatsapp vkShare W3C_Validator
    acl url-asset path_end js css xml less png jpg jpeg gif pdf doc txt ico
    use_backend prerender if user-agent-bot !url-asset

backend prerender
    mode http
    server prerender service.prerender.io:443 resolvers dns check ssl verify none # Note the "resolvers dns" here
    http-request set-header X-Prerender-Token YOUR_TOKEN
    http-request set-uri /http://example.com%[path] # set-uri will remove the query string, use set-path if you want to keep it
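As the inline comment notes, `set-uri` drops the query string. If your pages depend on query parameters, a variant of the backend using `set-path` (which rewrites only the path component, leaving the query string intact) might look like this; `example.com` is still a placeholder for your site:

```haproxy
backend prerender
    mode http
    server prerender service.prerender.io:443 resolvers dns check ssl verify none
    http-request set-header X-Prerender-Token YOUR_TOKEN
    # set-path changes only the path, so ?query=... survives the rewrite
    http-request set-path /http://example.com%[path]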
Updated user-agent list
acl user-agent-bot hdr_sub(User-Agent) -i googlebot bingbot baiduspider twitterbot facebookexternalhit rogerbot linkedinbot embedly showyoubot outbrain pinterest slackbot whatsapp vkShare W3C_Validator redditbot Applebot whatsapp flipboard tumblr bitlybot skypeuripreview nuzzel discordbot google page speed qwantify pinterestbot bitrix link preview xing-contenttabreceiver chrome-lighthouse telegrambot
In 2023, I used a combination of these comments and got it working. Thank you, community.
It may actually be better to whitelist extensions rather than maintain a huge blacklist. Since the only pages that need to be prerendered are HTML pages, I think it would be safe to assume only .html, .htm, and no extension at all. This should do the trick:
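A sketch of that whitelist approach, reusing the frontend from the configs above (the acl names url-page and url-noext are illustrative, not from the original thread):

```haproxy
frontend my-frontend
    mode http
    bind :80

    acl user-agent-bot hdr_sub(User-Agent) -i googlebot bingbot baiduspider twitterbot facebookexternalhit
    # Whitelist: only .html/.htm pages, or paths containing no extension at all
    acl url-page path_end -i .html .htm
    acl url-noext path_reg ^[^.]*$

    use_backend prerender if user-agent-bot url-page
    use_backend prerender if user-agent-bot url-noext
```

With this, asset requests (.js, .css, images, and so on) never match, so the long url-asset blacklist becomes unnecessary.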