```
# Change YOUR_TOKEN to your prerender token
# Change example.com (server_name) to your website url
# Change /path/to/your/root to the correct value
server {
    listen 80;
    server_name example.com;

    root /path/to/your/root;
    index index.html;

    location / {
        try_files $uri @prerender;
    }

    location @prerender {
        proxy_set_header X-Prerender-Token YOUR_TOKEN;

        set $prerender 0;
        if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp") {
            set $prerender 1;
        }
        if ($args ~ "_escaped_fragment_") {
            set $prerender 1;
        }
        if ($http_user_agent ~ "Prerender") {
            set $prerender 0;
        }
        if ($uri ~* "\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff|svg|eot)") {
            set $prerender 0;
        }

        #resolve using Google's DNS server to force DNS resolution and prevent caching of IPs
        resolver 8.8.8.8;

        if ($prerender = 1) {
            #setting prerender as a variable forces DNS resolution since nginx caches IPs and doesnt play well with load balancing
            set $prerender "service.prerender.io";
            rewrite .* /$scheme://$host$request_uri? break;
            proxy_pass http://$prerender;
        }
        if ($prerender = 0) {
            rewrite .* /index.html break;
        }
    }
}
```
I have this site: https://estimationpoker.de. It is written in Angular. Unfortunately, all subpages (e.g. /impressum) can be crawled by Google, but the root page cannot. How is this possible? I am using the nginx config from here.
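Not a definitive answer, but one likely explanation based on the gist config above: with `try_files $uri @prerender;` and `index index.html;`, a request for the root URL `/` resolves to the existing index file on disk, so nginx serves it directly and the `@prerender` location is never reached for the root page, while deep links like `/impressum` have no matching file and do fall through to `@prerender`. A minimal sketch of one way to force the root page through the same bot check (the `location = /` block and the dummy `/__prerender__` path are illustrative additions, not part of the original gist):

```
server {
    listen 80;
    server_name example.com;          # placeholder
    root /path/to/your/root;          # placeholder
    index index.html;

    # Exact match for the root URL: skip the filesystem lookup so the request
    # always reaches @prerender, where bots get proxied and normal visitors
    # fall back to index.html via the "rewrite .* /index.html break;" line.
    location = / {
        try_files /__prerender__ @prerender;   # /__prerender__ never exists on disk
    }

    location / {
        try_files $uri @prerender;
    }

    # location @prerender { ... }  unchanged from the gist above
}
```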
Hi all, I use prerender in my project for SEO. My nginx.conf is as follows:
```
location / {
root /dist;
try_files $uri @prerender;
# try_files $uri $uri/ /index.html;
# index index.html index.htm;
}
location @prerender {
# proxy_set_header X-Prerender-Token YOUR_TOKEN;
set $prerender 0;
if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp") {
set $prerender 1;
}
if ($args ~ "_escaped_fragment_") {
set $prerender 1;
}
if ($http_user_agent ~ "Prerender") {
set $prerender 0;
}
if ($uri ~* "\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff|svg|eot)") {
set $prerender 0;
}
#resolve using Google's DNS server to force DNS resolution and prevent caching of IPs
resolver 8.8.8.8;
if ($prerender = 1) {
#setting prerender as a variable forces DNS resolution since nginx caches IPs and doesnt play well with load balancing
set $prerender "127.0.0.1:3000";
rewrite .* /$scheme://$host$request_uri? break;
proxy_pass http://$prerender;
break;
}
try_files $uri $uri/ /index.html;
}
```
When I run the command `curl http://localhost:3000/render?url=http://localhost`, the result is correct. But when I run `curl http://localhost:3000/render?url=http://localhost/help`, the result is a 404. Does anyone know the reason? Thanks in advance!
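Not an authoritative answer, but two things stand out when comparing this snippet with the config posted further down the thread: `root /dist;` is declared only inside `location /`, and nginx does not inherit it into the sibling `location @prerender`; and the block combines `try_files` with `if`, which the caveats later in this thread warn against. A minimal sketch of the restructured fallback, assuming the same /dist root and the local prerender instance on 127.0.0.1:3000:

```
location @prerender {
    root /dist;   # repeat the root here; it is not inherited from "location /"

    # ... the bot-detection if blocks stay exactly as in the snippet above ...

    if ($prerender = 1) {
        set $prerender "127.0.0.1:3000";
        rewrite .* /$scheme://$host$request_uri? break;
        proxy_pass http://$prerender;
        break;
    }

    # SPA fallback for everything else, including requests made by the
    # Prerender browser itself (User-Agent "Prerender"), instead of try_files.
    rewrite .* /index.html break;
}
```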
Hi, can someone help me? I added GTmetrix to my http_user_agent regex, but it is not working. Google and the others work, though.
```
location @prerender {
proxy_set_header X-Prerender-Token XXXXXXXXX;
set $prerender 0;
if ($http_user_agent ~* "baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator|GTmetrix|screaming frog seo spider|screamingfrogseospider|chrome-lighthouse") {
set $prerender 1;
}
if ($args ~ "_escaped_fragment_|prerender=1") {
set $prerender 1;
}
if ($http_user_agent ~ "Prerender") {
set $prerender 0;
}
if ($prerender = 1) {
rewrite .* /?url=$scheme://$host$request_uri? break;
proxy_pass http://prerender.freyagriculturalproducts.com;
}
if ($prerender = 0) {
proxy_pass http://127.0.0.1:5080;
}
}
```
Has anyone been able to figure out how to handle relative URLs with nginx? I am serving an SPA over nginx, and some of the paths that webpack builds (such as the CSS) are relative, so the site does not render properly.
Support docsearch-scraper?
https://github.com/algolia/docsearch-scraper/pull/431/files
For those struggling with the nginx configuration on SPAs (react, angular, vue.js), here's an overview of how it should work.

How the @prerender location works 🥇:

Given we have a default location (/) with `try_files $uri @prerender`, here's what happens:

- First, nginx will try to find a real file in the directory, such as an image, js or css file.
  - If there is a match, nginx will return that file.
- If no real file is found, nginx will try the @prerender location:
  - If the user agent is a bot (google, twitter, etc.), set a variable $prerender to 1.
  - If the user agent is prerender itself, set it back to 0.
  - If the uri is for a file such as js, css or images, set it to 0 again.
- Now, if after all of these conditions we have $prerender == 1:
  - Proxy pass the request to prerender and return the cached html file.
- If we have $prerender == 0:
  - Default to our own index.html. Since this is a SPA, all uris that are not a real file will be redirected to index.html.
Caveats 🚨
- The `if` directive in nginx can be tricky (please read https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/).
- In summary, we should never use try_files and ifs in the same location block.
- This is the reason why we have to use rewrite in the @prerender location block instead of try_files.
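To make the caveat concrete, here is a minimal contrast sketch (illustrative only; the upstream hostname and fallback paths are placeholders, not taken from any config in this thread):

```
# Fragile: try_files combined with if blocks in the same location.
# The if directives can prevent try_files from ever running, which is one way
# deep links end up as 404s (compare the /help question earlier in this thread).
location @prerender {
    if ($prerender = 1) {
        proxy_pass http://service.prerender.io;
    }
    try_files $uri /index.html;   # may be skipped because of the if above
}

# Pattern used throughout this thread instead: a plain rewrite as the SPA fallback.
location @prerender {
    if ($prerender = 1) {
        rewrite .* /$scheme://$host$request_uri? break;
        proxy_pass http://service.prerender.io;
        break;
    }
    rewrite .* /index.html break; # no try_files in the same block as the ifs
}
```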
Final nginx conf with SSL 🔒
It's very similar to the first example; I just needed to add a root path in each location:
```
# MANAGED BY PUPPET
server {
    listen *:443 ssl;
    server_name example.com;

    ssl_certificate /etc/nginx/ssl/example.crt;
    ssl_certificate_key /etc/nginx/ssl/example.key;

    access_log /var/log/nginx/example.log combined_cloudflare;
    error_log /var/log/nginx/ssl-example.error.log;

    add_header "X-Clacks-Overhead" "GNU Terry Pratchett";

    location / {
        root /usr/local/example/webapp-build;
        try_files $uri @prerender;
        add_header Cache-Control "no-cache";
    }

    # How the @prerender location works:
    # Given we have a default location (/) with "try_files $uri @prerender", here's what happens:
    # First, nginx will try to find a real file on the directory, such as images, js or css files.
    # If there is a match, nginx will return that file.
    # If no real file is found, nginx will try the @prerender location:
    # If the user agent is a bot (google, twitter, etc), set a variable $prerender to 1
    # If the user agent is prerender itself, set it to 0 back again
    # If the uri is for a file such as js, css, imgs, set it to 0 again
    # Now, if after all of these conditions we have $prerender==1, we will;
    # Proxy pass the request to prerender and return the cached html file
    # If we have prerender=0:
    # Default to our own index.html.
    # Since this is a SPA, all uris that are not a real file will be redirected to index.html
    # **** CAVEATS ****
    # - The `if` directive in nginx can be tricky (please read https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/)
    # - In summary, we should never use try_files and ifs in the same location block.
    # - This is the reason why we have to use rewrite on the @location block instead of try_files.
    location @prerender {
        root /usr/local/example/webapp-build;
        proxy_set_header X-Prerender-Token "YOUR_TOKEN_GOES_HERE";

        set $prerender 0;
        if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp") {
            set $prerender 1;
        }
        if ($args ~ "_escaped_fragment_") {
            set $prerender 1;
        }
        if ($http_user_agent ~ "Prerender") {
            set $prerender 0;
        }
        if ($uri ~* "\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff|svg|eot)") {
            set $prerender 0;
        }

        #resolve using Google's DNS server to force DNS resolution and prevent caching of IPs
        resolver 8.8.8.8;

        if ($prerender = 1) {
            #setting prerender as a variable forces DNS resolution since nginx caches IPs and doesnt play well with load balancing
            set $prerender "service.prerender.io";
            rewrite .* /$scheme://$host$request_uri? break;
            proxy_pass http://$prerender;
            break;
        }
        rewrite .* /index.html break;
    }
}
```
If you are using puppet to manage your infrastructure... 🤖
I preferred creating a template file with the `@prerender` location for readability. Additionally, this config should only be done in production, so I added some conditionals as well:
```
if $environment == 'production' {
# prerender.io is a SEO tool used to process javascript websites into a robot-friendly page.
# Please read the details on profile/webapp/prerender_nginx.erb
$nginx_try_files = ['$uri', '@prerender'] # the $uri must be single quoted!
$nginx_raw_append = template('profile/webapp/prerender_nginx.erb') # Only the @prerender location goes in this block
} else {
$nginx_try_files = ['$uri', '/index.html'] # the $uri must be single quoted!
$nginx_raw_append = []
}
nginx::resource::server { $name:
ensure => present,
server_name => $server_name,
listen_port => 443,
ssl_port => 443,
ssl => true,
ssl_cert => $ssl_cert,
ssl_key => $ssl_key,
format_log => 'combined_cloudflare',
www_root => $www_root,
index_files => $index_files,
error_pages => $error_pages,
try_files => $nginx_try_files,
location_cfg_append => $location_cfg_append,
raw_append => $nginx_raw_append,
}
```
Hi everyone!
There is an updated, official set of nginx configuration files here:
https://github.com/prerender/prerender-nginx
Does anyone know what the UA string for Google Sites (https://sites.google.com) should be? Perhaps they use something other than 'googlebot', since I can't get link unfurling to work when using their "Embed from the web" widget.
Thank you for this @thoop! Similar to `slackbot`, could you please add `discordbot` to the list of bots? https://developers.whatismybrowser.com/useragents/parse/572332-discord-bot
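In the meantime, anyone who needs this can extend the user-agent regex themselves. A minimal sketch based on the gist's bot check, with `discordbot` appended as the only change (Discord's link-preview crawler sends a User-Agent containing "Discordbot", and the `~*` match is case-insensitive):

```
if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp|discordbot") {
    set $prerender 1;
}
```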