@srfrnk
Created July 5, 2014 09:33
A Node.js Express middleware that restricts a route to bots and crawlers. Based on the open-source Prerender.io middleware for Node.
require("requirejs").define("middleware/onlyCrawlers", ["url"], function (url) {
    // User-agent substrings that identify crawlers and link-preview bots.
    var crawlerUserAgents = [
        // 'googlebot',
        // 'yahoo',
        // 'bingbot',
        'baiduspider',
        'facebookexternalhit',
        'twitterbot',
        'rogerbot',
        'linkedinbot',
        'embedly',
        'quora link preview',
        'showyoubot',
        'outbrain',
        'pinterest'
    ];

    return function (req, res, next) {
        var userAgent = req.headers['user-agent'],
            bufferAgent = req.headers['x-bufferbot'],
            isCrawler = false;

        // Only GET requests that carry a User-Agent header can match.
        if (!userAgent || req.method !== 'GET') {
            return next("route");
        }

        // A URL containing _escaped_fragment_ marks a crawler request.
        // The parsed query object has no prototype in current Node.js, so
        // hasOwnProperty is invoked through Object.prototype.
        var query = url.parse(req.url, true).query;
        if (Object.prototype.hasOwnProperty.call(query, '_escaped_fragment_')) {
            isCrawler = true;
        }

        // A User-Agent containing a known bot name marks a crawler request.
        var lowerAgent = userAgent.toLowerCase();
        if (crawlerUserAgents.some(function (crawlerUserAgent) {
            return lowerAgent.indexOf(crawlerUserAgent.toLowerCase()) !== -1;
        })) {
            isCrawler = true;
        }

        // The X-Bufferbot header marks a BufferBot request.
        if (bufferAgent) {
            isCrawler = true;
        }

        if (isCrawler) {
            next();          // crawler: continue with this route's handlers
        } else {
            next("route");   // anyone else: skip to the next matching route
        }
    };
});
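
The three detection rules can also be tried out without Express or RequireJS. The following is a minimal sketch, not part of the gist: `isCrawlerRequest` and the shortened `CRAWLER_USER_AGENTS` list are hypothetical names introduced for illustration, and the WHATWG `URL` class (with a placeholder base URL) is swapped in for `url.parse`.

```javascript
// Hypothetical helper (not part of the gist): the same detection rules as a pure function.
// The list is shortened for the example.
const CRAWLER_USER_AGENTS = ['baiduspider', 'facebookexternalhit', 'twitterbot'];

function isCrawlerRequest(method, reqUrl, headers) {
  const userAgent = headers['user-agent'];
  // Like the middleware, only GET requests that carry a User-Agent are considered.
  if (!userAgent || method !== 'GET') return false;

  // Rule 1: the _escaped_fragment_ query parameter is present (even if empty).
  // 'http://localhost' is a placeholder base so that relative URLs can be parsed.
  const hasFragment = new URL(reqUrl, 'http://localhost')
    .searchParams.has('_escaped_fragment_');

  // Rule 2: the User-Agent contains a known crawler name (case-insensitive).
  const lowered = userAgent.toLowerCase();
  const isKnownBot = CRAWLER_USER_AGENTS.some((name) => lowered.includes(name));

  // Rule 3: the X-Bufferbot header is set.
  return hasFragment || isKnownBot || Boolean(headers['x-bufferbot']);
}

console.log(isCrawlerRequest('GET', '/page', { 'user-agent': 'Twitterbot/1.0' })); // true
console.log(isCrawlerRequest('GET', '/page', { 'user-agent': 'Mozilla/5.0' }));    // false
```

Keeping the check separate from the `next()` calls makes it straightforward to unit-test, at the cost of one extra function.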
srfrnk commented Jul 5, 2014

Best applied to specific routes, placed ahead of the route handler:
app.get("/..../.../...",require("middleware/onlyCrawlers"),myRouteHandler);
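
The pattern works because Express treats `next("route")` as "skip the remaining handlers of this route and try the next matching one", so a second route registered for the same path can serve regular visitors. The sketch below imitates that behaviour with a toy dispatcher; `dispatch`, `onlyCrawlersStub`, the two handlers, and the `isCrawler` flag are all hypothetical stand-ins, not Express code and not the gist's middleware.

```javascript
// Toy stand-in for Express route dispatch (an illustration, not Express itself).
// Each route is an array of handlers; next('route') abandons the current array.
function dispatch(routes, req) {
  const res = []; // stand-in for a response: collects what the handlers "send"
  let routeIndex = 0;

  function runRoute() {
    if (routeIndex >= routes.length) return; // no route left to try
    const handlers = routes[routeIndex++];
    let handlerIndex = 0;

    function next(signal) {
      if (signal === 'route') return runRoute();               // skip rest of this route
      if (handlerIndex >= handlers.length) return runRoute();  // route exhausted
      handlers[handlerIndex++](req, res, next);
    }
    next();
  }

  runRoute();
  return res;
}

// Stub with the same control flow as the middleware: continue for crawlers, bail otherwise.
const onlyCrawlersStub = (req, res, next) => (req.isCrawler ? next() : next('route'));
const crawlerHandler = (req, res) => res.push('crawler page');
const regularHandler = (req, res) => res.push('regular page');

const routes = [
  [onlyCrawlersStub, crawlerHandler], // crawler-only route, registered first
  [regularHandler],                   // fallback route for the same path
];

console.log(dispatch(routes, { isCrawler: true }));  // [ 'crawler page' ]
console.log(dispatch(routes, { isCrawler: false })); // [ 'regular page' ]
```

In a real application the crawler-only route would need to be registered before the regular route for the same path, since routes are tried in registration order.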
