// Lambda handler: fetch http://httpbin.org/get and log the response body,
// which includes the function's current public IP in its `origin` field.
var http = require('http');

exports.handler = function(event, context) {
  http.get('http://httpbin.org/get', function(res) {
    var body = '';
    res.on('data', function(chunk) {
      body += chunk;            // accumulate the response body chunk by chunk
    });
    res.on('end', function() {
      console.info(body);
      context.done(null);       // null error signals success to Lambda
    });
  }).on('error', function(e) {
    console.error(e.message);
    context.fail(e);            // pass the error along instead of null so it is not lost
  });
};
kixorz commented on Jul 2, 2019 (via email):
With Axios you don't have to move those bytes yourself.
Does this code guarantee that the Lambda function uses different IP addresses for scraping, to avoid being blocked by CAPTCHAs? Thanks.
Axios is a pretty cool library, but this code is intentionally free of third-party libraries.
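For comparison only (the gist deliberately sticks to Node core), here is a minimal sketch of the same request with Axios; axios is an added dependency, and the handler shape assumes a Lambda runtime that supports async handlers.

// Sketch, not part of the gist: the same request via Axios.
// Axios buffers and parses the JSON body, so there is no manual 'data'/'end' handling.
const axios = require('axios');   // third-party dependency bundled with the function

exports.handler = async function(event) {
  const res = await axios.get('http://httpbin.org/get');
  console.info(res.data);         // already-parsed JSON object
  return res.data.origin;         // httpbin reports the caller's public IP here
};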
@lydiahelkinz This code only tells you what the current public IP of the Lambda is.
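If you only want that IP rather than the whole response, a small sketch of the same Node-core approach could parse the origin field that httpbin echoes back:

// Sketch only: return just the Lambda's public IP as the invocation result.
var http = require('http');

exports.handler = function(event, context) {
  http.get('http://httpbin.org/get', function(res) {
    var body = '';
    res.on('data', function(chunk) { body += chunk; });
    res.on('end', function() {
      var ip = JSON.parse(body).origin;   // e.g. "203.0.113.7"
      context.done(null, ip);             // the IP becomes the function's result
    });
  }).on('error', function(e) {
    context.fail(e);
  });
};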
If you're looking at scraping, you should consider using a good proxy service, implementing a proxy layer in your code, and blocking certain tracking requests.
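A minimal sketch of such a proxy layer for a plain-HTTP target, still using only Node core; the proxy host and port are placeholders, not anything the gist provides.

// Sketch: route the same plain-HTTP request through a forward proxy.
var http = require('http');

var PROXY_HOST = 'proxy.example.com';   // hypothetical proxy endpoint
var PROXY_PORT = 8080;

exports.handler = function(event, context) {
  http.get({
    host: PROXY_HOST,
    port: PROXY_PORT,
    path: 'http://httpbin.org/get',     // absolute URL: the proxy forwards it upstream
    headers: { Host: 'httpbin.org' }
  }, function(res) {
    var body = '';
    res.on('data', function(chunk) { body += chunk; });
    res.on('end', function() {
      console.info(body);               // `origin` now shows the proxy's IP, not the Lambda's
      context.done(null);
    });
  }).on('error', function(e) {
    context.fail(e);
  });
};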
Can I use a Lambda function without going through proxy services?
@lydiahelkinz Yes, you can. It depends on what you're scraping. On some sites you will be very successful, on others it will fail. If you hit sites running behind Cloudflare or similar CDNs with adaptive firewalls and bot protection, you will need to advance your game, as the default Lambdas won't be enough. You may still use Lambdas as your compute platform, but the HTTP handling and network access will need to be more advanced.