@qrilka
Created April 18, 2014 11:08
scrapinghub through tinyproxy
$ http_proxy=http://localhost:8080 curl -v http://scrapinghub.com/ 2>&1 | head -100
* Hostname was NOT found in DNS cache
* Trying ::1...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* connect to ::1 port 8080 failed: Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET http://scrapinghub.com/ HTTP/1.1
> User-Agent: curl/7.36.0
> Host: scrapinghub.com
> Accept: */*
> Proxy-Connection: Keep-Alive
>
0 0 0 0 0 0 0 0 --:--:-- 0:00:05 --:--:-- 0< HTTP/1.1 200 OK
< Via: 1.1 tinyproxy (tinyproxy/1.8.3)
< Date: Fri, 18 Apr 2014 11:07:47 GMT
< Content-Type: text/html; charset=utf-8
* Server Apache is not blacklisted
< Server: Apache
< Vary: Accept-Encoding
* no chunk, no close, no size. Assume close to signal end
<
{ [data not shown]
<!DOCTYPE html>
<html>
<head>
<title>Scrapinghub | Turn web pages into structured content</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<meta name="Author" content="Scrapinghub" />
<meta name="Publisher" content="Scrapinghub" />
<meta name="Copyright" content="Scrapinghub" />
<meta name="Robots" content="ALL" />
<link rel="stylesheet" media="screen, projection" type="text/css" href="/static/styles/screen.css" />
<link rel="stylesheet" href="/static/colorbox/colorbox.css" />
<link rel="stylesheet" href="/static/fancybox/jquery.fancybox-1.3.4.css" type="text/css" media="screen" />
<link href='http://fonts.googleapis.com/css?family=Open+Sans' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="/static/styles/bootstrap.min.css" />
<link href="//netdna.bootstrapcdn.com/font-awesome/3.2.1/css/font-awesome.css" rel="stylesheet">
<link rel="shortcut icon" href="/static/images/favicon.ico" />
<!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="/static/scripts/html5shiv.js"></script>
<script src="/static/scripts/respond.min.js"></script>
<![endif]-->
</head>
<body class="home">
<div id="wrap-body" class="wrap">
<div class="container">
<div class="header clearfix">
<div class="logo"><a href="/">Scrapinghub</a></div>
<div class="signup">
<a class="btn btn-group btn-success" href="https://dash.scrapinghub.com/account/signup/">Sign up</a>
<a class="btn btn-group btn-default" href="https://dash.scrapinghub.com/">Sign in</a>
</div>
<div class="mainnav">
<ul class="nav">
<li><a href="/features">Features</a></li>
<li><a href="/pricing">Pricing</a></li>
<li><a href="/services">Consulting</a></li>
<li><a href="/contact">Contact</a></li>
</ul>
</div>
</div>
<div class="wrap-top"></div><!-- top border -->
<div class="body clearfix">
<div class="home-content">
<div class="whatwedo">
We provide the leading <b>technology</b> and <b>consulting</b> services to deliver successful web crawling and data processing solutions.
</div>
<h2>Our Services</h2>
<div class="p-box right">
<div class="p-logo">
<a href="/scrapy-cloud"><img src="/static/images/scrapy-cloud.png"></a><br>
<h3><a href="/scrapy-cloud">Scrapy Cloud</a></h3>
</div>
<p>Scrapy Cloud is our platform as a service to build crawlers easily, deploy
them instantly and scale them on demand. Watch your <a
href="http://scrapy.org">Scrapy</a> spiders as they run and collect data,
and review their data through our beautiful frontend.</p>
</div>
<div class="p-box">
<div class="p-logo">
<a href="/services"><img src="/static/images/puzzle.png"></a><br>
<h3><a href="/services">Professional Services</a></h3>