Chris Le (TRIODE) chrisle

@chrisle
chrisle / gist:1885478
Created February 22, 2012 15:12
ImportXML for Google Search
=ArrayFormula(RegexReplace(RegexExtract(ImportXML("http://www.google.com/search?q=KEYWORDHERE", "//h3/a/@href"), "http.*"), "\&sa.*", ""))
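The formula above keeps everything from the first "http" onward in each result link, then drops the "&sa" tracking suffix. A minimal Python sketch of that same two-step cleanup follows. The function name `clean_result_href` and the sample link are hypothetical and only illustrate the two regular expressions; this is not part of the original gist.

```python
import re
from typing import Optional


def clean_result_href(href: str) -> Optional[str]:
    """Apply the same two regex steps as the spreadsheet formula."""
    # Step 1: like RegexExtract(..., "http.*"), keep text from the first "http" on.
    match = re.search(r"http.*", href)
    if match is None:
        # The spreadsheet would show an error here; this sketch returns None instead.
        return None
    # Step 2: like RegexReplace(..., "\&sa.*", ""), drop "&sa" and everything after it.
    return re.sub(r"&sa.*", "", match.group(0))


# Hypothetical href in the shape the formula appears to expect.
print(clean_result_href("/url?q=http://example.com/page&sa=U&ei=abc"))
```

Returning None for links without "http" is a design choice of this sketch so that a caller can filter those rows out.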
chrisle / gist:2252209
Created March 30, 2012 15:15
CURL as GoogleBot 2.1
curl --user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)" -v "$@"
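For comparison, the same idea (sending a request with the Googlebot user-agent string) can be sketched with Python's standard library. The helper name `build_request` and the example.com URL are assumptions for illustration. The sketch only builds the request and inspects its header; it does not contact any server.

```python
import urllib.request

# User-agent string taken from the curl command above.
GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> urllib.request.Request:
    """Return a Request object that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("http://example.com/")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
# To actually send it: urllib.request.urlopen(req).read()
```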
chrisle / gist:2319642
Created April 6, 2012 13:10
curl as ff
curl --user-agent "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20120403211507 Firefox/12.0" http://www.google.com/search\?q\=law%20firm%20boston,%20ma
function myFunction() {
  var response = UrlFetchApp.fetch("http://www.google.com/search?q=law%20firm%20boston,%20ma&num=10").getContentText();
  Logger.log(response);
}
chrisle / gist:2785524
Created May 25, 2012 03:10
Curl as ImportXML
curl --user-agent "-" "$@"
chrisle / functions.py
Created August 14, 2012 13:08
Facebook Likes for Excel using DataNitro
""" Facebook likes for Excel using DataNitro
[email protected]
http://www.seeinteractive.com/blog/get-facebook-likes-in-excel-using-datanitro
"""
import urllib2
import json
def facebook_likes(url):
    facebook_url = "https://graph.facebook.com/?ids=" + url
    raw_data = urllib2.urlopen(facebook_url).read()
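The preview above appends the page URL to the query string unescaped, so a page URL that itself contains "&" or "?" could be misread as extra query parameters. A minimal Python 3 sketch of building the same request URL with the `ids` value percent-encoded is shown below. The helper name `build_graph_url` is hypothetical, and the sketch makes no claim about how the endpoint responds.

```python
from urllib.parse import quote

# Endpoint prefix taken from the preview above.
GRAPH_PREFIX = "https://graph.facebook.com/?ids="


def build_graph_url(page_url: str) -> str:
    """Return the Graph URL with page_url percent-encoded as the ids value."""
    # safe="" also encodes "/", ":", "?" and "&" so the value stays one parameter.
    return GRAPH_PREFIX + quote(page_url, safe="")


print(build_graph_url("http://example.com/?a=1&b=2"))
```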
chrisle / csv_writer.rb
Created November 2, 2012 16:46
JSON > CSV > data_miner > database
# Converts JSON data into CSV and writes to a temporary CSV file
require 'ruport'
# see ruport_19.rb
require 'monkey_patches/ruport_19'
class CsvWriter
  # Initialize an instance of CsvWriter
  def initialize
chrisle / gist:4206925
Created December 4, 2012 18:01
newspaper to seomoz
require 'mechanize'
require 'linkscape'
agent = Mechanize.new
agent.user_agent_alias = 'Mac Safari'
# Put your state here
state = "PA"
page = agent.get "http://newsmap.mhlakhani.com/data/US-#{state}"
module CapybaraWithPhantomJs
  include Capybara
  # Create a new PhantomJS session in Capybara
  def new_session
    # Register PhantomJS (aka poltergeist) as the driver to use
    Capybara.register_driver :poltergeist do |app|
      Capybara::Poltergeist::Driver.new(app)
    end
# Looks for the escaped fragment meta tag. If found, gets the HTML snapshot
# instead
module GoogleBotSimulator::EscapedFragment
  def has_meta_fragment?
    @response.search('//meta[@name="fragment"]/@content').to_s == '!'
  end
  def url_with_escaped_fragment(url)
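The body of `url_with_escaped_fragment` is cut off in this preview. As a rough illustration of the idea behind it, here is a Python sketch of the rewrite described by Google's AJAX crawling scheme, where a "#!" fragment is moved into an `_escaped_fragment_` query parameter. This is an assumption about what the Ruby method does, not its actual code, and it percent-encodes the fragment more conservatively than the scheme strictly requires.

```python
from urllib.parse import quote


def url_with_escaped_fragment(url: str) -> str:
    """Rewrite a hash-bang URL into its _escaped_fragment_ form."""
    # Split at the first "#!"; fragment is "" when the URL has no hash-bang,
    # which matches pages that opt in through the meta fragment tag.
    base, _, fragment = url.partition("#!")
    # Append to an existing query string if there is one.
    separator = "&" if "?" in base else "?"
    # Keep "=" readable; other reserved characters are percent-encoded.
    return f"{base}{separator}_escaped_fragment_={quote(fragment, safe='=')}"


print(url_with_escaped_fragment("http://example.com/#!key=value"))
```

The sketch does not handle a plain "#" fragment without "!", which would stay attached to the base URL.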