Created
October 26, 2012 15:12
Convert relative links to absolute in HTML
#!/usr/bin/env ruby
# encoding: utf-8
# This works pretty well, but can't yet go up dir tree
# Also doesn't work on JS-generated hrefs/srcs
# I was surprised there wasn't a ready-made solution online (that I could quickly find)
require 'open-uri'
require 'nokogiri'
require 'active_support/core_ext'

# Resolve rel_uri against base_uri; absolute URIs and empty paths pass through unchanged.
def convert_to_abs(base_uri, rel_uri)
  # Note: can't follow up (i.e. "..")
  base_uri = URI.parse(base_uri)
  rel_uri = URI.parse(rel_uri)
  if rel_uri.instance_of?(URI::Generic) && !rel_uri.path.blank?
    URI.join(base_uri, rel_uri).to_s
  else
    rel_uri.to_s
  end
end

url_lookup = "http://library.tcu.edu/govweb/"
# URI.open is required on Ruby 3.0+, where Kernel#open no longer opens URLs
doc = Nokogiri::HTML(URI.open(url_lookup))
# Rewrite every href and src attribute in place
doc.xpath("//@href | //@src").each do |x|
  x.value = convert_to_abs(url_lookup, x.value)
end
puts doc.to_html
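
The resolution step can be tried on its own, without fetching a page. The following is a minimal sketch using only Ruby's standard `uri` library; the helper name `absolutize` and the sample paths are hypothetical and not part of the script above. It mirrors the check in `convert_to_abs` (rewrite only scheme-less references with a non-empty path) and additionally returns unparseable values unchanged, which is an assumption about what is desirable for messy real-world HTML. Since `URI.join` follows RFC 3986 reference resolution, `..` segments are resolved in this sketch.

```ruby
require 'uri'

# Hypothetical helper (not in the original script): resolve +ref+ against
# +base+, leaving absolute URLs, empty paths, and unparseable values unchanged.
def absolutize(base, ref)
  uri = URI.parse(ref)
  # Only scheme-less references with a non-empty path are rewritten,
  # mirroring the condition used in convert_to_abs.
  return ref if uri.scheme || uri.path.nil? || uri.path.empty?
  URI.join(base, ref).to_s
rescue URI::InvalidURIError
  ref # e.g. a value containing a raw space does not parse as a URI
end

base = "http://library.tcu.edu/govweb/"
puts absolutize(base, "images/logo.png")      # => http://library.tcu.edu/govweb/images/logo.png
puts absolutize(base, "../about/index.html")  # => http://library.tcu.edu/about/index.html
puts absolutize(base, "/css/site.css")        # => http://library.tcu.edu/css/site.css
puts absolutize(base, "http://example.com/x") # => http://example.com/x
puts absolutize(base, "#top")                 # => #top
```

Rescuing `URI::InvalidURIError` keeps one malformed attribute from aborting the whole document rewrite; whether to skip, log, or escape such values is a design choice left open here.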