Created
May 27, 2010 18:15
Google sitemap generator scripts
These files describe how to configure automatic Google sitemap
generation.
1. In each model, define a class method (get_paths) that returns an
array of URLs, each with a last-modified timestamp.
2. In 'lib/tasks/google_sitemap.rb', update the 'sources' and
'host' variables for your site.
3. Configure a cron task to periodically regenerate the sitemaps
and ping Google. You could use the whenever gem with the sample
schedule in 'config/schedule.rb'.
4. The task generates a sitemap index and several sitemaps (one per
model, because Google limits a single sitemap to 50k URLs). It
places the sitemaps in 'public/sitemaps' and gzips all of them to
save traffic.
5. Register the sitemap in Google Webmaster Tools using a URL such as
'http://yoursite.com/sitemaps/sitemap_index.xml.gz'
That's all
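The scripts write one sitemap per model and do not split a model that has more than 50k URLs. A minimal sketch of how such a list could be split, assuming the same hash shape that get_paths returns. The chunk_paths helper and the numbered file names are hypothetical, not part of these scripts:

```ruby
# Hypothetical helper: split one model's paths into groups that each
# stay within the per-sitemap URL limit mentioned in step 4.
MAX_URLS_PER_SITEMAP = 50_000

def chunk_paths(paths, limit = MAX_URLS_PER_SITEMAP)
  paths.each_slice(limit).to_a
end

# Stand-in data in the same shape that get_paths returns.
paths = (1..120_000).map do |i|
  { :url => "/posts/#{i}", :last_mod => '2010-05-27' }
end

chunks = chunk_paths(paths)
chunks.each_with_index do |chunk, i|
  # The numbered file name is illustrative only.
  puts "sitemap_post_#{i + 1}.xml.gz: #{chunk.size} URLs"
end
```

Each chunk would then need its own entry in the sitemap index.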
require 'net/http'
require 'uri'
require 'zlib'
require 'fileutils'
require 'builder'

# An application-specific class that generates a Google sitemap from the contents of the database.
# Author: Alastair Brunton
# Modified: Harry Love 2008-06-09
class GoogleSitemapGenerator
  def initialize(base_url, sources)
    @base_url = base_url
    @sources = sources
  end

  # 1. Call each model's get_paths class method
  # 2. Create a sitemap file for each model
  # 3. Create the sitemap index file
  # 4. Ping Google
  def generate
    @sources.each do |source|
      # Resolve the model class by name (ActiveSupport's constantize,
      # safer than eval) and call get_paths on it.
      path_ar = source.constantize.get_paths
      xml = generate_sitemap(path_ar)
      save_file(source, xml)
    end
    index = generate_sitemap_index(@sources)
    save_file('index', index)
    update_google
  end

  # Create a sitemap document for a model
  def generate_sitemap(path_ar)
    xml_str = ""
    xml = Builder::XmlMarkup.new(:target => xml_str)
    xml.instruct!
    xml.urlset(:xmlns => 'http://www.sitemaps.org/schemas/sitemap/0.9') {
      path_ar.each do |path|
        xml.url {
          xml.loc(@base_url + path[:url])
          xml.lastmod(path[:last_mod])
          xml.changefreq('weekly')
        }
      end
    }
    xml_str
  end

  # Create a sitemap index document
  def generate_sitemap_index(sitemaps)
    xml_str = ""
    xml = Builder::XmlMarkup.new(:target => xml_str)
    xml.instruct!
    xml.sitemapindex(:xmlns => 'http://www.sitemaps.org/schemas/sitemap/0.9') {
      sitemaps.each do |site|
        xml.sitemap {
          xml.loc(@base_url + "/sitemaps/sitemap_#{site.underscore}.xml.gz")
          xml.lastmod(Time.now.strftime('%Y-%m-%d'))
        }
      end
    }
    xml_str
  end

  # Save the XML to disk as a gzipped file
  def save_file(source, xml)
    dir = RAILS_ROOT + "/public/sitemaps/"
    FileUtils.mkdir_p(dir)
    # GzipWriter.open writes in binary mode and closes the file when the block exits.
    Zlib::GzipWriter.open(dir + "sitemap_#{source.underscore}.xml.gz") do |gz|
      gz.write xml
    end
  end

  # Notify Google of the new sitemap index file
  def update_google
    sitemap_uri = @base_url + '/sitemaps/sitemap_index.xml.gz'
    # URI.escape was removed in Ruby 3.0; encode the URL as a query parameter value instead.
    escaped_sitemap_uri = URI.encode_www_form_component(sitemap_uri)
    ping_path = '/webmasters/tools/ping?sitemap=' + escaped_sitemap_uri
    puts 'www.google.com' + ping_path
    puts Net::HTTP.get('www.google.com', ping_path)
  end
end
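A minimal sketch of how the sitemap index URL can be percent-encoded before it is appended to the ping path. It uses URI.encode_www_form_component from the standard library, since URI.escape was removed in Ruby 3.0. The build_ping_path helper is hypothetical and only mirrors what update_google builds; it does not send a request:

```ruby
require 'uri'

# Hypothetical helper: build the ping path for a given base URL.
# The endpoint path and index file name are the ones used in update_google.
def build_ping_path(base_url)
  sitemap_uri = base_url + '/sitemaps/sitemap_index.xml.gz'
  '/webmasters/tools/ping?sitemap=' + URI.encode_www_form_component(sitemap_uri)
end

ping_path = build_ping_path('http://mysite.com')
puts ping_path
# prints /webmasters/tools/ping?sitemap=http%3A%2F%2Fmysite.com%2Fsitemaps%2Fsitemap_index.xml.gz
```

Encoding ':' and '/' keeps the whole sitemap URL inside a single query parameter value.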
class Page < ActiveRecord::Base
  def self.get_paths
    urls = []
    Page.all.each do |page|
      urls << { :url => "/pages/#{page.to_param}", :last_mod => page.updated_at.strftime('%Y-%m-%d') }
    end
    urls
  end
end
class Post < ActiveRecord::Base
  def self.get_paths
    urls = []
    Post.all.each do |post|
      urls << { :url => "/posts/#{post.to_param}", :last_mod => post.updated_at.strftime('%Y-%m-%d') }
    end
    urls
  end
end
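The shape of the data that get_paths hands to the generator can be sketched without a database. FakePost and paths_for below are hypothetical stand-ins for an ActiveRecord model and its get_paths method; only the :url and :last_mod keys come from the models above:

```ruby
# Hypothetical stand-in for an ActiveRecord model with to_param and updated_at.
FakePost = Struct.new(:id, :updated_at) do
  def to_param
    id.to_s
  end
end

# Build the array of hashes that the generator expects from get_paths.
def paths_for(records, prefix)
  records.map do |record|
    { :url => "#{prefix}/#{record.to_param}",
      :last_mod => record.updated_at.strftime('%Y-%m-%d') }
  end
end

posts = [FakePost.new(1, Time.utc(2010, 5, 27)), FakePost.new(2, Time.utc(2010, 6, 1))]
paths = paths_for(posts, '/posts')
paths.each { |path| puts "#{path[:url]} #{path[:last_mod]}" }
# prints:
# /posts/1 2010-05-27
# /posts/2 2010-06-01
```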
set :output, "#{RAILS_ROOT}/log/cron.log"

every 1.day do
  rake "google_sitemap:generate"
end
require 'google_sitemap'

namespace :google_sitemap do
  desc "Generate a Google sitemap from the models"
  task(:generate => :environment) do
    # Generate sitemaps for each of the models listed in the array
    sources = %w(Post Page)
    host = ENV['HOST'] || 'http://mysite.com'
    sitemap = GoogleSitemapGenerator.new(host, sources)
    sitemap.generate
  end
end
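After the task has run, one way to sanity-check a generated file is to gunzip it and compare the contents. A minimal sketch using only the standard library; it writes to a temporary directory rather than 'public/sitemaps', and the sample XML and file name are placeholders:

```ruby
require 'zlib'
require 'tmpdir'

# Placeholder document; a real sitemap would contain <url> entries.
xml = '<?xml version="1.0" encoding="UTF-8"?>' \
      '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>'

roundtrip = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'sitemap_post.xml.gz')
  # Write the gzipped file, then read it back and decompress it.
  Zlib::GzipWriter.open(path) { |gz| gz.write(xml) }
  Zlib::GzipReader.open(path) { |gz| roundtrip = gz.read }
end

puts roundtrip == xml
```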