@alea12
Last active December 31, 2015 02:58
Fetch program detail information from the TV program listings on テレビ王国 (tv.so-net.ne.jp).
require 'open-uri'
require 'nokogiri'
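# target date (YYYYMMDD)
date = '20131213'
# fetch the chart page from tv.so-net.ne.jp and parse it with Nokogiri
# (URI.open instead of Kernel#open, which no longer opens URLs as of Ruby 3.0)
target = 'http://tv.so-net.ne.jp/chart/23.action?head=' + date + '0000&span=24'
doc = Nokogiri::HTML(URI.open(target))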
# get links to program detail pages
programs = doc.css('a.schedule-link')
program_detail_links = []
programs.each do |program|
  # url
  program_detail_links.push program.attributes['href'].value
  # title
  # puts program.children.css('span.schedule-title').children.to_s
end
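# fetch each detail page and print its description paragraphs
program_detail_links.each do |program_detail_link|
  # resolve against the chart URL in case the href is relative
  detail_url = URI.join(target, program_detail_link)
  Nokogiri::HTML(URI.open(detail_url)).css('p.basicTxt').each do |text|
    pp text
  end
end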
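
The URL handling in the script can be tried without any network access. The following is a minimal sketch using only the standard library: it builds the chart URL for a given date and resolves a detail-page href against it. The helper names `chart_url` and `absolute_link` and the sample path `/schedule/000001.action` are hypothetical, and it is only an assumption that the site emits relative hrefs.

```ruby
require 'date'
require 'uri'

# Hypothetical helper (not part of the original script): build the chart URL
# for one day, using the same URL pattern as the script above.
def chart_url(date)
  'http://tv.so-net.ne.jp/chart/23.action?head=' + date.strftime('%Y%m%d') + '0000&span=24'
end

# Hypothetical helper: turn an href taken from the chart page into an absolute
# URL. URI.join leaves an already absolute href unchanged.
def absolute_link(base_url, href)
  URI.join(base_url, href).to_s
end

base = chart_url(Date.new(2013, 12, 13))
puts base

# '/schedule/000001.action' is a made-up path for illustration only.
puts absolute_link(base, '/schedule/000001.action')
puts absolute_link(base, 'http://tv.so-net.ne.jp/schedule/000002.action')
```

Resolving against the chart URL keeps the loop over detail links working whether the hrefs turn out to be relative or absolute.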