@takumikinjo
Created March 11, 2010 11:49
#!/usr/bin/env ruby
# -*- coding: utf-8 -*-
# Fetch English example sentences from SpaceAlc (eow.alc.co.jp).
require 'uri'
require 'open-uri'
require 'rubygems'
require 'nokogiri'
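class SpaceAlc
  # Root of the dictionary site. URI() builds a fresh object on every call,
  # so each instance can safely modify its own copy.
  def self.base_uri
    URI('http://eow.alc.co.jp/')
  end

  # Build the lookup URL for +word+; runs of whitespace become '+'.
  def initialize(word)
    @uri = self.class.base_uri
    @uri.path = '/' + word.gsub(/\s+/, '+') + '/UTF-8/'
  end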
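  # Fetch the result page once and cache it as { headword => meaning }.
  def to_hash
    @_to_hash ||= begin
      # URI#open is provided by open-uri; Kernel#open stopped accepting URLs in Ruby 3.0.
      doc = Nokogiri::HTML(@uri.open('User-Agent' => 'Mozilla/5.0'))
      # An instance variable cannot be a block parameter since Ruby 1.9, so
      # accumulate into a local and let ||= store the final result.
      doc.xpath('//div[@id="resultList"]//li').inject({}) do |hash, item|
        hash.merge(item.xpath('span[@class="midashi"]').inner_text =>
                   item.xpath('div[1]').inner_text)
      end
    end
  end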
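  # Keep only the entries whose headword passes portable?, with the
  # annotations stripped from each meaning.
  def to_portable
    pairs = to_hash.select { |word, _| portable?(word) }
    Hash[pairs.map { |word, meaning| [word, to_trim(meaning)] }]
  end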
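
  private

  # A headword is "portable" when it is a single word, or a phrase of
  # 4 to 5 words, or a phrase of 11 to 14 words.
  def portable?(word)
    len = word.split(/\s+/).length
    len == 1 || (len > 3 && len < 6) || (len > 10 && len < 15)
  end

  # Strip bracketed annotations (【】, 〔〕, {}, 《》) and everything from ◆
  # to the end of the line.
  def to_trim(meaning)
    meaning.gsub(/【.*?】|〔.*?〕|\{.*?\}|《.*?》|◆.*$/, '')
  end
end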
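
The word-count filter and the annotation stripping can be tried without touching the network. The following is a minimal sketch, not part of the original gist: `PortableFilter`, its `trim` and `filter` methods, and the sample entries are all hypothetical names and made-up data, and the two helpers simply copy the logic of `portable?` and `to_trim` above.

```ruby
# -*- coding: utf-8 -*-
# Hypothetical standalone module mirroring SpaceAlc's private helpers,
# so the filtering rules can be exercised offline.
module PortableFilter
  # Bracketed annotations and anything from ◆ to the end of the line.
  ANNOTATION = /【.*?】|〔.*?〕|\{.*?\}|《.*?》|◆.*$/

  module_function

  # Same rule as SpaceAlc#portable?: 1 word, 4-5 words, or 11-14 words.
  def portable?(word)
    len = word.split(/\s+/).length
    len == 1 || (len > 3 && len < 6) || (len > 10 && len < 15)
  end

  # Same substitution as SpaceAlc#to_trim.
  def trim(meaning)
    meaning.gsub(ANNOTATION, '')
  end

  # Apply both steps to a { headword => meaning } hash.
  def filter(entries)
    pairs = entries.select { |word, _| portable?(word) }
    Hash[pairs.map { |word, meaning| [word, trim(meaning)] }]
  end
end

# Made-up entries for illustration only, not real output from the site.
sample = {
  'apple'          => '【名】りんご◆note',
  'apple pie'      => 'アップルパイ',
  'an apple a day' => '〔諺〕一日一個のりんご'
}

PortableFilter.filter(sample).each do |word, meaning|
  puts "#{word}: #{meaning}"
end
```

With this sample, the two-word entry is dropped by the length rule, while the one-word and four-word entries are kept with their bracketed tags and ◆ notes removed.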