@mitio
Last active December 28, 2015 10:54
An example with Nokogiri and UTF-8 (Cyrillic)
require 'nokogiri'
require 'net/http'
html_string = Net::HTTP.get(URI.parse('http://www.dir.bg/'))
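# Net::HTTP returns the body tagged as binary (ASCII-8BIT) and does not
# apply the charset the server declares; see
# http://stackoverflow.com/a/13779685/75715. The page is served as UTF-8,
# so relabel the bytes accordingly. Alternatively, use another HTTP
# library, such as https://github.com/jnunemaker/httparty.
html_string.force_encoding('utf-8')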
html_document = Nokogiri::HTML(html_string)
puts html_document.css('body h2').map(&:text)
__END__
The above code results in something like the following:
Днес
Темите 2015
Финанси
Културен афиш София
Е-Референдум
Вкусотии
Aвто
Каталог
Маркет
Кино
Телевизия
Справочник
Лайф
СпортLiveScore
Почивки
Технологии
Галерия
Времето в София
Новини от българския WEB
Зодиак
Събитиен календар
Игри
Виц на деня
Изпрати снимка и ти!
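
The force_encoding call above only relabels the bytes; it does not convert them, so it gives correct text only when the page really is UTF-8. As a minimal sketch of a more defensive variant, the snippet below picks the charset from a Content-Type header value and checks that the bytes are valid before handing them on. The helpers charset_from and decode_body are hypothetical names introduced for illustration (they are not part of the gist, Net::HTTP, or Nokogiri), and the response body is simulated with a binary string so that the example runs without network access.

```ruby
# Hypothetical helper (not part of the gist): return the Encoding named in a
# Content-Type header value, falling back to UTF-8 when no charset is given.
# Encoding.find raises ArgumentError for names Ruby does not know.
def charset_from(content_type)
  name = content_type.to_s[/charset=["']?([^\s;"']+)/i, 1]
  name ? Encoding.find(name) : Encoding::UTF_8
end

# Hypothetical helper: relabel a binary body with the declared charset,
# verify that the bytes are valid in it, then transcode to UTF-8.
def decode_body(raw_body, content_type = nil)
  text = raw_body.dup.force_encoding(charset_from(content_type))
  raise ArgumentError, "body is not valid #{text.encoding}" unless text.valid_encoding?
  text.encode(Encoding::UTF_8)
end

# Simulate a response body: valid UTF-8 bytes, but tagged as binary.
raw_body = 'Новини'.b

text = decode_body(raw_body, 'text/html; charset=utf-8')
puts text.encoding
puts text
```

The resulting string could then be passed to Nokogiri::HTML exactly as html_string is in the script above. For a page that declares a different charset, such as windows-1251, the same helper would transcode the bytes to UTF-8 instead of merely relabeling them.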