Last active
August 29, 2015 14:16
Web Scraper Exercise
comment.rb
class Comment
  attr_accessor :content

  def initialize(content)
    @content = content
  end
end
main.rb
require 'pry'
require 'colorize'
require_relative 'post'

ARGV[0]

begin
  post = Post.new(ARGV[0])
  post.print_details
rescue
  p "Invalid URL"
end
post.rb
#classes
require 'nokogiri'
require 'open-uri'
require_relative 'comment'

class Post
  attr_accessor :title, :url, :points, :item_id, :content

  def initialize(url)
    # URI.open replaces the bare open(url) call, which no longer accepts URLs on Ruby 3.0+.
    @content = Nokogiri::HTML(URI.open(url))
    @title = content.search('.title > a').map { |a| a.inner_text }[0].sub(/\u00E2\u0080\u0093/, '-')
    @url = url
    @points = content.search('.subtext > span:first-child').map { |span| span.inner_text }
    @item_id = (/=\d+/.match(url)[0]).match(/\d+/)[0]
    @comments_array = content.search('.comment').map { |comment| Comment.new(comment.inner_text) }
  end

  def show_comments
    @comments_array.each { |each_comment| p each_comment }
  end

  def add_comment(comment_obj)
    @comments_array.push(comment_obj)
  end

  def print_details
    puts "Post title: #{title}".colorize(:black).colorize(:background => :white)
    puts "Number of comments: #{@comments_array.length}".colorize(:white).colorize(:background => :red).blink
  end
end
Some code review for this:
- Small thing, but you should take special care to make sure indentation is correct. Your `initialize` is unnecessarily indented in comment.rb: https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-comment-rb-L5
- Maybe this was a mistake, but this line doesn't really do anything: https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-main-rb-L6 In case it's not clear, `ARGV` is just an array that contains all of the arguments passed in when calling your program.
- You didn't need to set `@url`, `@points`, or `@item_id`, as you don't do anything with them after they are set: https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-post-rb-L13
- You never call `show_comments` or `add_comment`, so you can probably get rid of them: https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-post-rb-L20 and https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-post-rb-L24
- You shouldn't have to convert UTF-8 characters into `-`s. There should be a way to specify the encoding. Seems like the answer is somewhere in here (but I haven't tried it myself): https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-post-rb-L12
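On the encoding point, here is a small stdlib-only sketch of what is probably going on. The three code points in the `sub` regex in post.rb (U+00E2 U+0080 U+0093) are what you get when the UTF-8 bytes of an en dash are decoded one byte at a time as ISO-8859-1. The helper names `misread_as_latin1` and `reread_as_utf8` are made up for illustration, and the assumption that the page is really UTF-8 has not been checked against the site.

```ruby
# The en dash (U+2013) is three bytes in UTF-8: E2 80 93.
EN_DASH = "\u2013"

# Hypothetical helper: decode UTF-8 bytes as if they were ISO-8859-1
# (one character per byte), then transcode the result to UTF-8.
def misread_as_latin1(utf8_text)
  utf8_text.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Hypothetical helper: undo the mix-up by turning each character back
# into a single byte and labelling those bytes as UTF-8 again.
def reread_as_utf8(garbled_text)
  garbled_text.encode(Encoding::ISO_8859_1).force_encoding(Encoding::UTF_8)
end

garbled = misread_as_latin1(EN_DASH)
puts garbled.codepoints.map { |cp| format('U+%04X', cp) }.join(' ')
# prints "U+00E2 U+0080 U+0093"
puts reread_as_utf8(garbled) == EN_DASH
# prints "true"
```

If that is the cause, telling the parser the encoding up front may make the `sub` call unnecessary. `Nokogiri::HTML` accepts an optional encoding as its third argument (for example `'UTF-8'`), though whether that fixes this particular page is untested here.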
If you have any responses feel free to comment back here!
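On the `ARGV` point above, a minimal sketch of how it behaves. The method names `first_argument` and `describe_arguments`, the usage message, and the example URL are all hypothetical and not part of the gist.

```ruby
# ARGV is a plain Array holding the strings typed after the script name,
# e.g. `ruby main.rb http://example.com/item?id=123` gives
# ARGV == ["http://example.com/item?id=123"].

# Returns the first argument, or nil when none was passed.
def first_argument(args)
  args[0]
end

# A bare `ARGV[0]` statement evaluates the expression and throws the
# result away, so the value has to be assigned or passed on to matter.
def describe_arguments(args)
  url = first_argument(args)
  return "usage: ruby main.rb <url>" if url.nil?

  "url: #{url} (#{args.length} argument(s) in total)"
end

puts describe_arguments(ARGV)
```

Checking for a missing argument up front like this also gives a clearer message than relying on the `rescue` in main.rb to catch the failure later.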