@thanhleviet
Created March 10, 2019 16:23
library(RSelenium)
library(dplyr)
library(rvest)
library(tibble)
# Search results page to scrape
url <- "https://www.toimitilat.fi/toimitilahaku/?size_min=&size_max=&deal_type%5B%5D=1&language=fin&result_type=list&advanced=0&gbl=1&ref=main#searchresult"
# Start a Selenium server first, e.g.:
# docker run -d -p 4445:4444 selenium/standalone-firefox:2.53.1
# Connect to the Selenium server exposed on local port 4445
remDr <- remoteDriver(
  remoteServerAddr = "127.0.0.1",
  port = 4445L,
  browserName = "firefox"
)
remDr$open()
remDr$navigate(url)
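# Grab the rendered HTML from the browser
html_source <- remDr$getPageSource()[[1]]
# Select one container node per listing
objects <- read_html(html_source) %>%
  html_nodes("div.infoCont")
# Strip newlines and tabs from the extracted text
clean <- function(x) gsub("[\n\t]", "", x)
# html_node() yields exactly one result (or NA) per listing, so both vectors stay aligned
info <- objects %>% html_node("h4") %>% html_text() %>% clean()
price <- objects %>% html_node("div.priceCont") %>% html_text() %>% clean()
# Combine titles and prices into a tibble
ct <- tibble(title = info, price = price)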