@Cabeda
Last active January 23, 2023 12:17
Deno scraper for holland2stay residences

Scrapper house

Deno program that scrapes the holland2stay housing site and reports new houses (after recording the initial state on the first pass).
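
The "initial state" idea can be sketched as follows: the first pass records every listing it sees, and later passes report only titles that were not seen before. This is a minimal sketch rather than the gist's exact code; `findNewHouses` and the sample titles are hypothetical and used only for illustration.

```typescript
// Hypothetical helper: returns the titles in `current` that were not seen
// before, and records them in `seen` so each one is reported only once.
function findNewHouses(seen: Set<string>, current: string[]): string[] {
  const fresh: string[] = [];
  for (const title of current) {
    if (!seen.has(title)) {
      seen.add(title);
      fresh.push(title);
    }
  }
  return fresh;
}

const seen = new Set<string>();
// First pass: every listing counts as new, which establishes the initial state.
const initial = findNewHouses(seen, ["street a 1", "street b 2"]);
// Second pass: only the listing added since the first pass is returned.
const later = findNewHouses(seen, ["street a 1", "street b 2", "street c 3"]);
console.log(initial, later);
```

A `Set` keeps the membership check cheap as the list of known houses grows; the script itself keeps a plain array and uses `includes`, which behaves the same for a small number of listings.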

How to compile

You can generate an executable by running the following

deno compile --allow-net --allow-run --allow-read --allow-env main.ts

How to run

Two env vars, SEND_EMAIL and SEND_PWD, are needed to send emails (the script logs in to smtp.gmail.com with them). If no .env file is provided, you can set them before calling the script.
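
The script reads these values as `SEND_EMAIL` and `SEND_PWD` and refuses to send mail when either is missing. A minimal sketch of that check is shown below; `readSmtpCredentials`, `EnvLookup`, and the sample values are hypothetical, and the lookup function stands in for something like `(k) => Deno.env.get(k)`.

```typescript
// Hypothetical lookup type: maps a variable name to its value, if set.
type EnvLookup = (key: string) => string | undefined;

// Hypothetical helper: reads both SMTP variables and fails early when one is missing.
function readSmtpCredentials(get: EnvLookup): { email: string; password: string } {
  const email = get("SEND_EMAIL");
  const password = get("SEND_PWD");
  if (!email || !password) {
    throw new Error("Set both SEND_EMAIL and SEND_PWD to send emails");
  }
  return { email, password };
}

// Usage with an in-memory lookup, so the sketch runs without a real environment.
const fakeEnv: Record<string, string> = {
  SEND_EMAIL: "me@example.com",
  SEND_PWD: "secret",
};
const creds = readSmtpCredentials((key) => fakeEnv[key]);
console.log(creds.email);
```

Passing the lookup in as a parameter keeps the check independent of Deno, which makes it easy to exercise with fake values.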

# Run script with email & notification
deno run --allow-net --allow-run --allow-read --allow-env main.ts --notify --email <email>

# Run script with notification only (env vars set inline instead of through .env)
SEND_EMAIL=<email> SEND_PWD=<pwd> deno run --allow-net --allow-run --allow-env --allow-read main.ts --notify

# Run script with no notifications at all
deno run --allow-net --allow-run --allow-read --allow-env main.ts

# Run compiled script with notification & 2 emails
./scrapper_house --notify --email <email_1> --email <email_2>
import { DOMParser } from "https://deno.land/x/deno_dom/deno-dom-wasm.ts";
import { notify, Notification } from "https://deno.land/x/notifier/mod.ts";
import { SmtpClient } from "https://deno.land/x/smtp/mod.ts";
import "https://deno.land/x/dotenv/load.ts";
import { parse } from "https://deno.land/[email protected]/flags/mod.ts";
const url =
  "https://holland2stay.com/residences.html?available_to_book=179,336&city=29";
let houses: string[] = [];
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));
// Sends one HTML email per recipient through Gmail's SMTP server.
async function sendEmail(subject: string, content: string, emails: string[]) {
  const client = new SmtpClient();
  const SEND_EMAIL = Deno.env.get("SEND_EMAIL");
  const SEND_PWD = Deno.env.get("SEND_PWD");
  if (!SEND_EMAIL || !SEND_PWD) {
    throw new Error(
      "To send email you need to set both the SEND_EMAIL and SEND_PWD vars"
    );
  }
  const connectConfig = {
    hostname: "smtp.gmail.com",
    port: 465,
    username: SEND_EMAIL,
    password: SEND_PWD,
  };
  await client.connectTLS(connectConfig);
  for (const email of emails) {
    await client.send({
      from: SEND_EMAIL,
      to: email,
      subject: subject,
      content: "",
      html: content,
    });
  }
  await client.close();
}
// Fetches the listing page, records unseen houses and optionally notifies about them.
async function check_houses(notification: boolean, emails: string[]) {
  try {
    const res = await fetch(url);
    const html = await res.text();
    const document = new DOMParser().parseFromString(html, "text/html");
    if (document) {
      const houseElements = document.getElementsByClassName("regi-item");
      for (const house of houseElements) {
        const houseTitle = house.querySelector("h4");
        if (!houseTitle) {
          throw new Error(
            "Houses don't have an H4 set anymore. Update the filter"
          );
        }
        const title = houseTitle.textContent.trim().toLowerCase();
        // Build the URL slug: drop the last word, then join with single dashes.
        const houseName = title
          .split(" ")
          .slice(0, -1)
          .join(" ")
          .trim()
          .split(" ")
          .join("-")
          .replaceAll(".", "-")
          .replaceAll("--", "-");
        if (!houses.includes(title)) {
          houses.push(title);
          const notif_title = `New house found! ${title}`;
          const notif_text = `Go to <link>https://holland2stay.com/${houseName}.html<link> or check the generic page <link>${url}<link>.`;
          console.log(notif_title);
          if (notification) {
            const notif: Notification = {
              title: notif_title,
              message: notif_text,
              sound: "device-added",
            };
            await notify(notif);
            // Only send mail when at least one address was given; awaiting
            // keeps SMTP errors inside this try/catch.
            if (emails?.length) await sendEmail(notif_title, notif_text, emails);
          }
        }
      }
    } else {
      console.log("No document found");
    }
  } catch (error) {
    console.log(error);
  }
}
const flagToAliasMap = {
  notify: "n",
  email: "e",
};
const flags: { [x: string]: unknown } = parse(Deno.args, {
  boolean: ["notify"],
  string: ["email"],
  collect: ["email"],
  alias: flagToAliasMap,
});
// First pass records the initial state without notifying.
await check_houses(false, flags.email as string[]);
// Then poll every 10 seconds and report houses that were not seen before.
while (true) {
  await check_houses(flags.notify as boolean, flags.email as string[]);
  await sleep(10000);
  console.log(`Checking again... ${new Date()}`);
}