Duane Bester duanebester
@duanebester
duanebester / find-products.clj
Last active October 28, 2021 19:41
Scraping - find-products
(defn get-content
  [content]
  (into {} (map (fn [c] [(:tag c) (first (:content c))]) (:content content))))

(defn find-products
  "Crawls the URL's sitemap and calls f with the site and each product found."
  [site url f]
  (let [p (parse-xml url)
        tag (:tag p)]
    (println (str "Parsing URL: " url " for site " site))
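As a rough illustration of what `get-content` returns, the sketch below applies it to a single `:url` entry shaped like the parsed sitemap shown elsewhere in this listing. The name `url-entry` is an assumption made up for this example; the function body is copied from the gist preview above.

```clojure
;; get-content as shown in the gist preview: flattens the children of an
;; element into a map of child tag -> first child content.
(defn get-content
  [content]
  (into {} (map (fn [c] [(:tag c) (first (:content c))]) (:content content))))

;; Hypothetical sample entry, shaped like one :url element of the parsed sitemap.
(def url-entry
  {:tag :url
   :attrs nil
   :content [{:tag :loc :attrs nil
              :content ["https://bellroy.com/journal/carry-portraits-anna-marrone"]}
             {:tag :lastmod :attrs nil :content ["2021-10-06"]}
             {:tag :changefreq :attrs nil :content ["daily"]}
             {:tag :priority :attrs nil :content ["1.0"]}]})

(get-content url-entry)
;; => {:loc "https://bellroy.com/journal/carry-portraits-anna-marrone",
;;     :lastmod "2021-10-06", :changefreq "daily", :priority "1.0"}
```

Because the result is a plain map, a caller can pick out the location with `(:loc (get-content url-entry))`.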
@duanebester
duanebester / parse-sitemap.xml.clj
Created October 27, 2021 20:54
Web Scraping Parse sitemap.xml
(require '[clojure.xml :as xml])

(defn parse-xml [url]
  (try (xml/parse url)
       (catch Exception e
         (println
          (str "Caught Exception parsing xml: " (.getMessage e))))))

(parse-xml "https://bellroy.com/sitemap.xml")
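For trying the parsing step without a network call, here is a minimal sketch that feeds `clojure.xml/parse` an in-memory stream instead of a URL. `sample-sitemap`, `parse-xml-string`, and `sitemap-locs` are hypothetical names introduced for this example, and the sample omits the namespace attribute a real sitemap carries.

```clojure
(require '[clojure.xml :as xml])
(import '(java.io ByteArrayInputStream))

;; Hypothetical, cut-down sitemap used only for this example.
(def sample-sitemap
  (str "<urlset>"
       "<url><loc>https://bellroy.com/products/pod-jacket</loc>"
       "<lastmod>2021-10-26</lastmod></url>"
       "</urlset>"))

(defn parse-xml-string
  "Parses an XML string by wrapping its bytes in an InputStream."
  [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn sitemap-locs
  "Returns the text of every :loc element nested under the root's children."
  [parsed]
  (for [url-el (:content parsed)
        child  (:content url-el)
        :when  (= :loc (:tag child))]
    (first (:content child))))

(def parsed (parse-xml-string sample-sitemap))

(:tag parsed)
;; => :urlset

(sitemap-locs parsed)
;; => ("https://bellroy.com/products/pod-jacket")
```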
@duanebester
duanebester / parsed-sitemap.xml.edn
Created October 27, 2021 20:52
Web Scraping w/ Clojure - Parsed sitemap
{
:tag :urlset,
:content [
{:tag :url, :attrs nil, :content [
{:tag :loc, :attrs nil, :content ["https://bellroy.com/journal/carry-portraits-anna-marrone"]}
{:tag :lastmod, :attrs nil, :content ["2021-10-06"]}
{:tag :changefreq, :attrs nil, :content ["daily"]}
{:tag :priority, :attrs nil, :content ["1.0"]}]}
{:tag :url, :attrs nil, :content [
{:tag :loc, :attrs nil, :content ["https://bellroy.com/corporate-gifting"]}
@duanebester
duanebester / scraper.deps.edn
Created October 27, 2021 20:41
Web Scraping w/Clojure deps
{:paths ["src"]
 :deps {etaoin/etaoin {:mvn/version "0.4.6"}
        com.taoensso/carmine {:mvn/version "3.1.0"}
        congomongo/congomongo {:mvn/version "2.2.3"}
        funcool/cuerdas {:mvn/version "2021.05.29-0"}
        org.clojure/data.json {:mvn/version "2.4.0"}
        org.clojure/core.async {:mvn/version "1.3.618"}}}
@duanebester
duanebester / sitemap.xml
Last active October 28, 2021 14:52
Bellroy sitemap example
<url>
  <loc>https://bellroy.com/products/pod-jacket</loc>
  <lastmod>2021-10-26</lastmod>
  <changefreq>daily</changefreq>
  <priority>1.0</priority>
</url>
<url>
  <loc>https://bellroy.com/journal/the-art-and-science-of-the-aha-moment</loc>
  <lastmod>2021-10-06</lastmod>
  <changefreq>daily</changefreq>
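The sitemap mixes product pages with journal posts, so a scraper interested only in products needs some way to tell them apart. The sketch below assumes product pages live under a `/products/` path, as the sample entry above suggests; `product-url?` and `sample-locs` are hypothetical names for this example, not taken from the gists.

```clojure
(require '[clojure.string :as string])

(defn product-url?
  "True when the location looks like a product page (assumed /products/ path)."
  [loc]
  (string/includes? loc "/products/"))

;; The two locations from the sitemap excerpt above.
(def sample-locs
  ["https://bellroy.com/products/pod-jacket"
   "https://bellroy.com/journal/the-art-and-science-of-the-aha-moment"])

(filter product-url? sample-locs)
;; => ("https://bellroy.com/products/pod-jacket")
```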
@duanebester
duanebester / binf-cstrings.cljc
Created September 29, 2021 00:10
BinF C-String helpers
(defn wr-cstring [view string]
  (binf/wr-string view (str string "\0"))
  view)

(defn rr-cstring [view]
  (loop [index (binf/position view) acc []]
    (let [b (binf/rr-u8 view)]
      (if (zero? b)
        (.toString
         (.decode binf.string/decoder-utf-8
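The preview above is cut off, so as a point of comparison here is the same C-string idea sketched with plain `java.nio.ByteBuffer` rather than BinF: write the UTF-8 bytes followed by a zero byte, then read bytes until the terminator. `put-cstring` and `get-cstring` are hypothetical names for this example and are not the BinF helpers.

```clojure
(import '(java.nio ByteBuffer)
        '(java.nio.charset StandardCharsets))

(defn put-cstring
  "Writes s as UTF-8 followed by a terminating zero byte. Returns the buffer."
  [^ByteBuffer bb ^String s]
  (.put bb (.getBytes s StandardCharsets/UTF_8))
  (.put bb (byte 0))
  bb)

(defn get-cstring
  "Reads bytes from the buffer's current position up to (and consuming)
   the next zero byte, then decodes them as UTF-8."
  [^ByteBuffer bb]
  (loop [acc []]
    (let [b (.get bb)]
      (if (zero? b)
        (String. (byte-array acc) StandardCharsets/UTF_8)
        (recur (conj acc b))))))

;; Round trip: write, flip the buffer for reading, read back.
(def bb (ByteBuffer/allocate 16))
(put-cstring bb "jimmy")
(.flip bb)
(get-cstring bb)
;; => "jimmy"
```

Collecting the raw bytes first and decoding once at the end keeps multi-byte UTF-8 characters intact.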
@duanebester
duanebester / clunk-docker-compose.yml
Created September 26, 2021 21:09
Test Postgres DB for Clunk
version: '3.9'
services:
  postgres:
    image: postgres:13
    ports:
      - 5432:5432
    environment:
      POSTGRES_USER: jimmy
      POSTGRES_PASSWORD: banana
      POSTGRES_DB: world
@duanebester
duanebester / parse-header-read-message.clj
Created September 26, 2021 20:43
Parse backend message header and read entire message
(defn- parse-header [bb]
  (let [_ (.flip bb)
        tag (.get bb)
        len (.getInt bb)]
    (log/info (str "RECEIVED TAG: " tag ", LENGTH: " len))
    [tag len]))

(defn- read-message [tag len client]
  (let [bb (ByteBuffer/allocate (+ 1 len)) ;; Allocate for full message
        _ (.put bb tag) ;; Add back tag
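To make the header layout concrete: a Postgres backend message starts with a one-byte tag followed by a four-byte big-endian length, and that length counts itself but not the tag, which is why the full message needs `(+ 1 len)` bytes. The sketch below is a logging-free variant of `parse-header` run against a hand-built header; `sample-header` is a hypothetical helper, and the tag `R` with length 8 is simply an illustrative value.

```clojure
(import '(java.nio ByteBuffer))

(defn parse-header
  "Flips the buffer for reading, then returns [tag length]."
  [^ByteBuffer bb]
  (.flip bb)
  (let [tag (.get bb)
        len (.getInt bb)]
    [tag len]))

(defn sample-header
  "Hypothetical 5-byte header: tag \\R (ASCII 82) and a length of 8."
  []
  (doto (ByteBuffer/allocate 5)
    (.put (byte (int \R)))
    (.putInt (int 8))))

(parse-header (sample-header))
;; => [82 8]

;; Size of the full message, tag included, as in read-message above.
(let [[_ len] (parse-header (sample-header))]
  (+ 1 len))
;; => 9
```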
@duanebester
duanebester / message-socket-imports.clj
Created August 25, 2021 18:43
MessageSocket class and imports
(ns com.clunk.message-socket
  (:require [clojure.core.match :as m]
            [clojure.core.async :as async]
            [octet.core :as buf]
            [com.clunk.pw :as pw]
            [com.clunk.codecs :as codecs]
            [com.clunk.buffer-socket :as bs]))

(defrecord MessageSocket [buffer-socket in-ch out-ch])
@duanebester
duanebester / buffer-socket-example.clj
Last active September 26, 2021 20:56
Example on using the Buffer Socket directly
;; Utility
(defn print-ints
  "Prints byte array as ints"
  [ba]
  (println (map int ba)))

;; Get socket connection
(def buffer-socket (get-buffer-socket 5432 "localhost"))

;; Startup message for user: "jimmy", and database: "world"
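The preview ends before the startup message is built, so the following is only a sketch of what such a message looks like on the wire, using `java.nio.ByteBuffer` rather than whatever the gist itself uses. In protocol version 3.0 a startup message is an Int32 total length (including itself), the Int32 version number 196608, a run of null-terminated name/value strings, and a final zero byte. `cstring-bytes` and `startup-message` are hypothetical names for this example.

```clojure
(import '(java.nio ByteBuffer)
        '(java.nio.charset StandardCharsets))

(defn cstring-bytes
  "UTF-8 bytes of s followed by a terminating zero byte."
  ^bytes [^String s]
  (.getBytes (str s "\0") StandardCharsets/UTF_8))

(defn startup-message
  "Builds a protocol 3.0 startup message, returned as a buffer ready to read."
  [user database]
  (let [params (mapcat cstring-bytes ["user" user "database" database])
        body   (byte-array (concat params [(byte 0)])) ;; trailing terminator
        len    (+ 4 4 (alength body))]                 ;; length + version + body
    (doto (ByteBuffer/allocate len)
      (.putInt (int len))
      (.putInt (int 196608))                           ;; version 3.0 = 3 << 16
      (.put body)
      (.flip))))

(def msg (startup-message "jimmy" "world"))

(.remaining msg)
;; => 35

(.getInt msg 4)
;; => 196608
```

The 35 bytes break down as 4 (length) + 4 (version) + 5 (`user`) + 6 (`jimmy`) + 9 (`database`) + 6 (`world`) + 1 (final terminator).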