@scotthaleen
Created November 5, 2016 19:28
Flatten HTML
;; Add [clj-tagsoup "0.3.0"] to your project dependencies
(require '[pl.danieljanus.tagsoup :as tags])
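
;; Each node is a vector of the form
;;   [:tag attr-map child child ...]
;; where each child is either another node or a string.
;; Walk the tree depth-first and return every node and string as one lazy seq.
;; Note: this shadows clojure.core/flatten in the current namespace.
(defn flatten [x]
  (tree-seq (comp keyword? first) (partial drop 2) x))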

;; Fetch the page at url and parse it into a clj-tagsoup tree.
(defn get-url
  [url]
  (tags/parse url))

;; Find all anchor tags
(filter (comp (partial = :a) first)
        (flatten (get-url "http://www.google.com")))
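
To try the traversal without a network request, here is a minimal sketch that runs the same `tree-seq` walk over a hand-written tree. The `sample` data, the `anchors` var, and the name `flatten-nodes` (chosen to avoid shadowing `clojure.core/flatten`) are assumptions made for illustration and are not part of the gist or of clj-tagsoup.

```clojure
;; Hypothetical sample tree in the [:tag attr-map child ...] shape.
(def sample
  [:html {}
   [:body {}
    [:a {:href "/one"} "One"]
    [:p {} "text " [:a {:href "/two"} "Two"]]]])

;; Same traversal as the gist's flatten, under a hypothetical name.
;; A node is a branch when its first element is a keyword (the tag);
;; its children are everything after the tag and the attribute map.
(defn flatten-nodes [x]
  (tree-seq (comp keyword? first) (partial drop 2) x))

;; Keep only the nodes whose tag is :a.
(def anchors
  (filter (comp (partial = :a) first)
          (flatten-nodes sample)))

;; The attribute map is the second element of each node.
(map (comp :href second) anchors)
;; => ("/one" "/two")
```

Strings are leaves in this walk: `first` of a string yields a character, never a keyword, so they appear in the flattened sequence but are dropped by the `:a` filter.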