random state and words generator
(ns honeycomb.core
  (:gen-class)
  (:require [clojure.string :as string]
            [clojure.java.io :as io]))
;; Iterative version: dotimes prints each row as soon as it is generated, so no
;; sequence of rows is ever built. The input files are still held in memory,
;; because rand-nth calls count, which realizes each line-seq in full.
(defn -main-simple
  "Prints n random rows from given filenames. All command-line arguments are strings."
  [n-rows & filenames]
  (let [n (Integer/parseInt n-rows)
        ;; memoize caches the line-seq per filename, so a repeated filename
        ;; is read only once and we can simply map over the filenames
        datasets (map (memoize #(line-seq (io/reader %))) filenames)]
    (dotimes [_ n]
      (println (string/join "," (map rand-nth datasets))))))
;; Lazy version: rows is an infinite lazy sequence, and doseq realizes and prints
;; one row at a time, so only the n rows requested are ever generated. As in the
;; iterative version, the input files themselves are realized in memory by rand-nth.
(defn -main-lazy
  "Prints n random rows from given filenames. All command-line arguments are strings."
  [n-rows & filenames]
  (let [n (Integer/parseInt n-rows)
        ;; memoize caches the line-seq per filename, so a repeated filename
        ;; is read only once and we can simply map over the filenames
        datasets (map (memoize #(line-seq (io/reader %))) filenames)
        ;; infinite lazy sequence of random rows
        rows (repeatedly #(map rand-nth datasets))]
    (doseq [row (take n rows)]
      (println (string/join "," row)))))
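
Both functions above pass a `line-seq` to `rand-nth`, which has to count the sequence and then walk it to the chosen index, so every pick is linear in the file length, and the readers are never closed. A minimal sketch of one possible alternative follows, assuming the files fit in memory. The namespace `honeycomb.sketch` and the helpers `read-lines` and `random-rows` are hypothetical names and are not part of the original gist.

```clojure
(ns honeycomb.sketch
  (:require [clojure.java.io :as io]
            [clojure.string :as string]))

(defn read-lines
  "Hypothetical helper: reads every line of filename into a vector,
  then closes the reader."
  [filename]
  (with-open [rdr (io/reader filename)]
    ;; vec forces the lazy line-seq before with-open closes the reader
    (vec (line-seq rdr))))

(defn random-rows
  "Hypothetical helper: returns a lazy sequence of n comma-joined rows,
  each made of one random line per file."
  [n filenames]
  (let [read-once (memoize read-lines)
        ;; mapv is eager, so each distinct file is read exactly once, up front
        datasets (mapv read-once filenames)]
    ;; rand-nth on a vector is an indexed lookup rather than a linear walk
    (repeatedly n #(string/join "," (map rand-nth datasets)))))
```

Under these assumptions, `(run! println (random-rows 3 ["states.txt" "words.txt"]))` would print three rows, where the two filenames are placeholders. The memory footprint is about the same as in the original functions, since the files are fully realized either way. The difference is that each pick avoids re-walking the sequence.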