@atroche
Last active June 3, 2017 07:08
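;; The snippet below uses `one-meg` and `count-newlines` without showing them.
;; A minimal sketch of what they might look like; both definitions are
;; assumptions for illustration, not the author's code. `ten-gb-filename` is
;; likewise left for the reader to define.
(def one-meg (* 1024 1024)) ;; chunk size in bytes (1024*1024)

(defn count-newlines
  "Counts the newline bytes (value 10) in barray."
  ^long [^bytes barray]
  (let [n (alength barray)]
    (loop [i 0, total 0]
      (if (< i n)
        (recur (inc i) (if (== 10 (aget barray i)) (inc total) total))
        total))))
;; A freshly allocated byte array is zero-filled, so counting over a partially
;; filled buffer (a short final read) does not add spurious newlines.
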
(require '[clojure.core.async :as async
           :refer [chan go-loop <! >! <!! close!]])
(import 'java.io.FileInputStream)

(with-open [file-stream (FileInputStream. ten-gb-filename)]
  (let [channel  (chan 500)
        ;; make four workers to read byte arrays off the channel
        ;; (doall so the lazy `for` starts them right away):
        counters (doall
                  (for [_ (range 4)]
                    (go-loop [newline-count 0]
                      (let [barray (<! channel)]
                        (if (nil? barray) ;; channel is closed and drained
                          newline-count
                          (recur (+ newline-count
                                    (count-newlines barray))))))))]
    (go-loop []
      (let [barray     (byte-array one-meg) ;; 1024*1024
            bytes-read (.read file-stream barray)]
        (if (pos? bytes-read) ;; .read returns -1 on EOF
          (do
            ;; this put parks once 500 buffers (~500 MB) are waiting in the
            ;; channel, so as to not engorge the heap (learnt the hard way)
            (>! channel barray)
            (recur)) ;; keep going until EOF
          (close! channel))))
    (reduce + (map <!! counters))))