@robinkraft
Created October 30, 2012 20:18
aggregating GBIF occurrence data by scientific name
;; see https://github.com/MapofLife/gbifer
;; use with https://s3.amazonaws.com/gbifsource/occ.txt
(use 'gulo.gbif)
(in-ns 'gulo.gbif)
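(defn read-occurrences
  "Return a query over the GBIF occurrence file at `path`, emitting the
  scientific name, occurrence id, latitude and longitude of each line."
  ([]
     (read-occurrences "/mnt/hgfs/Dropbox/code/github/MapofLife/gbifer/resources/gbif/occ.txt"))
  ([path]
     (let [src (hfs-textline path)]
       (<- [?scientificname ?occurrenceid ?latitude ?longitude]
           (src ?line)
           (split-line ?line :>> occ-fields)))))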
(defbufferop wide-format
  "Aggregate occurrence id and lat/lon into wide format, from long"
  [tuples]
  (let [occ-ids (map first tuples)
        lats (map #(nth % 1) tuples)
        lons (map #(nth % 2) tuples)]
    [[occ-ids [lats lons]]]))
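;; A minimal plain-Clojure sketch of the same long-to-wide reshaping, which may
;; be handy for checking the output shape at the REPL without running a query.
;; `wide-format-seq` is a hypothetical helper name, not part of gulo.gbif.
(defn wide-format-seq
  "Given a seq of [occ-id lat lon] tuples, return [occ-ids [lats lons]]."
  [tuples]
  [(map first tuples)
   [(map second tuples)
    (map #(nth % 2) tuples)]])

(comment
  ;; Sample values are taken from the result shown at the end of this gist.
  (wide-format-seq [["242135095" "9.93166666667" "0.896666666667"]
                    ["244666043" "47.166668" "19.583334"]])
  ;; => [("242135095" "244666043")
  ;;     [("9.93166666667" "47.166668") ("0.896666666667" "19.583334")]]
  )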
(let [occ-src (read-occurrences)]
  (??<- [?sci-name ?occ-ids ?latlons]
        (occ-src ?sci-name ?occ-id ?lat-str ?lon-str)
        (valid-name? ?sci-name)
        (latlon-valid? ?lat-str ?lon-str)
        (wide-format ?occ-id ?lat-str ?lon-str :> ?occ-ids ?latlons)))
;; results in this:
(["Acidobacteria" ("242135095" "244666043" "244664083" "244662123" "244661391" "244662763" "242136127" "242135147" "242135541" "244661001")
  [("9.93166666667" "47.166668" "-40.8747" "15.509722" "73.967" "-43.288" "31.41666667" "41.526565" "41.576233" "53.49027")
   ("0.896666666667" "19.583334" "170.851" "73.88306" "-140.088" "-175.5532" "35.41666667" "-70.670762" "-70.6336" "6.144433")]])
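;; A sketch of unpacking one result tuple back into per-occurrence records,
;; assuming the [name occ-ids [lats lons]] shape shown above. `unpack-result`
;; is a hypothetical helper, not part of gulo.gbif; coordinates are parsed
;; from strings with Double/parseDouble.
(defn unpack-result
  [[sci-name occ-ids [lats lons]]]
  (map (fn [id lat lon]
         {:name sci-name
          :occ-id id
          :lat (Double/parseDouble lat)
          :lon (Double/parseDouble lon)})
       occ-ids lats lons))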