Max max-mapper

max-mapper / index.js
Last active August 9, 2017 02:12
crossref paginated works metadata streaming archiver
var request = require('request')
var base = 'https://api.crossref.org/works?filter=type:dataset&rows=1000'
doNext()
function doNext (cursor) {
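  // '*' asks Crossref to start a new deep-paging cursor
  if (!cursor) cursor = '*'
  // later cursor values can contain characters such as '+' and '/', so encode them
  var url = base + '&cursor=' + encodeURIComponent(cursor)
  console.error('GET', url)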
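The preview above is cut off before the request is made. As a rough sketch of how cursor-based paging against this endpoint can be completed, the loop below follows `message['next-cursor']` from each response until a page comes back empty. `archive`, `getJSON`, `onItem`, and `done` are hypothetical names introduced here for illustration; they are not taken from the gist, and `getJSON(url, cb)` is assumed to call `cb(err, parsedBody)`.

```javascript
var base = 'https://api.crossref.org/works?filter=type:dataset&rows=1000'

// Sketch only. getJSON(url, cb) is a hypothetical helper that fetches a URL
// and calls cb(err, parsedBody); onItem and done are also illustrative names.
function archive (getJSON, onItem, done) {
  doNext('*') // '*' starts a fresh cursor

  function doNext (cursor) {
    var url = base + '&cursor=' + encodeURIComponent(cursor)
    getJSON(url, function (err, body) {
      if (err) return done(err)
      var message = body.message || {}
      var items = message.items || []
      // an empty page means the result set is exhausted
      if (items.length === 0) return done(null)
      items.forEach(onItem)
      var next = message['next-cursor']
      if (!next) return done(null)
      doNext(next)
    })
  }
}
```

With the `request` module already required by the gist, `getJSON` could plausibly wrap `request({ url: url, json: true }, ...)`, and `onItem` could write one JSON line per record to stdout.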
max-mapper / data.csv
Last active January 9, 2019 16:31
eLife dataset DOIs (specifically linked as 'external datasets' from the paper, most of which live in external repositories)
doi,id,type,url
10.7554/eLife.00007,dataro1,generated-dataset,http://dx.doi.org/10.5061/dryad.gs45f
10.7554/eLife.00048,dataro1,generated-dataset,http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE40298
10.7554/eLife.00049,dataro1,generated-dataset,http://www.ncbi.nlm.nih.gov/genbank/
10.7554/eLife.00065,dataro1,generated-dataset,http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE39313http://www.ncbi.nlm.nih.gov/geo/
10.7554/eLife.00170,dataro1,generated-dataset,http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE41937http://eisenlab.org/data/TAF7Lhttp://www.ncbi.nlm.nih.gov/geo/
10.7554/eLife.00170,dataro2,generated-dataset,http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE21365http://www.ncbi.nlm.nih.gov/geo/
10.7554/eLife.00170,dataro3,generated-dataset,http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE27450http://www.ncbi.nlm.nih.gov/geo/
10.7554/eLife.00170,dataro4,generated-dataset,http://trace.ddbj.nig.ac.jp/DRASearch/study?acc=DRP000383http://www.ddbj.nig.ac.jp/
10.7554/eLife.00184,dataro1,
max-mapper / urls.txt
Created August 8, 2017 22:25
Scihub DOI resolved URLs by frequency
15030476 linkinghub.elsevier.com
9053559 link.springer.com
7949710 doi.wiley.com
3749242 ieeexplore.ieee.org
3468507 www.tandfonline.com
2005530 academic.oup.com
2000344 www.jstor.org
1662232 content.wkhealth.com
1498775 www.degruyter.com
1438236 pubs.acs.org
max-mapper / csv.js
Last active August 1, 2018 16:31
streaming merge sort of two line delimited files (csv and json lines)
// output of above script pipe into here, converts it to smaller csv
var split = require('split2')
var through = require('through2')
console.log('doi,url')
var splitter = split()
var each = through(function (buf, enc, next) {
  var _
  try {
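The preview is truncated here, so the merge step itself is not visible. As a minimal sketch of the technique named in the description, the generator below merges two line sequences that are each assumed to be already sorted by DOI, emitting lines in key order without buffering either input. `keyOf` and `mergeSorted` are hypothetical names, and the field layout (a `doi` property in the JSON lines, the DOI in the first CSV column) is an assumption, not something the gist confirms.

```javascript
// Hypothetical key extractor: JSON lines are assumed to carry a "doi" field,
// CSV lines are assumed to carry the DOI in the first column.
function keyOf (line) {
  if (line[0] === '{') return JSON.parse(line).doi
  return line.split(',')[0]
}

// Merge two iterables of lines, each already sorted by keyOf(line).
// Only one line from each input is held at a time.
function * mergeSorted (linesA, linesB) {
  var a = linesA[Symbol.iterator]()
  var b = linesB[Symbol.iterator]()
  var x = a.next()
  var y = b.next()
  while (!x.done && !y.done) {
    if (keyOf(x.value) <= keyOf(y.value)) {
      yield x.value
      x = a.next()
    } else {
      yield y.value
      y = b.next()
    }
  }
  // drain whichever input still has lines left
  while (!x.done) { yield x.value; x = a.next() }
  while (!y.done) { yield y.value; y = b.next() }
}
```

For example, `Array.from(mergeSorted(csvLines, jsonLines))` interleaves the two inputs by DOI. With real files the same comparison logic would need to sit behind a line-splitting stream, such as the `split2` module the gist requires.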
max-mapper / download.js
Last active August 1, 2018 16:31
data.gov metadata downloader
var request = require('request')
var link = process.argv[2]
var start = process.argv[3]
if (!start) start = 0
else start = +start
dl(link, start, function (err) {
  if (err) throw err
  console.error('All done')
})
max-mapper / links.json
Last active November 11, 2023 03:11
all wikipedia zim torrent links
max-mapper / bibtex.png
Last active November 6, 2024 09:03
How to make a scientific looking PDF from markdown (with bibliography)
max-mapper / epaurls.txt
Created April 24, 2017 16:42
epa edg metadata
639 "https://www.epa.gov/enviroatlas/enviroatlas-data"
276 "https://enviroatlas.epa.gov/arcgis/rest/services"
194 "https://enviroatlas.epa.gov/arcgis/rest/services/Communities"
142 "https://edg.epa.gov/metadata/"
118 "https://www3.epa.gov/enviro/html/fii/downloads/state_files/Facility%20State%20File%20Documentation.pdf"
54 "http://www.epa.gov/geospatial/"
41 "https://edg.epa.gov/clipship/"
34 "https://epa.maps.arcgis.com/home/webmap/viewer.html?&url=https%3A%2F%2Fenviroatlas.epa.gov%2Farcgis%2Frest%2Fservices%2FSupplemental%2FConnectivity_AllCommunities%2FMapServer"
34 "https://enviroatlas.epa.gov/arcgis/rest/services/Supplemental/Connectivity_AllCommunities/MapServer/kml/mapImage.kmz"
34 "https://enviroatlas.epa.gov/arcgis/rest/services/Supplemental/Connectivity_AllCommunities/MapServer?f=nmf"
max-mapper / index.js
Created April 18, 2017 23:25
imessage sqlite export
// modified version of http://va2577.github.io/post/51/ to produce ndjson
const sqlite3 = require('sqlite3').verbose();
const db = new sqlite3.Database('../imessage.sqlite')
const Iconv = require('iconv').Iconv;
const sjis = new Iconv('UTF-8', 'Shift_JIS//TRANSLIT//IGNORE');
const filename = './sms.csv';
db.serialize(() => {
  const sql = [];
max-mapper / readme.md
Created April 10, 2017 20:52
data.gov web mirror

proposal 1

CKAN mirror

deploy a full-fledged CKAN mirror as a live, database-backed web service and load it with all the data.gov metadata

pros

  • could mirror the data.gov functionality completely, including all the rendered views
  • brings the full CKAN feature set, including search, user accounts, and plugins