@philshem
philshem / stackexchange_tag_usage.csv
Created March 28, 2019 16:38
tag counts for all stackexchange network sites
url,tagname,tagcount
"http://3dprinting.StackExchange.com/tags/101-hero|3dprinting","101-hero","1"
"http://3dprinting.StackExchange.com/tags/123d-catch|3dprinting","123d-catch","2"
"http://3dprinting.StackExchange.com/tags/2d|3dprinting","2d","4"
"http://3dprinting.StackExchange.com/tags/3d-design|3dprinting","3d-design","131"
"http://3dprinting.StackExchange.com/tags/3d-models|3dprinting","3d-models","152"
"http://3dprinting.StackExchange.com/tags/3d-pen|3dprinting","3d-pen","1"
"http://3dprinting.StackExchange.com/tags/3d-printerworks|3dprinting","3d-printerworks","1"
"http://3dprinting.StackExchange.com/tags/3dtouch|3dprinting","3dtouch","2"
"http://3dprinting.StackExchange.com/tags/abs|3dprinting","abs","66"
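The `url` column packs two values into one field: the tag-page URL and the site slug, joined by `|` (visible in every row above). A minimal stdlib sketch for splitting them back apart when loading the file (the helper name is an illustration, not part of the gist):

```python
def split_url_field(url_field):
    # The url column bundles "tag-page-url|site-slug" into one value;
    # split on the first '|' to recover both parts.
    url, site = url_field.split('|', 1)
    return url, site
```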
@philshem
philshem / follower_count_201904.csv
Last active April 7, 2019 13:55
twitter follower counts for swiss media (04.2019)
handle count
@20min 394346
@nzz 393044
@blickch 249295
@blickamabend 183293
@tagesanzeiger 178033
@watson_news 118861
@Lematinch 102817
@tdgch 88920
@24heuresch 66644
#!/usr/bin/env python
# coding=utf-8
ignore_list = ('Search-Navigation', 'Tools-What links', 'Top-', 'Contents', 'Magyar')

with open('AcronymsFile.csv', 'r') as inp:
    data = inp.read().split('\n')

with open('clean_AcronymsFile.csv', 'w') as out:
    out.write('acronym' + '\t' + 'definition' + '\n')
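The preview stops right after the header row is written. A minimal sketch of how the filtering loop might continue, assuming the raw file holds lines like `0D - Zero-dimensional` and that any line containing an `ignore_list` entry is wiki-navigation noise to drop (the helper name and the line format are assumptions, not from the gist):

```python
def clean_lines(lines, ignore_list):
    # Yield (acronym, definition) pairs, skipping navigation noise
    # and any line that does not split into exactly two fields.
    for line in lines:
        if any(term in line for term in ignore_list):
            continue
        parts = line.split(' - ', 1)
        if len(parts) == 2:
            yield parts[0].strip(), parts[1].strip()
```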
@philshem
philshem / clean_AcronymFile.csv
Last active August 10, 2022 05:59
cleanup script and csv file (needs some cleaning) based on https://github.com/krishnakt031990/Crawl-Wiki-For-Acronyms
acronym definition
0D Zero-dimensional
1AM Air mechanic 1st class
1D One-dimensional
2AM Air mechanic 2nd class
2D Two-dimensional
2G Second-generation mobile (cellular, wireless) telephone system
2LA Two letter acronym
2Lt 2nd lieutenant
3AM Air mechanic 3rd class
@philshem
philshem / get_top500_favicons.py
Created April 16, 2019 16:27
Download top500 favicons from csv
import requests
import pandas as pd
import os
from io import StringIO

def request_function(domain):
    # Google's favicon service returns a small PNG for any domain.
    domain = domain.replace('/', '')
    url = 'https://www.google.com/s2/favicons?domain=' + domain
    fav = requests.get(url).content
    with open('images' + os.sep + domain + '.png', 'wb') as handler:
        handler.write(fav)  # preview truncated here; writing the fetched bytes is the assumed final step
@philshem
philshem / cadima_clean_metadata.py
Last active May 21, 2019 09:24
Python3 script to clean non-ascii characters from the PDF "Title" metadata field.
# requires python3.x and one non-standard module: `pip install pdfrw`
# pdfs should be in a folder relative to this code, named `pdfs`
import os
from pdfrw import PdfReader, PdfWriter
from glob import glob
import unicodedata

def edit_title_metadata(inpdf):
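The function body is cut off in the preview. The core of "clean non-ascii characters" can be sketched with the already-imported `unicodedata` module; the helper below is an illustration of that step, not the gist's actual code:

```python
import unicodedata

def to_ascii(text):
    # NFKD splits accented characters into base letter + combining mark;
    # encoding to ASCII with errors='ignore' then drops everything non-ASCII.
    normalized = unicodedata.normalize('NFKD', text)
    return normalized.encode('ascii', 'ignore').decode('ascii')
```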
@philshem
philshem / get_nobel_prize_years_until_death.py
Last active October 8, 2019 16:32
Nobel prize winners and their years until death.
#!/usr/bin/env python3
# gets demographics for nobel prize winners
# calculates yearly average of how many years between prize and death
import pandas as pd
import numpy as np

# api endpoint for all nobel winners: https://nobelprize.readme.io/
url = 'http://api.nobelprize.org/v1/laureate.csv'
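The script is truncated after the download URL, but the core arithmetic can be sketched without pandas. This assumes the v1 CSV reports the prize `year` and a `died` date as `YYYY-MM-DD`, with `0000-00-00` marking living laureates (field formats are assumptions about the public API, not shown in the gist):

```python
def years_until_death(prize_year, died):
    # `died` is a 'YYYY-MM-DD' string; living laureates are marked
    # '0000-00-00', for which no interval can be computed.
    if died.startswith('0000'):
        return None
    return int(died[:4]) - int(prize_year)
```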
@philshem
philshem / swiss_housing_dataviz.ipynb
Created November 3, 2019 19:22
swiss_housing_dataviz.ipynb
@philshem
philshem / scrape_bee.py
Last active November 4, 2019 11:42
NYTimes Spelling Bee scraper 🐝☠️
#!/usr/bin/env python3
import requests
from bs4 import BeautifulSoup
import json

def main():
    # the answers are stored as a JSON object inside the page source
    url = 'https://www.nytimes.com/puzzles/spelling-bee'
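The preview ends at the URL. One way the scrape could continue is pulling the embedded JSON out of the page source with a regex; the `window.gameData` variable name is an assumption about the 2019-era page, not something shown in the gist:

```python
import json
import re

def extract_game_data(html):
    # The puzzle data is embedded as `window.gameData = {...}` inside a
    # <script> tag; capture the object literal and parse it as JSON.
    match = re.search(r'window\.gameData\s*=\s*(\{.*?\})\s*</script>', html, re.DOTALL)
    if not match:
        return None
    return json.loads(match.group(1))
```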