radovankavicky / post-save-hook.py

Last active June 8, 2017 14:51 — forked from jbwhit/post-save-hook.py

Saves Jupyter Notebooks as .py and .html files automatically. Add to the ipython_notebook_config.py file of your associated profile.

	import os
	from subprocess import check_call

	def post_save(model, os_path, contents_manager):
	    """post-save hook for converting notebooks to .py and .html files."""
	    if model['type'] != 'notebook':
	        return  # only do this for notebooks
	    d, fname = os.path.split(os_path)
	    check_call(['jupyter', 'nbconvert', '--to', 'script', fname], cwd=d)
	    check_call(['jupyter', 'nbconvert', '--to', 'html', fname], cwd=d)
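
The hook only takes effect once it is registered in the config file, which Jupyter does through `c.FileContentsManager.post_save_hook = post_save`. As a minimal sketch of what the hook does on each save, the helper below splits the notebook path the same way and returns the two `jupyter nbconvert` commands without running them. `nbconvert_commands` and its `formats` parameter are hypothetical names for illustration and are not part of the gist.

```python
import os


def nbconvert_commands(os_path, formats=("script", "html")):
    """Return (working_dir, commands) that the post-save hook would run.

    Hypothetical dry-run helper: it only builds the argument lists and
    never calls jupyter itself.
    """
    # Same split as in post_save: directory becomes cwd, file name the argument.
    d, fname = os.path.split(os_path)
    commands = [["jupyter", "nbconvert", "--to", fmt, fname] for fmt in formats]
    return d, commands


if __name__ == "__main__":
    cwd, cmds = nbconvert_commands("/home/user/notebooks/analysis.ipynb")
    print(cwd)
    for cmd in cmds:
        print(" ".join(cmd))
```

Building the commands separately makes it easy to check the path handling before letting the hook call `check_call` on every save.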

radovankavicky / PyData Berlin 2017.md

Last active June 8, 2017 14:49

Gist for PyData Berlin 2017

2017 PyData Berlin

Data Science & Data Visualization in Python. How to harness power of Python for social good?

Radovan Kavický
President & Principal Data Scientist, GapData Institute
@radovankavicky

Watch the talk on YouTube here:

radovankavicky / japan_pyramid.R

Created September 25, 2017 15:27 — forked from walkerke/japan_pyramid.R

	library(idbr) # devtools::install_github('walkerke/idbr')
	library(ggplot2)
	library(animation)
	library(dplyr)
	library(ggthemes)

	idb_api_key("Your Census API key goes here")

	male <- idb1('JA', 2010:2050, sex = 'male') %>%
	  mutate(POP = POP * -1,

radovankavicky / tweet_listener.py

Created October 19, 2017 12:56 — forked from hugobowne/tweet_listener.py

Here I define a tweet listener that creates a file called 'tweets.txt', collects streaming tweets as JSON, and writes them to that file, one per line; once 100 tweets have been streamed, the listener closes the file and stops listening.

	import json

	import tweepy

	class MyStreamListener(tweepy.StreamListener):
	    def __init__(self, api=None):
	        super(MyStreamListener, self).__init__()
	        self.num_tweets = 0
	        self.file = open("tweets.txt", "w")

	    def on_status(self, status):
	        tweet = status._json
	        self.file.write( json.dumps(tweet) + '\n' )
	        self.num_tweets += 1
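
The preview above ends before the 100-tweet cutoff mentioned in the description. A library-free sketch of that counting logic could look like the following, assuming the listener signals "stop" by returning `False` from `on_status` (to the best of my knowledge, that is how tweepy 3.x listeners disconnect). `CountingJsonlWriter` and its `limit` parameter are hypothetical names used only for illustration, and the class deliberately does not depend on tweepy.

```python
import json


class CountingJsonlWriter:
    """Hypothetical stand-in for the listener's write-and-count logic."""

    def __init__(self, stream, limit=100):
        self.stream = stream  # any writable text file object
        self.limit = limit    # number of items to accept before stopping
        self.count = 0

    def on_status(self, payload):
        # Write one JSON document per line, then count it.
        self.stream.write(json.dumps(payload) + "\n")
        self.count += 1
        if self.count < self.limit:
            return True       # keep listening
        self.stream.close()   # limit reached: close the file
        return False          # and ask the caller to stop


if __name__ == "__main__":
    writer = CountingJsonlWriter(open("tweets.txt", "w"), limit=3)
    for i in range(5):
        if not writer.on_status({"id": i}):
            break
    print(writer.count)
```

Keeping the limit as a parameter makes the cutoff easy to lower while testing.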

radovankavicky / gg_tweet.R

Created October 23, 2017 18:29 — forked from hrbrmstr/gg_tweet.R

Use the magick device to make ggplots conform to the optimal Twitter card or in-stream image sizes, with or without "retina" resolution

	library(httr)
	library(magick)
	library(hrbrthemes)
	library(ggplot2)

	theme_tweet_rc <- function(grid = "XY", style = c("stream", "card"), retina=FALSE) {

	  style <- match.arg(tolower(style), c("stream", "card"))

	  switch(

radovankavicky / useful_pandas_snippets.py

Created December 1, 2017 11:21 — forked from bsweger/useful_pandas_snippets.md

Useful Pandas Snippets

	# List unique values in a DataFrame column
	# h/t @makmanalp for the updated syntax!
	df['Column Name'].unique()

	# Convert Series datatype to numeric (will error if column has non-numeric values)
	# h/t @makmanalp
	pd.to_numeric(df['Column Name'])

	# Convert Series datatype to numeric, changing non-numeric values to NaN
	# h/t @makmanalp for the updated syntax!

radovankavicky / ggplotrbokeh.R

Created December 5, 2017 23:04 — forked from hrbrmstr/ggplotrbokeh.R

ggplot <-> rbokeh

	library(ggplot2)
	library(rbokeh)
	library(htmlwidgets)

	structure(list(wk = structure(c(16069, 16237, 16244, 16251, 16279,
	16286, 16300, 16307, 16314, 16321, 16328, 16335, 16342, 16349,
	16356, 16363, 16377, 16384, 16391, 16398, 16412, 16419, 16426,
	16440, 16447, 16454, 16468, 16475, 16496, 16503, 16510, 16517,
	16524, 16538, 16552, 16559, 16566, 16573), class = "Date"), n = c(1L,
	1L, 1L, 1L, 3L, 1L, 3L, 2L, 4L, 2L, 3L, 2L, 5L, 5L, 1L, 1L, 3L,

radovankavicky / python-RData.py

Created December 18, 2017 13:51 — forked from LeiG/python-RData.py

Python and .RData files

	import rpy2.robjects as robjects
	# NOTE: pandas.rpy was removed in pandas 0.20; this import needs an older pandas
	import pandas.rpy.common as com
	import pandas as pd

	## load .RData and convert to pd.DataFrame
	robj = robjects.r.load('test.RData')
	# iterate over the datasets in the file
	for sets in robj:
	    myRData = com.load_data(sets)
	    # convert to DataFrame

radovankavicky / johnsnow_dataset_pumps_deaths_semi_xycombined.csv

Created December 20, 2017 12:42

Cholera Deaths and Pumps information from John Snow's 1854 map of the cholera outbreak in London. Each row represents a location (given in the geometry field) of either a Pump or a Death. The value given in the Count field is either -999 for a pump, or a value > 0 giving the number of deaths at that location. This has been imported from the shapefiles available at …

	Number of deaths;XY coordinates
	3;-0.13793,51.513418
	2;-0.137883,51.513361
	1;-0.137853,51.513317
	1;-0.137812,51.513262
	4;-0.137767,51.513204
	2;-0.137537,51.513184
	2;-0.1382,51.513359
	2;-0.138045,51.513328
	3;-0.138276,51.513323
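
Each row above uses `;` between the death count and the coordinate pair, while the pair itself is joined by a comma. A small sketch using only the standard library shows one way to read that layout into separate count, X, and Y values; `parse_deaths` is a hypothetical helper name, and the sample text is copied from the first rows of the file.

```python
import csv
import io

# First rows of the file, copied verbatim.
SAMPLE = """Number of deaths;XY coordinates
3;-0.13793,51.513418
2;-0.137883,51.513361
"""


def parse_deaths(text):
    """Return a list of (deaths, x, y) tuples from the semicolon-delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    next(reader)  # skip the header row
    rows = []
    for count, xy in reader:
        x, y = xy.split(",")  # the coordinate pair is comma-joined
        rows.append((int(count), float(x), float(y)))
    return rows


if __name__ == "__main__":
    for row in parse_deaths(SAMPLE):
        print(row)
```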

radovankavicky / johnsnow_dataset_pumps_deaths_comma_xydivided.csv

Created December 20, 2017 12:46

Cholera Deaths and Pumps information from John Snow's 1854 map of the cholera outbreak in London. Each row represents a location (given in the geometry field) of either a Pump or a Death. The value given in the Count field is either -999 for a pump, or a value > 0 giving the number of deaths at that location. This has been imported from the shapefiles available at …

Number of deaths	X coordinate	Y coordinate
3	-0.13793	51.513418
2	-0.137883	51.513361
1	-0.137853	51.513317
1	-0.137812	51.513262
4	-0.137767	51.513204
2	-0.137537	51.513184
2	-0.1382	51.513359
2	-0.138045	51.513328
3	-0.138276	51.513323

Radovan Kavicky radovankavicky
