Marius van Niekerk mariusvniekerk

@mariusvniekerk
mariusvniekerk / readme.md
Created November 29, 2017 15:05
binderhub-test-public

This is a test gist for public binderhub gists

@joshlk
joshlk / faster_toPandas.py
Last active July 22, 2024 14:15
PySpark faster toPandas using mapPartitions
import pandas as pd

def _map_to_pandas(rdds):
    """ Needs to be here due to pickling issues """
    return [pd.DataFrame(list(rdds))]

def toPandas(df, n_partitions=None):
    """
    Returns the contents of `df` as a local `pandas.DataFrame` in a speedy fashion. The DataFrame is
    repartitioned if `n_partitions` is passed.
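
The preview above is cut off, so the rest of the gist is not reproduced here. As a rough illustration of the idea the title names, the sketch below builds one `pandas.DataFrame` per partition and then concatenates them. It runs without Spark: the partitions are simulated as plain lists of row dicts, and `partition_to_frame`, `combine_partitions`, and the sample data are hypothetical names introduced only for this example. In Spark itself the per-partition step would presumably be run through `rdd.mapPartitions(...)` followed by `collect()`, which is an assumption based on the gist title rather than on the truncated code.

```python
import pandas as pd


def partition_to_frame(rows):
    """Convert one partition (an iterable of row dicts) into a one-element list
    holding a DataFrame, mirroring the shape returned by `_map_to_pandas`."""
    return [pd.DataFrame(list(rows))]


def combine_partitions(partitions):
    """Build a DataFrame per partition, then concatenate them into one frame.

    Hypothetical stand-in for collecting per-partition frames from a cluster.
    """
    frames = []
    for rows in partitions:
        frames.extend(partition_to_frame(rows))
    # ignore_index=True gives the combined frame a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)


# Simulated partitions (assumed sample data, not from the gist)
partitions = [
    [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}],
    [{"id": 3, "value": "c"}],
]
result = combine_partitions(partitions)
```

The design point is that each partition is converted to a DataFrame where it lives, so only a handful of already-built frames need to be stitched together at the end.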
@bradrydzewski
bradrydzewski / generate_docker_cert.sh
Last active May 27, 2024 15:59
Generate trusted CA certificates for running Docker with HTTPS
#!/bin/bash
#
# Generates client and server certificates used to enable HTTPS
# remote authentication to a Docker daemon.
#
# See http://docs.docker.com/articles/https/
#
# To start the Docker Daemon:
#
# sudo docker -d \