Anderson Banihirwe andersy005

:octocat:
View GitHub Profile
@4rzael
4rzael / main.md
Last active February 14, 2026 13:33
GIS with pySpark.
NOTE: Take a look at the comments below!

GIS with PySpark: A not-so-easy journey

Why would you do that?

Today, a lot of data is geolocated (meaning that it has a position in space). Such data is called GIS data.

We often need to run operations on it, such as aggregations, and many optimisations exist for doing so.

@crawles
crawles / Spark Dataframe Cheat Sheet.py
Last active December 19, 2025 20:11 — forked from evenv/Spark Dataframe Cheat Sheet.py
Cheat sheet for Spark Dataframes (using Python)
# A simple cheat sheet of Spark Dataframe syntax
# Current for Spark 1.6.1
# import statements
from pyspark.sql import SQLContext
from pyspark.sql.types import *
from pyspark.sql.functions import *
#creating dataframes
df = sqlContext.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"]) # from manual data
from threading import Thread
from time import sleep
import uuid
from dask.distributed import LocalCluster, Client
import dask.dataframe as dd
import pandas as pd
import pyspark
@andyvanee
andyvanee / .ssh_config
Last active November 30, 2023 04:19
Fix unix_listener too long for Unix domain socket
Host *
ControlPath ~/.ssh/control/%C
ControlMaster auto
@betatim
betatim / Kubernetes cluster monitoring (binder-prod)-1516879011895.json
Created January 25, 2018 11:19
State of the grafana.mybinder.org panels.
{
"__inputs": [],
"__requires": [
{
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "4.6.3"
},
{
@seraku24
seraku24 / 149909-playlist_youtube-vlc3patch.lua
Created May 16, 2018 09:56
VLC 3.x compatibility patch for 149909-playlist_youtube.lua
--[[
Youtube playlist importer for VLC media player 1.1 and 2.0
Copyright 2012 Guillaume Le Maout
Authors: Guillaume Le Maout
Contact: http://addons.videolan.org/messages/?action=newmessage&username=exebetche
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
@zonca
zonca / spawner.py
Last active May 14, 2020 06:15
Batchspawner configuration to launch Jupyter Notebooks on Comet computing nodes
import batchspawner
# The port for this process
c.JupyterHub.hub_port = 8081
# The ip for this process
c.JupyterHub.hub_ip = '127.0.0.1'
class SlurmSpawnerNoLocalUsers(batchspawner.SlurmSpawner):
"""Slurm Spawner that does not need local Unix users on the Hub server"""
@jessfraz
jessfraz / Dockerfile
Created December 28, 2018 22:54
Scrape best papers site
FROM python:2-alpine
RUN pip install \
beautifulsoup4 \
requests
COPY papers.py /usr/local/bin/
RUN chmod +x /usr/local/bin/papers.py
WORKDIR /root
@micahhausler
micahhausler / jq-filter.sh
Last active September 21, 2023 07:27
GitHub collaborator finder
# Go to https://developer.github.com/v4/explorer/ and enter the GraphQL query with the query variable:
# {"queryString": "your-githubuser-name"}
cat results.json |
jq '.data.user.repositories.edges[] | { Count: .node.collaborators.totalCount, Repo: .node.name} | select(.Count > 2)'
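For readers without jq, the same filter can be sketched in Python. This is only an approximation under stated assumptions: the response shape (`data.user.repositories.edges[].node` with `name` and `collaborators.totalCount`) is inferred from the jq expression above, and the function name `repos_with_collaborators` and the sample data are hypothetical.

```python
import json


def repos_with_collaborators(result, min_count=2):
    """Return {"Count", "Repo"} rows for repos with more than min_count collaborators.

    Mirrors the jq filter above. The nested key layout is an assumption
    inferred from that filter, not taken from a real API response.
    """
    rows = []
    for edge in result["data"]["user"]["repositories"]["edges"]:
        node = edge["node"]
        # "collaborators" may be null when the viewer lacks access; skip those.
        collaborators = node.get("collaborators") or {}
        count = collaborators.get("totalCount")
        if count is not None and count > min_count:
            rows.append({"Count": count, "Repo": node["name"]})
    return rows


# Hypothetical sample standing in for the contents of results.json.
sample = {
    "data": {"user": {"repositories": {"edges": [
        {"node": {"name": "alpha", "collaborators": {"totalCount": 5}}},
        {"node": {"name": "beta", "collaborators": {"totalCount": 1}}},
        {"node": {"name": "gamma", "collaborators": None}},
    ]}}}
}

if __name__ == "__main__":
    print(json.dumps(repos_with_collaborators(sample), indent=2))
```

To run it against a saved response, load the file with `json.load(open("results.json"))` and pass the result to the function in place of `sample`.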