Daniel Bershatsky daskol

@daskol
daskol / gensitemap.py
Last active July 11, 2019 22:16
Simple script to generate sitemap for a specified domain.
#!/usr/bin/env python3
# encoding: utf8
# filename: gensitemap.py
"""Simple script to generate a sitemap for a specified domain. It traverses all
HTML files under a given root directory and builds URLs.
"""
import logging
import xml.etree.ElementTree as etree
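The preview above stops at the imports. As a minimal sketch of the approach the docstring describes (walk a root directory, collect HTML files, emit one URL per file), the following uses only the standard library. The function name `build_sitemap`, its parameters, and the choice to emit only `<loc>` entries are assumptions for illustration, not the gist's actual code.

```python
import xml.etree.ElementTree as etree
from pathlib import Path

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(root_dir, base_url):
    """Return sitemap XML listing every *.html file found under root_dir.

    Hypothetical helper: the name and signature are illustrative only.
    """
    root = Path(root_dir)
    urlset = etree.Element("urlset", xmlns=SITEMAP_NS)
    # Sort paths so the output is deterministic across runs.
    for path in sorted(root.rglob("*.html")):
        relative = path.relative_to(root).as_posix()
        url = etree.SubElement(urlset, "url")
        etree.SubElement(url, "loc").text = f"{base_url.rstrip('/')}/{relative}"
    return etree.tostring(urlset, encoding="unicode")
```

Calling `build_sitemap("public", "https://example.com")` would return the XML as a string, which can then be written to `sitemap.xml` next to the site root.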
@daskol
daskol / fix-nbcounter.py
Created July 23, 2019 16:42
Fix execution counter of Jupyter notebook.
#!/usr/bin/env python3
# encoding: utf8
# filename: fix-nbcounter.py
"""This simple script loads a Jupyter notebook and fixes its execution counter.
"""
from json import load, dump
from sys import argv
with open(argv[1]) as fin:
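The preview is cut off right after the file is opened. One plausible reading of "fix execution counter" is to renumber code cells sequentially from 1; the sketch below assumes that interpretation and the standard notebook JSON layout (`cells`, `cell_type`, `execution_count`, `outputs`). The function names are hypothetical and the gist itself may do something different.

```python
from json import dump, load


def renumber_execution_counts(notebook):
    """Renumber code cells 1, 2, 3, ... in document order (in place)."""
    counter = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # Markdown and raw cells carry no execution counter.
        counter += 1
        cell["execution_count"] = counter
        # Outputs of type execute_result repeat the counter; keep them in sync.
        for output in cell.get("outputs", []):
            if "execution_count" in output:
                output["execution_count"] = counter
    return notebook


def fix_file(path):
    """Load a notebook from `path`, renumber it, and write it back."""
    with open(path) as fin:
        notebook = load(fin)
    renumber_execution_counts(notebook)
    with open(path, "w") as fout:
        dump(notebook, fout, indent=1)
```

With this sketch, a command-line entry point would simply call `fix_file(argv[1])`.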
@daskol
daskol / mnist-with-jax.ipynb
Created August 9, 2019 09:27
Fix MNIST with Jax
#!/usr/bin/env bash
# download.sh
declare -a archives=(
"https://data-static.usercontent.dev/DataClusteringSample0107.tar.gz"
"https://data-static.usercontent.dev/DataClusteringSample0817.tar.gz"
"https://data-static.usercontent.dev/DataClusteringSample1821.tar.gz"
)
for archive in "${archives[@]}"; do
@daskol
daskol / PKGBUILD
Created July 12, 2020 01:13
AUR Package for Apache Arrow
# Maintainer: Daniel Bershatsky <[email protected]>
pkgname=apache-arrow
pkgver=0.17.1
pkgrel=1
pkgdesc="Language-independent columnar memory format for flat and hierarchical data"
arch=('x86_64')
url='https://arrow.apache.org/'
license=('Apache')
depends=('boost-libs' 'jemalloc')
@daskol
daskol / tensorflow_dataset_unzip.py
Created March 8, 2021 19:18
Unzip a single input TensorFlow dataset to multiple ones.
from typing import Tuple

import tensorflow as tf

def unzip_dataset(dataset) -> Tuple[tf.data.Dataset, ...]:
    """Take a single dataset whose element spec is a tuple and return one
    dataset per component of that tuple.
    :param dataset: Input dataset.
    :return: Output tuple of datasets.
    """
@daskol
daskol / wait4pid.py
Last active June 17, 2021 15:47
Event multiplexing on process identifiers (PIDs)
#!/usr/bin/env python3
"""This script demonstrates event multiplexing on process identifiers (PIDs).
More specifically, we obtain a file descriptor (FD) from a PID and then wait
for events on that descriptor. Polling such events is feasible because the
pidfd_open() system call returns an FD that refers to a process. This system
call was introduced in Linux 5.3 (Sep 2019). A usage example is below.
$ ./wait4pid.py 1337 &
$ kill -9 1337
process with pid 1337 exited
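The preview ends inside the docstring, so the body of the script is not shown. A minimal sketch of the technique it describes, assuming Python 3.9+ (which exposes the system call as `os.pidfd_open`) on Linux 5.3+, might look like this. The function name `wait_for_exit` and its `timeout` parameter are illustrative, not taken from the gist.

```python
import os
import select


def wait_for_exit(pid, timeout=None):
    """Block until process `pid` exits; return True if it exited in time.

    `timeout` is in seconds; None means wait indefinitely.
    """
    # Raises OSError if the PID does not exist or the kernel lacks pidfd_open().
    fd = os.pidfd_open(pid)
    try:
        poller = select.poll()
        # A pidfd becomes readable once the process has terminated.
        poller.register(fd, select.POLLIN)
        timeout_ms = None if timeout is None else int(timeout * 1000)
        return bool(poller.poll(timeout_ms))
    finally:
        os.close(fd)
```

Because the pidfd is an ordinary file descriptor, several of them can be registered with the same `poll` object, which is what makes multiplexing over many PIDs possible.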
@daskol
daskol / jax-free.py
Created August 12, 2021 14:26
Free JAX/XLA buffers whose size exceeds a threshold
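import jax

def collect(threshold=256 * 1024 ** 2):
    # Delete every live buffer of at least `threshold` bytes (256 MiB by default).
    backend = jax.lib.xla_bridge.get_backend()
    freed = 0
    for buf in backend.live_buffers():
        if buf.nbytes >= threshold:
            freed += buf.nbytes  # Accumulate the number of bytes released.
            buf.delete()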
@daskol
daskol / pytorch-graph-summary.py
Created December 3, 2021 15:49
Add PyTorch graph summary to TensorBoard.
import torch as T
model = T.nn.Sequential()
model.add_module('W0', T.nn.Linear(128, 10))
# Module names must be unique: reusing 'tanh' would replace the first activation.
model.add_module('tanh0', T.nn.Tanh())
model.add_module('W1', T.nn.Linear(10, 5))
model.add_module('tanh1', T.nn.Tanh())
model.add_module('W2', T.nn.Linear(5, 1))
input = T.randn(16, 128)
@daskol
daskol / visualise-backward-pass.py
Last active December 3, 2021 20:36
Visualization of backward pass graph for simple model in PyTorch
#!/usr/bin/env python3
# Run this script and then the command below.
#
# dot -Tpng -ograph.png graph.dot
#
import torch as T
from torchviz import make_dot
model = T.nn.Sequential(