Darjan Salaj dsalaj

dsalaj / np_tolerant_mean.py
Created November 13, 2019 08:06
Calculate mean of list of arrays with different lengths (useful for plotting progress of incomplete simulation runs)
import numpy as np

x = [1, 2, 3.5, 4]
y = [1, 2, 3, 3, 4, 5, 3]
z = [7, 8]
arrs = [x, y, z]

def tolerant_mean(arrs):
    # arrs = [x, y, z]
    lens = [len(i) for i in arrs]
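
The preview stops once the lengths are collected. One way a length-tolerant mean could be finished is sketched below. This is an assumption about the approach, not the gist's actual code: the name `tolerant_mean_sketch` and the NaN-padding strategy are illustrative. Shorter arrays are padded with NaN, and `np.nanmean` then averages only the runs that reached each step.

```python
import numpy as np

def tolerant_mean_sketch(arrs):
    """Column-wise mean and std over sequences of unequal length (hypothetical helper)."""
    lens = [len(a) for a in arrs]
    # One row per sequence, padded with NaN past its end.
    padded = np.full((len(arrs), max(lens)), np.nan)
    for row, a in enumerate(arrs):
        padded[row, :len(a)] = a
    # NaN-aware reductions ignore the padding, so each column
    # is averaged over the sequences that are long enough.
    return np.nanmean(padded, axis=0), np.nanstd(padded, axis=0)

x = [1, 2, 3.5, 4]
y = [1, 2, 3, 3, 4, 5, 3]
z = [7, 8]
mean, std = tolerant_mean_sketch([x, y, z])
print(mean)
```

For plotting incomplete simulation runs, `mean` can be drawn as the line and `mean - std` to `mean + std` as a shaded band. The band collapses to zero width wherever only one run is left.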
dsalaj / keybase.md
Created November 30, 2019 21:39

Keybase proof

I hereby claim:

  • I am dsalaj on github.
  • I am dsalaj (https://keybase.io/dsalaj) on keybase.
  • I have a public key ASBBDtsHlfUOlqCUF48dL0qkNY-lWwrLC2dbOWrHjNMYrwo

To claim this, I am signing this object:

dsalaj / tf_ds_from_parametrized_generator.py
Created February 7, 2020 09:47
Example of tf.data.Dataset.from_generator usage with parametrized generator
import tensorflow as tf

x_train = [i for i in range(0, 20, 2)]  # even
x_val = [i for i in range(1, 20, 2)]  # odd
y_train = [i**2 for i in x_train]  # squared
y_val = [i**2 for i in x_val]

def gen_data_epoch(test=False):  # parametrized generator
    train_data = x_val if test else x_train
    label_data = y_val if test else y_train
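
The preview is cut off before the generator yields anything. The sketch below shows one way the pieces might fit together. The generator body (yielding one `(input, label)` pair per step) and the lambda used to bind `test=True` are assumptions, not the gist's code. The TensorFlow part assumes TF 2.4 or later, where `from_generator` accepts `output_signature`, and is skipped when TensorFlow is not installed.

```python
x_train = list(range(0, 20, 2))    # even
x_val = list(range(1, 20, 2))      # odd
y_train = [i**2 for i in x_train]  # squared
y_val = [i**2 for i in x_val]

def gen_data_epoch(test=False):
    train_data = x_val if test else x_train
    label_data = y_val if test else y_train
    # Assumed body: yield one (input, label) pair per step.
    for x, y in zip(train_data, label_data):
        yield x, y

# The generator can be checked without TensorFlow.
first_val = list(gen_data_epoch(test=True))[:2]
print(first_val)

try:
    import tensorflow as tf
except ImportError:
    tf = None

if tf is not None:
    signature = (tf.TensorSpec(shape=(), dtype=tf.int32),
                 tf.TensorSpec(shape=(), dtype=tf.int32))
    # from_generator expects a zero-argument callable,
    # so the parameter is bound with a lambda.
    val_ds = tf.data.Dataset.from_generator(
        lambda: gen_data_epoch(test=True), output_signature=signature)
    for x, y in val_ds.take(2):
        print(int(x), int(y))
```

`from_generator` also has an `args` parameter for passing values to the generator. Those values are converted to tensors first, so binding with a lambda or `functools.partial` is often the simpler choice for a plain Python flag.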
dsalaj / slurmjob.sh
Created February 21, 2020 09:15
Example of slurm job script. Start with: sbatch slurmjob.sh
#!/bin/bash
#SBATCH --job-name=GSC # Job name
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=salaj.au@gmail.com # Where to send mail
#SBATCH --output=slurm_out_%j.log # Standard output and error log
#SBATCH --nodes=1
#SBATCH --exclusive
#SBATCH --partition=IGIcrunchers
conda activate venv2
dsalaj / tf_dataset_split_util.py
Created March 23, 2020 18:05
Different ways of splitting tensorflow dataset
def split_dataset(ds, version=1):
    if version == 1:
        # concatenate() returns a new dataset, so the result must be reassigned
        train_ds = ds.dataset.shard(num_shards=4, index=0)
        train_ds = train_ds.concatenate(ds.dataset.shard(num_shards=4, index=1))
        train_ds = train_ds.concatenate(ds.dataset.shard(num_shards=4, index=2))
        valid_ds = ds.dataset.shard(num_shards=4, index=3)
        return train_ds, valid_ds
    elif version == 2:
        def is_val(x, y):
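
To make the shard-based split of version 1 concrete, here is a plain-Python stand-in that needs no TensorFlow. The helpers `shard` and `split_by_shards` are hypothetical names. They mimic the documented behavior of `tf.data.Dataset.shard` (keep every `num_shards`-th element, starting at `index`) on an ordinary list, with list concatenation standing in for `Dataset.concatenate`.

```python
def shard(items, num_shards, index):
    """Plain-Python stand-in: keep every num_shards-th element, starting at index."""
    return items[index::num_shards]

def split_by_shards(items, num_shards=4):
    """Use the last shard for validation and chain the others for training."""
    train = []
    for index in range(num_shards - 1):
        # Each shard is appended after the previous one, not interleaved.
        train = train + shard(items, num_shards, index)
    valid = shard(items, num_shards, num_shards - 1)
    return train, valid

train, valid = split_by_shards(list(range(12)))
print(train)  # [0, 4, 8, 1, 5, 9, 2, 6, 10]
print(valid)  # [3, 7, 11]
```

The training part comes out as three quarters of the data but no longer in its original order, so a shuffle after the split may be worth considering if order matters.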
dsalaj / pyspark_cheatsheet.py
Created June 26, 2020 08:48
Cheatsheet for pyspark
# filter with strings
df.filter(df.name.endswith('ice')).collect()
# [Row(age=2, name='Alice')]
# order with null values at the end
df.select(df.name).orderBy(df.name.desc_nulls_last()).collect()
# [Row(name='Tom'), Row(name='Alice'), Row(name=None)]
# filter by null
df.filter(df.height.isNotNull()).collect()