"""
Utility script to visualize embeddings using the tensorboard projector module.
Usage
-----
Dependencies : numpy, pillow, pandas, tensorflow
Call `prepare_projection(embedding, metadata, image_paths, ...)`, where :
- `embedding` is a 2D numpy array (`n_sample` x `dim_embedding`)
@martinsotir
martinsotir / Dockerfile
Last active March 14, 2018 18:48
patchwork_test_for_valera_1
# Instructions (requires docker):
# docker build -t patchwork .
# docker run -it --rm patchwork
FROM openjdk:8
RUN apt-get update && apt-get install -y apt-transport-https
# Install sbt
import itertools
import pandas as pd
def flatten_df(df, list_col, elem_col_name="elem"):
    """Convert a series of lists to individual rows, within a dataframe.

    Adapted from https://stackoverflow.com/a/48532692

    This function can be used on a dask dataframe:
    ```python
    df.map_partitions(lambda x: flatten_df(x, "list_col", elem_col_name="elem")).clear_divisions()
martinsotir / geotiff_tiling_intro_with_gdal.ipynb
Created August 2, 2018 05:40
Introduction to GDAL raster tiling
martinsotir / imagezipdataset.py
Created January 21, 2019 16:24
ImageZipDataset
import torch
from torch.utils.data import Dataset, DataLoader
import tarfile
import zipfile
from pathlib import Path
from PIL import Image
from tqdm import tqdm
from torchvision import transforms
import mmap
import torch.multiprocessing as mp
martinsotir / conda_4.6_powershell.md
Last active October 7, 2024 15:34
Enable conda in powershell

Enabling conda in Windows Powershell

First, in an administrator PowerShell prompt, enable unrestricted PowerShell script execution (see About Execution Policies):

set-executionpolicy unrestricted

Then make sure that the conda Scripts directory is in your Path.

martinsotir / ssh-multi.sh
Created January 27, 2019 20:44 — forked from dmytro/ssh-multi.sh
Start multiple synchronized SSH connections with Tmux
#!/bin/bash
# ssh-multi
# D.Kovalov
# Based on http://linuxpixies.blogspot.jp/2011/06/tmux-copy-mode-and-how-to-control.html
# a script to ssh multiple servers over multiple tmux panes
starttmux() {
    if [ -z "$HOSTS" ]; then
martinsotir / pandas_conditionnal_merge.py
Last active June 7, 2019 22:00
Conditional merge in pandas
import pandas as pd
def join_part(A, B, cond, left_on, right_on):
    # Equi-join one batch of A with B, then keep only the rows selected by `cond`.
    C = A.merge(B, left_on=left_on, right_on=right_on, how="inner", copy=False)
    return C[cond].copy()


def conditional_join(A, B, cond, left_on, right_on, batch_size=50000):
    # Step through A in slices of `batch_size` rows (no trailing empty batch).
    indices = range(0, len(A), batch_size)
    batches = (A.iloc[b_start : b_start + batch_size] for b_start in indices)
    merges = (join_part(subset, B, cond, left_on, right_on) for subset in batches)
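
The preview above is cut off after the `merges` generator is built, so the rest of the gist is not visible here. As a rough, self-contained sketch of how such a batched conditional join could be completed, the version below concatenates the per-batch results with `pd.concat`. The function names `join_part_sketch` and `conditional_join_sketch`, the final concatenation step, and the convention that `cond` is a callable taking the merged frame and returning a boolean mask are all assumptions made for illustration, not the author's code.

```python
import pandas as pd


def join_part_sketch(A, B, cond, left_on, right_on):
    # Equi-join one batch of A with B on the key columns.
    C = A.merge(B, left_on=left_on, right_on=right_on, how="inner")
    # ASSUMPTION: `cond` is a callable that maps the merged frame
    # to a boolean mask selecting the rows to keep.
    return C[cond(C)].copy()


def conditional_join_sketch(A, B, cond, left_on, right_on, batch_size=50000):
    # Process A in slices so the unfiltered equi-join is only ever
    # materialized for one batch at a time.
    starts = range(0, len(A), batch_size)
    parts = [
        join_part_sketch(A.iloc[s : s + batch_size], B, cond, left_on, right_on)
        for s in starts
    ]
    if not parts:
        # Empty A: run the join once so the result still has the merged columns.
        return join_part_sketch(A, B, cond, left_on, right_on)
    # ASSUMPTION: the filtered batches are simply stacked back together.
    return pd.concat(parts, ignore_index=True)


if __name__ == "__main__":
    # Hypothetical data: keep events whose time `t` falls inside [lo, hi].
    events = pd.DataFrame({"k": [1, 1, 2, 3], "t": [5, 15, 7, 1]})
    windows = pd.DataFrame({"k": [1, 2, 2], "lo": [0, 0, 10], "hi": [10, 5, 20]})
    in_window = lambda C: (C["t"] >= C["lo"]) & (C["t"] <= C["hi"])
    print(conditional_join_sketch(events, windows, in_window, "k", "k", batch_size=2))
```

Batching trades some speed for a lower peak memory footprint: the result should not depend on `batch_size`, only the size of the largest intermediate merge does.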
# WIP
# Inspired from Keras and https://towardsdatascience.com/how-to-visualize-convolutional-features-in-40-lines-of-code-70b7d87b0030
from pathlib import Path
import torch
import torchvision.utils as vutils
import matplotlib.pyplot as plt
from torchvision.utils import make_grid
import cv2
import numpy as np

Disk I/O benchmarking

Required packages:

sudo apt install fio ioping smartmontools

List volumes and partitions: