@danstowell
danstowell / trim_audio_dataset.py
Last active August 6, 2024 10:18
Python script to enforce a maximum duration on a set of audio files. Long files are trimmed, without altering the sample data. The script can either convert all to FLAC, or preserve exact container type for each file.
#!/usr/bin/env python
# Script to enforce a maximum duration on a set of audio files.
# Long files are trimmed, without altering their file format.
# Uses ffmpeg. (So, that needs to be installed on your system).
# Tested in Ubuntu Linux v22.04.
# Written by Dan Stowell 2024.
# CC0: This work has been marked as dedicated to the public domain.
import os
import sys
import imageio
import numpy as np
import librosa as lr
import matplotlib.pyplot as plt
from tqdm import tqdm
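The preview stops before the trimming logic, so the following is only a minimal sketch of one way to cap a file's duration with ffmpeg while leaving the encoded audio untouched. It assumes ffmpeg's `-t` (maximum output duration) and `-c copy` (stream copy, no re-encoding) options; the helper names `build_trim_command` and `trim_file` and the example paths are hypothetical and are not taken from the gist.

```python
import subprocess
from pathlib import Path


def build_trim_command(src, dst, max_secs):
    """Return an ffmpeg argument list that keeps at most max_secs of audio."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", str(src),       # input file
        "-t", str(max_secs),  # stop writing after this many seconds
        "-c", "copy",         # stream copy: the audio is not decoded or re-encoded
        str(dst),
    ]


def trim_file(src, dst, max_secs):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_trim_command(src, dst, max_secs), check=True)


# Build (but do not run) the command for a hypothetical 60-second limit.
cmd = build_trim_command(Path("in/long.wav"), Path("out/long.wav"), 60)
print(" ".join(cmd))
```

Keeping the same file extension on `dst` lets ffmpeg pick the same container as the input. With stream copy on compressed formats the cut may land on a packet boundary, so the result can differ slightly from the requested duration.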
danstowell / bootstrap_example.py
Created July 28, 2022 06:44
example of bootstrap sampling to estimate confidence intervals on an accuracy measure
# example of bootstrap sampling to estimate confidence intervals on an accuracy measure
import numpy as np
nbootstraps = 500 # 50 # 500 # note that 50 is fast enough for development purposes, but I use 500 for final evaluation
# here's a VERY SHORT list of outcomes, each one reflecting whether sound X was correctly predicted or not
outcomes = np.array([1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0])
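The preview ends before the resampling step. As a hedged sketch of how the bootstrap could proceed from here: resample the outcomes with replacement, compute the accuracy of each replicate, and read a confidence interval off the percentiles. The percentile method, the 95% level, and the fixed seed are assumptions made for illustration, not necessarily what the gist does.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
nbootstraps = 500
outcomes = np.array([1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0])

# Each bootstrap replicate resamples the outcomes with replacement
# (same size as the original list) and records the resulting accuracy.
boot_acc = np.array([
    rng.choice(outcomes, size=len(outcomes), replace=True).mean()
    for _ in range(nbootstraps)
])

accuracy = outcomes.mean()
# Percentile-method 95% interval: the middle 95% of the replicate accuracies.
ci_low, ci_high = np.percentile(boot_acc, [2.5, 97.5])
print(f"accuracy = {accuracy:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With only 26 outcomes the interval is wide, which is the point of reporting it alongside the single accuracy figure.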
danstowell / gbifparakeet.py
Created October 25, 2019 13:49
Plot GBIF parakeet observation data
import os, sys, csv
import numpy as np
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt
import imageio
###################################################################################
# simple audio sync example by Dan Stowell Nov 2018
import librosa # lib... Rosa!
import os
import numpy as np
###############################################
maxlagsecs = 10 # the maximum offset between two audio files that will be considered
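The preview does not show how the offset is found. One common approach, assumed here rather than taken from the gist, is to cross-correlate the two signals and pick the lag with the highest correlation, ignoring lags beyond `maxlagsecs`. The sketch below uses synthetic noise instead of loading files with librosa; the function name `estimate_offset_secs` and the test signal are hypothetical.

```python
import numpy as np


def estimate_offset_secs(ref, other, sr, maxlagsecs=10):
    """Estimate how many seconds `other` is delayed relative to `ref`.

    A positive result means `other` starts later than `ref`.
    """
    maxlag = int(maxlagsecs * sr)
    # Full cross-correlation: entry i corresponds to lag i - (len(ref) - 1).
    xc = np.correlate(other, ref, mode="full")
    lags = np.arange(-(len(ref) - 1), len(other))
    keep = np.abs(lags) <= maxlag          # only consider offsets within the limit
    best = lags[keep][np.argmax(xc[keep])]
    return best / sr


# Synthetic check: delay a noise signal by 37 samples at a toy sample rate.
rng = np.random.default_rng(0)
sr = 100
ref = rng.standard_normal(2000)
delay = 37
other = np.concatenate([np.zeros(delay), ref])[: len(ref)]
offset = estimate_offset_secs(ref, other, sr, maxlagsecs=10)
print(offset)
```

`np.correlate` computes the correlation directly, which is slow for long recordings; an FFT-based correlation would be the usual substitute at real audio sample rates.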
danstowell / ReservoirSampleMatrix.py
Created April 26, 2018 09:38
Algorithm to sample random rows from a matrix that is too large to store in memory (Reservoir Random Sampling, Algorithm "R") https://en.wikipedia.org/wiki/Reservoir_sampling#Algorithm_R
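For readers unfamiliar with Algorithm R, here is a minimal standalone sketch, separate from the author's `ReservoirSampleMatrix` class. It keeps the first `k` rows, then replaces a random slot with row `i` with probability `k / (i + 1)`, so every row seen so far is equally likely to be in the sample. The function name `reservoir_sample` and the toy stream are hypothetical.

```python
import random


def reservoir_sample(rows, k, rng=random):
    """Return k rows drawn uniformly at random from an iterable of unknown length."""
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Pick a slot in 0..i (inclusive); only slots below k are in the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir


# The rows are generated lazily, so the full matrix never sits in memory.
stream = ([i, i * i] for i in range(10000))
sample = reservoir_sample(stream, 5, random.Random(0))
print(sample)
```

If the stream holds fewer than `k` rows, the function simply returns all of them.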
# by Dan Stowell 2014.
# This particular file implementing ReservoirSampleMatrix is published under the MIT Public Licence
# http://opensource.org/licenses/MIT
import numpy as np
import random
class ReservoirSampleMatrix:
    """This class helps you to take a uniform random sample from streaming data, using the 'Reservoir Sampling' technique."""
import numpy as np
from numpy import log
# we generate two independent "parent" processes, then for each one we independently create thinnings.
# we aim to have a distance measure that correctly clusters the two sets according to their parent.
# by Dan Stowell, copyright Feb 2018
parents = [np.cumsum(np.random.exponential(size=100)) for whichp in range(2)]
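The comments mention thinnings, but the preview ends before they are created. A hedged sketch of independent thinning: each event of a parent process is kept with a fixed probability, independently of the others. The keep probability of 0.5, the three thinnings per parent, and the helper name `thin` are assumptions for illustration; the distance measure itself is not shown in the preview and is not attempted here.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Two independent parent processes: cumulative sums of exponential gaps
# give the event times of a unit-rate Poisson process.
parents = [np.cumsum(rng.exponential(size=100)) for whichp in range(2)]


def thin(events, keep_prob, rng):
    """Keep each event independently with probability keep_prob."""
    mask = rng.random(len(events)) < keep_prob
    return events[mask]


# Hypothetical settings: three thinnings per parent, each keeping about half the events.
children = [[thin(p, 0.5, rng) for _ in range(3)] for p in parents]

for whichp, kids in enumerate(children):
    print(whichp, [len(k) for k in kids])
```

Every child is a subset of its parent's event times, which is what a good distance measure would need to pick up on to cluster the children by parent.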
danstowell / woodland_falsecolour.ipynb
Created October 16, 2017 14:11
Quick hack at a kind of false-colour long-duration spectrogram inspired by Towsey et al http://dx.doi.org/10.1016/j.procs.2014.05.063 - incomplete
danstowell / wikidata_scan_osm.py
Created March 22, 2017 22:27
Script to build Wikidata -> OpenStreetMap lookup table
import os, sys, re
from datetime import datetime
from imposm.parser import OSMParser
########################################################
#osmsourcelbl = 'greater-london'
#osmsourcelbl = 'great-britain'
osmsourcelbl = 'planet'
danstowell / schroeder.scd
Created February 2, 2016 14:52
Generate Schroeder-phase complexes in Supercollider
// schroeder-phase waveforms
s.boot
s.scope
// we'll do the simple flat-spectra case
// (The paper I saw this version in is "Phase effects on the perceived elevation of complex tones", <http://dx.doi.org/10.1121/1.3372753>.)
~nb = 5;
~nt = 250;
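The SuperCollider preview ends before any phases are computed. As a rough numerical illustration in Python (not a translation of the gist), the flat-spectrum case of Schroeder's phase rule for N equal-amplitude harmonics can be written as φ_n = −π n (n − 1) / N. The sign convention varies between papers, and the variable names and the choice of 20 harmonics below are hypothetical. The sketch compares the crest factor (peak divided by RMS) of a Schroeder-phase complex with that of a cosine-phase complex.

```python
import numpy as np


def harmonic_complex(n_harmonics, phases, n_samples=2048):
    """One period of a flat-spectrum harmonic complex with the given phases."""
    t = np.arange(n_samples) / n_samples        # one period of the fundamental
    n = np.arange(1, n_harmonics + 1)[:, None]  # harmonic numbers as a column
    return np.cos(2 * np.pi * n * t + phases[:, None]).sum(axis=0)


def crest_factor(x):
    """Peak absolute value divided by RMS level."""
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))


n_harmonics = 20  # hypothetical choice for this sketch
n = np.arange(1, n_harmonics + 1)

# Flat-spectrum Schroeder phases (negative sign convention assumed).
schroeder_phases = -np.pi * n * (n - 1) / n_harmonics
cosine_phases = np.zeros(n_harmonics)

crest_schroeder = crest_factor(harmonic_complex(n_harmonics, schroeder_phases))
crest_cosine = crest_factor(harmonic_complex(n_harmonics, cosine_phases))
print(crest_schroeder, crest_cosine)
```

The cosine-phase complex piles all harmonics up at one instant, while the Schroeder phases spread the energy across the period, giving a much flatter envelope.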