@monogenea
Last active April 6, 2020 05:55
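# Load stringr (str_extract, %>%) and caret (createFolds)
library(stringr)
library(caret)
# Encode species from the file names: extract "Genus-species" via regex,
# then replace the hyphen with a space
species <- str_extract(fnames, pattern = "[A-Za-z]+-[a-z]+") %>%
  gsub(pattern = "-", replacement = " ") %>%
  factor()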
# Stratified sampling by species: 10 folds, one each for val (10%) and
# test (10%), the remaining eight for train (80%)
set.seed(100)
idx <- createFolds(species, k = 10)
valIdx <- idx$Fold01
testIdx <- idx$Fold02
# Define samples for train, val and test
fnamesTrain <- fnames[-c(valIdx, testIdx)]
fnamesVal <- fnames[valIdx]
fnamesTest <- fnames[testIdx]
# Take multiple readings per sample for the train, val and test sets
# (audioProcess is assumed to be a user-defined helper defined elsewhere)
Xtrain <- audioProcess(files = fnamesTrain, ncores = 5,
                       limit = 20, ws = 10, stride = 5)
Xval <- audioProcess(files = fnamesVal, ncores = 5,
                     limit = 20, ws = 10, stride = 5)
Xtest <- audioProcess(files = fnamesTest, ncores = 5,
                      limit = 20, ws = 10, stride = 5)