I hereby claim:
- I am thvasilo on github.
- I am tvas (https://keybase.io/tvas) on keybase.
- I have a public key whose fingerprint is BD7D 432D 4124 630C A4F2 061E 4AA5 5B32 660B 2CB2
To claim this, I am signing this object:
```bash
#!/usr/bin/env bash

set -Eeuo pipefail
trap cleanup SIGINT SIGTERM ERR EXIT

script_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd -P)

usage() {
  cat <<EOF
Usage: $(basename "${BASH_SOURCE[0]}") [-h] [-v] [-f] -p param_value arg1 [arg2...]
```
```bash
# Script to set up the environment and files for training XGBoost jobs
# on the master node of an MPI cluster created using AWS ParallelCluster

# Install personal-choice packages
sudo apt install -y tmux emacs-nox htop parallel

# Needed for dmlc-core (?)
sudo apt install -y libcurl4-openssl-dev libssl-dev

# Parallel compress/decompress because we work with large bzipped files
```
```python
import argparse
import multiprocessing as mp
import os
from operator import itemgetter
from collections import Counter
import functools
import json


def parse_args():
```
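The preview cuts off at `parse_args()`. As a sketch only, assuming a typical argparse setup (the argument names below are hypothetical, not the script's actual interface), such a function usually looks like this:

```python
import argparse
import multiprocessing as mp


def parse_args():
    # Hypothetical options for illustration; the real script's interface is not
    # shown in the truncated snippet above.
    parser = argparse.ArgumentParser(
        description="Example CLI skeleton (argument names are illustrative)")
    parser.add_argument("--input", required=True, help="Path to the input file")
    parser.add_argument("--output", required=True, help="Path to write results to")
    parser.add_argument("--workers", type=int, default=mp.cpu_count(),
                        help="Number of worker processes")
    return parser.parse_args()
```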
```
==18350== Invalid free() / delete / delete[] / realloc()
==18350==    at 0x4C2F74B: operator delete[](void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18350==    by 0x469786: datasketches::kll_sketch<float>::~kll_sketch() (kll_sketch.hpp:173)
==18350==    by 0x4A0056: void std::_Destroy<datasketches::kll_sketch<float> >(datasketches::kll_sketch<float>*) (stl_construct.h:93)
==18350==    by 0x494DF3: void std::_Destroy_aux<false>::__destroy<datasketches::kll_sketch<float>*>(datasketches::kll_sketch<float>*, datasketches::kll_sketch<float>*) (stl_construct.h:103)
==18350==    by 0x4890D4: void std::_Destroy<datasketches::kll_sketch<float>*>(datasketches::kll_sketch<float>*, datasketches::kll_sketch<float>*) (stl_construct.h:126)
==18350==    by 0x479AE0: void std::_Destroy<datasketches::kll_sketch<float>*, datasketches::kll_sketch<float> >(datasketches::kll_sketch<float>*, datasketches::kll_sketch<float>*, std::allocator<datasketches::kll_sketch<float> >&) (stl_construct.h:151)
==18350==    by
```
```bash
# My situation: I have a bunch of experiments nested under parameter dirs:
# 10/ 20/ 30/ ...
# Each experiment dir has some experiment files; a trailing _X indicates the X-th
# repeat of the experiment for a specific dataset.
# ls 10/
# dataset1_0.csv dataset2_0.csv dataset1_1.csv dataset2_1.csv
# Problem: I want to rename all the <datasetname>_1.csv files to <datasetname>_2.csv
# Solution: parallel & mmv!
# Use GNU parallel because it has a nicer syntax than bash for loops.
# -q keeps the patterns quoted so that mmv, not the shell, expands them.
parallel -q -j 2 mmv {1}/"*_1.csv" {1}/"#1_2.csv" ::: {10..100..10}
```
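For reference, a rough Python equivalent of the rename above (illustrative only; it assumes the same 10/ ... 100/ parameter-directory layout):

```python
import glob
import os

# Rename every <datasetname>_1.csv to <datasetname>_2.csv under the parameter dirs.
for param in range(10, 101, 10):                           # 10/, 20/, ..., 100/
    for path in glob.glob(os.path.join(str(param), "*_1.csv")):
        os.rename(path, path[: -len("_1.csv")] + "_2.csv")
```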
```python
# Benchmark for measuring matrix multiplication speed, Martin Nilsson, RISE SICS
# relevant for certain machine learning tasks. v1.0 2017-11-21
# v1.1 Theodore Vasiloudis (PyTorch solution)
# ====================================================
# Run as:
#
#   python3 multiplytest.py 10000
#
# to measure squaring a 10000 x 10000 random matrix.
# Oddly, the K80 and Titan X give different results, probably due to numerical accuracy.
```
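The multiplytest.py source is not part of this preview; the following is only a minimal sketch of the measurement described above (square an N x N random matrix and time it), written with NumPy rather than the PyTorch solution mentioned in the header:

```python
import sys
import time

import numpy as np

# Square an N x N random single-precision matrix and report the wall-clock time.
n = int(sys.argv[1]) if len(sys.argv) > 1 else 10000
a = np.random.rand(n, n).astype(np.float32)

start = time.perf_counter()
b = a @ a
elapsed = time.perf_counter() - start

# A dense N x N matrix multiplication costs roughly 2 * N^3 floating point operations.
print(f"{n} x {n} matmul: {elapsed:.3f} s ({2 * n**3 / elapsed / 1e9:.1f} GFLOP/s)")
```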
| ID | EVENT | TIME | x | x.1 |
|---|---|---|---|---|
| 1 | 1 | 110.443671250798 | 0 | 0.88954899716191 |
| 2 | 1 | 746.21020937277 | 1 | 0.85477636102587 |
| 3 | 1 | 249.656292624447 | 0 | 1.19875323530287 |
| 4 | 1 | 76.5375073833034 | 0 | 1.13521479736082 |
| 5 | 1 | 68.3884146201972 | 1 | 0.866565287671983 |
| 6 | 1 | 309.475210375677 | 0 | 0.832409728225321 |
| 7 | 1 | 19.2999312165329 | 1 | 1.0273647472728 |
| 8 | 0 | 1600.50948046765 | 1 | 0.750024672644213 |
| 9 | 1 | 524.368976549325 | 0 | 1.26851084339432 |
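The `x` / `x.1` headers look like the automatic suffixing that pandas and R apply to duplicate column names. Purely as an illustration (the filename is hypothetical), loading and summarising such a table might look like:

```python
import pandas as pd

# Hypothetical filename; the table above is assumed to be stored as a CSV file.
df = pd.read_csv("experiment_data.csv")

print(df.dtypes)      # ID and EVENT as integers, TIME / x / x.1 as floats
print(df.describe())  # per-column summary statistics
```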
```scala
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
```
```scala
import org.apache.spark._
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD
import scala.util.Random
import java.io._
import java.util.Properties
import org.apache.hadoop.fs._
import org.apache.hadoop.conf._
import org.apache.hadoop.io._
```