CHANG-NING TSAI crazyguitar

crazyguitar / coro.cpp
Created January 8, 2025 07:23 — forked from Qix-/coro.cpp
C++20 coroutines + LibUV sample, v2
// Thank you to the folks at the C++ slack channel,
// along with @lewissbaker for the excellent literature
// (even though it took me a few days to be convinced
// it really was so).
#include <uv.h>
#include <iostream>
#include <experimental/coroutine>
crazyguitar / dag.py
Created December 25, 2024 00:13 — forked from OhadRubin/dag.py
import networkx as nx
from itertools import product
"""
When we compare this code with Airflow, the strengths of your code lie in its simplicity, lightweight nature, and the ability to easily integrate with existing Python code:
Simplicity: This code provides a simple and straightforward way to model and work with DAGs without needing to go through the process of setting up and configuring a comprehensive system like Airflow. For smaller teams or projects with less complexity, this can be an advantage.
Lightweight and easy to incorporate: Your code is a compact, single-file solution that can be easily integrated into an existing Python project without having to set up an entire Airflow environment. When your primary focus is on creating task dependencies with parameter combinations, rather than scheduling and monitoring, your code is easier to incorporate.
Focused on task generation: Your code emphasizes creating a Cartesian product of tasks associated with nodes' parameters. It is geared towards tackling
crazyguitar / nsight.sh
Created October 4, 2024 18:23 — forked from mcarilli/nsight.sh
Favorite nsight systems profiling commands for Pytorch scripts
# This isn't supposed to run as a bash script; I named it with ".sh" only for syntax highlighting.
# https://developer.nvidia.com/nsight-systems
# https://docs.nvidia.com/nsight-systems/profiling/index.html
# My preferred nsys (command line executable used to create profiles) commands
#
# In your script, write
# torch.cuda.nvtx.range_push("region name")
# ...
crazyguitar / bench.py
Created June 15, 2024 02:06 — forked from marians/bench.py
Benchmarking serialization/unserialization in python using json, pickle and cPickle
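try:
    import cPickle  # Python 2: separate C-accelerated pickle module
except ImportError:
    import pickle as cPickle  # Python 3: cPickle was removed; pickle uses the C accelerator itself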
import pickle
import json
import random
from time import time
from hashlib import md5
test_runs = 1000
def float_list():
crazyguitar / commands.md
Created June 11, 2024 20:13 — forked from mcarilli/commands.md
Single- and multiprocess profiling workflow with nvprof and NVVP (Nsight Systems coming soon...)

Ordinary launch commands (no profiling):

Single-process:

python main_amp.py -a resnet50 --b 224 --deterministic --workers 4 --opt-level O1 ./bare_metal_train_val/

Multi-process:

python -m torch.distributed.launch  --nproc_per_node=2 main_amp.py -a resnet50 --b 224 --deterministic --workers 4 --opt-level O1 ./bare_metal_train_val/

A common misconception about Python's Global Interpreter Lock (GIL) is that the interpreter only ever lets one thread run at a time. In fact, threads can run in parallel whenever a function releases the GIL. For instance, time.sleep performs its blocking wait between Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS, so other threads can make progress while one thread is sleeping.

reference:

  1. pysleep

The provided reference demonstrates how Python allows sleep operations to run in parallel:
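
The following is a minimal sketch of that behavior, not the code from the referenced source. The helper names `sleeper` and `run_sleepers` are made up for illustration. It starts several threads that each call time.sleep and measures the total wall-clock time. If the sleeps ran one after another, the total would be roughly the thread count times the sleep duration. Because the GIL is released during the sleep, the total should stay close to a single sleep duration.

```python
import threading
import time


def sleeper(seconds: float) -> None:
    # time.sleep releases the GIL while it blocks, so other threads can run.
    time.sleep(seconds)


def run_sleepers(n_threads: int = 4, seconds: float = 0.5) -> float:
    """Start n_threads sleeping threads and return the total wall-clock time."""
    threads = [
        threading.Thread(target=sleeper, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_sleepers()
    # Sequential sleeps would take about 4 * 0.5 = 2.0 seconds.
    print(f"4 threads sleeping 0.5s each finished in {elapsed:.2f}s")
```

The same pattern would not speed up pure-Python CPU-bound work, because that code holds the GIL while it runs.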

crazyguitar / CUDA-12-1-1-pytorch.md
Created April 10, 2024 21:33 — forked from Birch-san/CUDA-12-1-1-pytorch.md
Installing CUDA 12.1.1 + PyTorch nightly + Python 3.10 on Ubuntu 22.10

Should you keep your NVIDIA driver?

The CUDA 12.1.1 toolkit is gonna offer to install Nvidia driver 530 for us. It's from the New Feature branch. It's likely to be newer than the default Nvidia driver you would've installed via apt-get (apt would prefer to give you 525, i.e. the Production Branch).

If you're confident that you already have a new enough Nvidia driver for CUDA 12.1.1, and you'd like to keep your driver: feel free to skip this "uninstall driver" step.

But if you're not sure, or you know your driver is too old: let's uninstall it. CUDA will install a new driver for us later.
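
If you're not sure which driver you have, here is a rough sketch for checking it from Python. It shells out to `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and compares the major version against a threshold. The helper names are made up for illustration, and treating "major version 530 or higher" as "new enough" is an assumption based on the driver the toolkit offers, not an official compatibility rule, so adjust `minimum_major` to whatever the CUDA release notes say for your setup.

```python
import shutil
import subprocess
from typing import Optional


def installed_driver_version() -> Optional[str]:
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # no driver tools on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return None
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # One line per GPU; they all share the same driver, so take the first.
    return lines[0].strip() if lines else None


def driver_major(version: str) -> int:
    """Extract the major component, e.g. '530.30.02' -> 530."""
    return int(version.split(".")[0])


def is_new_enough(version: str, minimum_major: int = 530) -> bool:
    # Assumption: 530 is the threshold, matching the driver the toolkit offers.
    return driver_major(version) >= minimum_major


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("No NVIDIA driver detected (nvidia-smi not available).")
    else:
        verdict = "keep it" if is_new_enough(version) else "consider replacing it"
        print(f"Driver {version}: {verdict}")
```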

crazyguitar / example.c
Created April 4, 2024 15:23 — forked from plebioda/example.c
libfabric example
#include <rdma/fabric.h>
#include <rdma/fi_endpoint.h>
#include <rdma/fi_cm.h>
#include <rdma/fi_errno.h>
#include <rdma/fi_rma.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
crazyguitar / gist:d86c83645088cc829752d288a5763f39
Created November 28, 2023 04:34 — forked from arbabnazar/gist:6b9909cfba52ac066512ba5d1c1a1080
Example for Ansible git-module and ssh agent forwarding
# files/env:
Defaults env_keep += "SSH_AUTH_SOCK"
# tasks/main.yml
- name: ensure sudo keeps SSH_AUTH_SOCK in environment
  copy: src=env
        dest=/etc/sudoers.d/env
        mode=0440
        owner=root
        group=root