CHANG-NING TSAI (crazyguitar)
@dideler
dideler / bot.rb
Last active October 23, 2025 16:33
Sending a notification message to Telegram using its HTTP API via cURL
# Use this script to test that your Telegram bot works.
#
# Install the dependency
#
# $ gem install telegram_bot
#
# Run the bot
#
# $ ruby bot.rb
#
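For reference, the same check can be made straight against Telegram's HTTP API without installing the gem. A minimal Python sketch; BOT_TOKEN and CHAT_ID are placeholders, not values from the gist:

# Send one message through Telegram's sendMessage endpoint.
import json
import urllib.parse
import urllib.request

BOT_TOKEN = "123456:REPLACE-ME"  # placeholder: token from @BotFather
CHAT_ID = "123456789"            # placeholder: target chat id

url = f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage"
data = urllib.parse.urlencode({"chat_id": CHAT_ID, "text": "bot works"}).encode()
with urllib.request.urlopen(url, data=data) as resp:
    print(json.load(resp))  # Telegram returns the sent message as JSON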
@mcarilli
mcarilli / commands.md
Last active October 16, 2025 08:53
Single- and multiprocess profiling workflow with nvprof and NVVP (Nsight Systems coming soon...)

Ordinary launch commands (no profiling):

Single-process:

python main_amp.py -a resnet50 --b 224 --deterministic --workers 4 --opt-level O1 ./bare_metal_train_val/

Multi-process:

python -m torch.distributed.launch --nproc_per_node=2 main_amp.py -a resnet50 --b 224 --deterministic --workers 4 --opt-level O1 ./bare_metal_train_val/
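When profiling with nvprof's --profile-from-start off, the script itself decides which iterations get captured. A minimal sketch of that instrumentation, not taken from main_amp.py; the loop, bounds, and names are illustrative:

# Illustrative only (not main_amp.py): capture iterations 10 through 20
# so the nvprof/NVVP timeline stays small and representative.
import torch

def train_with_profiling(loader, model, criterion, optimizer):
    for i, (x, y) in enumerate(loader):
        if i == 10:
            torch.cuda.synchronize()
            torch.cuda.profiler.start()      # begin capture
        torch.cuda.nvtx.range_push(f"iteration {i}")
        loss = criterion(model(x.cuda()), y.cuda())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        torch.cuda.nvtx.range_pop()
        if i == 20:
            torch.cuda.synchronize()
            torch.cuda.profiler.stop()       # end capture
            break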
@crazyguitar
crazyguitar / mapread.c
Created January 17, 2020 16:43 — forked from marcetcheverry/mapread.c
mmap and read/write string to file
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>
int main(int argc, const char *argv[])
{
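The preview above is cut off by the page. As a self-contained reference for the same round trip, here is a sketch using Python's stdlib mmap module; the path and payload are arbitrary:

# Size a file, map it, write a string through the mapping, read it back.
import mmap

text = b"hello mmap\n"
with open("/tmp/mmap.txt", "wb") as f:
    f.truncate(len(text))                 # size the file before mapping
with open("/tmp/mmap.txt", "r+b") as f:
    with mmap.mmap(f.fileno(), len(text)) as m:
        m[:] = text                       # write through the mapping
        print(m[:].decode(), end="")      # read back: hello mmap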
@mcarilli
mcarilli / nsight.sh
Last active October 29, 2025 07:13
Favorite nsight systems profiling commands for Pytorch scripts
# This isn't supposed to run as a bash script; I named it with ".sh" for syntax highlighting.
# https://developer.nvidia.com/nsight-systems
# https://docs.nvidia.com/nsight-systems/profiling/index.html
# My preferred nsys (command line executable used to create profiles) commands
#
# In your script, write
# torch.cuda.nvtx.range_push("region name")
# ...
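The preview stops at the "..."; each range_push needs a matching range_pop. A minimal sketch of the pairing, with a toy model for illustration:

# Each push/pop pair appears as a named range on the Nsight Systems timeline.
import torch

model = torch.nn.Linear(8, 8).cuda()
x = torch.randn(4, 8, device="cuda")

torch.cuda.nvtx.range_push("region name")
y = model(x)
torch.cuda.nvtx.range_pop()
torch.cuda.synchronize()  # wait for the region's kernels before exiting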
@ih2502mk
ih2502mk / list.md
Last active November 2, 2025 23:34
Quantopian Lectures Saved
@crazyguitar
crazyguitar / m256.cc
Created October 5, 2021 20:18 — forked from kaityo256/m256.cc
//----------------------------------------------------------------------
#include <stdio.h>
#include <emmintrin.h>
#include <immintrin.h>
//----------------------------------------------------------------------
/* Print the four double lanes of a __m256d register, low lane first. */
void
printm256(__m256d r){
double *a = (double*)(&r);
printf("%f %f %f %f\n",a[0],a[1],a[2],a[3]);
}
@ruvnet
ruvnet / MoE.py
Last active October 20, 2025 19:50
A PyTorch implementation of a Mixture of Experts (MoE) model resembling the Mixtral 8x7B architecture, with detailed inline comments. This model combines transformer layers with an MoE layer consisting of 8 experts, aiming for high efficiency by activating only 2 experts per token. It's configured with dimensions reflecting the operational effic…
"""
This model integrates the MoE concept within a Transformer architecture. Each token's
representation is processed by a subset of experts, determined by the gating mechanism.
This architecture allows for efficient and specialized handling of different aspects of the
data, aiming for the adaptability and efficiency noted in the Mixtral 8x7B model's design
philosophy. The model activates only a fraction of the available experts for each token,
significantly reducing the computational resources needed compared to activating all experts
for all tokens.
"""