Elvir Crnčević elvircrn

  • Red Hat
  • Brno/Graz/Sarajevo

Overclocking the GPU

sudo /home/nvidia/jetson_clocks.sh --store
sudo /home/nvidia/jetson_clocks.sh
sudo /home/nvidia/jetson_clocks.sh --restore
import re
import urllib.request
import os
import concurrent.futures
import zipfile
import shutil
def download_stuff(comic_url, comic_name):
elvircrn / tmux.conf
Created December 20, 2019 14:49
~/.tmux.conf
# Enable mouse mode (tmux 2.1 and above)
set -g mouse on
set -g default-terminal "screen-256color" # colors!
setw -g xterm-keys on
set -s escape-time 10 # faster command sequences
set -sg repeat-time 600 # increase repeat timeout
set -s focus-events on
#include <ATen/cuda/CUDAContext.h>
#include <c10/util/Float8_e4m3fn.h>
#include "../per_token_group_quant_8bit.h"
#include <cmath>
#include <cuda_fp16.h>
#include <cuda_bf16.h>
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: wide-ep-llm-d-decode
  labels:
    llm-d.ai/inferenceServing: "true"
    llm-d.ai/model: Qwen3-30B-A3B
    llm-d.ai/role: decode
spec:
  replicas: 1
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: wide-ep-llm-d-prefill
  labels:
    llm-d.ai/inferenceServing: "true"
    llm-d.ai/model: Qwen3-30B-A3B
    llm-d.ai/role: prefill
spec:
  replicas: 1