Inheritance and virtual tables are often used to create interfaces for polymorphic classes in C++.
What if there were another way to do this?
One that is easier, cleaner, faster, and more reliable?
This article explains how to use CRTP, [std::variant](https://en.cppreference.com/w/cpp/utility/variant), and std::visit to increase code performance.
"""
The 2024 Transformer (the Noam Transformer):
- RMSNorm
- GQA or some combination
- Sliding window attention
- Swiglu
- RoPE (Rotary Positional Embedding)
LLM Arches:
hidden | MLP mult. | n_layers | rope_theta | GQA Group Size | GLU Act. | ops
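Two of the components listed above, RMSNorm and the SwiGLU activation, are small enough to sketch directly. The following is a minimal illustration on plain Python lists rather than tensors, so it runs without PyTorch. The function names `rms_norm`, `silu`, and `swiglu` and the `eps` default are assumptions made for this example and are not taken from the gist.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(gate, up):
    """Gated linear unit with SiLU: elementwise silu(gate) * up."""
    return [silu(g) * u for g, u in zip(gate, up)]


if __name__ == "__main__":
    normed = rms_norm([3.0, 4.0], [1.0, 1.0])
    print(normed)
    print(swiglu([0.0, 1.0], [5.0, 2.0]))
```

In a real model, `gate` and `up` would be two linear projections of the same hidden state. They are passed in directly here to keep the sketch short.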
import torch
import functools
from torch.utils._python_dispatch import TorchDispatchMode
import torch.utils._pytree as pytree
from torch.utils.weak import WeakTensorKeyDictionary

class RecomputableTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, t, func, args):
`--> TORCH_LOGS="output_code" python optim_repro.py
[WARNING]:Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[DEBUG]:Output code:
# AOT ID: ['0_inference']
from ctypes import c_void_p, c_long
import torch
import math
import random
import os
import torch
import torch._dynamo as torchdynamo
import torch._inductor
import time
import torch._inductor.config as config
from torch._dynamo.utils import cprofile_wrapper
from apex.optimizers import FusedAdam, FusedSGD
config.triton.cudagraphs = True
config.cpp_wrapper = False
# This isn't supposed to run as a bash script; I named it with ".sh" for syntax highlighting.
# https://developer.nvidia.com/nsight-systems
# https://docs.nvidia.com/nsight-systems/profiling/index.html
# My preferred nsys (command line executable used to create profiles) commands
#
# In your script, write
# torch.cuda.nvtx.range_push("region name")
# ...
If you want to use VS Code's Dev Containers feature, put `devcontainer.json` in the `.devcontainer` directory.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import chainer
import chainer.functions as F
import chainer.links as L
import numpy as np

class FRN(chainer.Chain):
    def __init__(self, in_c):
pip install streamlit
pip install spacy
python -m spacy download en_core_web_sm
python -m spacy download en_core_web_md
python -m spacy download de_core_news_sm
def top_k_top_p_filtering(logits, top_k=0, top_p=0.0, filter_value=-float('Inf')):
    """ Filter a distribution of logits using top-k and/or nucleus (top-p) filtering
        Args:
            logits: logits distribution shape (vocabulary size)
            top_k >0: keep only top k tokens with highest probability (top-k filtering).
            top_p >0.0: keep the top tokens with cumulative probability >= top_p (nucleus filtering).
                Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751)
    """
    assert logits.dim() == 1  # batch size 1 for now - could be updated for more but the code would be less clear
    top_k = min(top_k, logits.size(-1))  # Safety check
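The preview above is cut off before the filtering logic. As a rough illustration of what the docstring describes, here is a minimal sketch of top-k and nucleus filtering on a plain Python list of logits. It is not the continuation of the function above: the name `filter_logits` is hypothetical, it uses no tensors, and it assumes that the token which pushes the cumulative probability past `top_p` is itself kept.

```python
import math


def filter_logits(logits, top_k=0, top_p=0.0, filter_value=-math.inf):
    """Return a copy of `logits` with filtered-out entries set to `filter_value`."""
    out = list(logits)
    top_k = min(top_k, len(out))  # same safety check as in the preview above

    if top_k > 0:
        # Drop everything strictly below the k-th largest logit.
        kth_largest = sorted(out, reverse=True)[top_k - 1]
        out = [x if x >= kth_largest else filter_value for x in out]

    if top_p > 0.0:
        # Visit tokens from most to least likely, accumulating softmax mass.
        order = sorted(range(len(out)), key=lambda i: out[i], reverse=True)
        max_logit = out[order[0]]
        exps = [math.exp(out[i] - max_logit) for i in order]
        total = sum(exps)
        cumulative = 0.0
        nucleus_full = False
        for i, e in zip(order, exps):
            if nucleus_full:
                out[i] = filter_value
            cumulative += e / total
            if cumulative > top_p:
                # The token that crosses the threshold is kept; later ones are dropped.
                nucleus_full = True

    return out


if __name__ == "__main__":
    print(filter_logits([2.0, 1.0, 0.0, -1.0], top_k=3, top_p=0.8))
```

To sample from the result, apply a softmax to the returned list. Entries equal to `-inf` then get probability zero.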