
Insu Jeon insujeon

@insujeon
insujeon / matmul_free.md
Created August 15, 2024 13:58 — forked from pszemraj/matmul_free.md
Technical Overview and Explanation of "Scalable MatMul-free Language Modeling" by gpt-4o

Technical Overview and Explanation of "Scalable MatMul-free Language Modeling"

Introduction

This paper presents a novel approach to large language models (LLMs) that eliminates matrix multiplication (MatMul) operations, which are typically the most computationally expensive part of such models. By doing so, the authors aim to significantly reduce memory usage and improve computational efficiency, enabling the models to scale up to billions of parameters while maintaining performance comparable to state-of-the-art Transformers.

Key Contributions

  1. MatMul-Free Dense Layers: The core innovation lies in replacing MatMul operations in dense layers with addition operations using ternary weights. These ternary weights take values from {-1, 0, +1}, which allows matrix multiplications to be transformed into simple additions and subtractions.
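To make the ternary-weight idea concrete, here is a minimal NumPy sketch of a dense layer's forward pass that uses only additions and subtractions. The function name `ternary_matvec` and the toy inputs are illustrative assumptions, not the paper's implementation, which is not shown in this overview.

```python
import numpy as np


def ternary_matvec(x, w):
    """Compute x @ w for a ternary weight matrix w using only adds and subtracts.

    Hypothetical helper for illustration: x has shape (n,), w has shape (n, m)
    with entries restricted to {-1, 0, +1}.
    """
    x = np.asarray(x, dtype=np.float64)
    w = np.asarray(w)
    if not np.isin(w, (-1, 0, 1)).all():
        raise ValueError("w must contain only -1, 0, or +1")

    out = np.zeros(w.shape[1], dtype=np.float64)
    for j in range(w.shape[1]):
        # +1 weights add the input, -1 weights subtract it, 0 weights skip it.
        plus = x[w[:, j] == 1].sum()
        minus = x[w[:, j] == -1].sum()
        out[j] = plus - minus
    return out


x = np.array([0.5, -2.0, 3.0])
w = np.array([[1, 0],
              [-1, 1],
              [0, -1]])
result = ternary_matvec(x, w)
```

The result matches an ordinary `x @ w`, but no multiplication of an input by a weight is ever performed. The Python loop is only for readability and says nothing about how an efficient kernel would be written.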
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
from torchvision import datasets, transforms
import math
import numpy as np
# Hardcoded variables for hyperfan init
@insujeon
insujeon / 30-touchpad.conf
Created April 28, 2022 07:17 — forked from miguelmota/30-touchpad.conf
Arch linux enable tap to click on touchpad
Section "InputClass"
Identifier "touchpad"
Driver "libinput"
MatchIsTouchpad "on"
Option "Tapping" "on"
Option "TappingButtonMap" "lmr"
EndSection
@insujeon
insujeon / get-ieee.txt
Created March 4, 2022 04:51 — forked from paulaksm/get-ieee.txt
Download IEEE papers from command line
# If you are on a valid institutional IP address, the following command will download the IEEE-hosted paper with the given <ID-NUMBER>
# and save it as paper.pdf
wget "http://ieeexplore.ieee.org/stampPDF/getPDF.jsp?tp=&isnumber=&arnumber=<ID-NUMBER>" -O paper.pdf
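The same request can be scripted. The sketch below rebuilds the URL used in the `wget` command with Python's standard library. The names `build_ieee_url` and `download_paper` are hypothetical, and the download is assumed to succeed only from an institutional IP address, as noted above.

```python
from urllib.parse import urlencode
from urllib.request import urlretrieve

# Endpoint taken from the wget command above.
BASE_URL = "http://ieeexplore.ieee.org/stampPDF/getPDF.jsp"


def build_ieee_url(arnumber):
    """Return the download URL for a paper ID (hypothetical helper)."""
    # tp and isnumber are left empty, exactly as in the wget command.
    query = urlencode({"tp": "", "isnumber": "", "arnumber": str(arnumber)})
    return f"{BASE_URL}?{query}"


def download_paper(arnumber, dest="paper.pdf"):
    """Fetch the paper and save it to dest; needs network access."""
    urlretrieve(build_ieee_url(arnumber), dest)
    return dest
```

Calling `download_paper` with a paper ID would write `paper.pdf` to the current directory. Only `build_ieee_url` can be checked offline.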

Welcome to WordGrinder

Important note for Windows users

WordGrinder is a port of a Unix program, and a few things don’t map well onto the way Windows works. There are some things you need to know.

  • the mouse is ignored. WordGrinder is keyboard driven.
  • the close button at the top right hand corner of the window won’t work. Instead, to quit, type CTRL+Q (or press ESC to open the menu and pick File → Quit).
@insujeon
insujeon / index.html
Created August 12, 2019 07:16
Random Quote Generator (HTML, CSS, JS)
<div class="container">
<div class="row">
<div class="col-xs-12 col-sm-3"></div>
<div class="col-xs-12 col-sm-6 quotebox">
<div class="row">
<div class="col-xs-12">
<blockquote>
<i class="fa fa-quote-left quotemark"></i>
<h2 id="quotetext"></h2>
<footer id="quotesource"></footer>
@insujeon
insujeon / gan_1d.py
Created October 27, 2017 18:28 — forked from vvanirudh/gan_1d.py
GAN to model a 1D gaussian distribution
# Drawn from https://gist.github.com/rocknrollnerd/06bfed6b9d1bce612fd6 (in theano)
# This is implemented in PyTorch
# Author : Anirudh Vemula
import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable
from scipy.stats import norm
import matplotlib.pyplot as plt
@insujeon
insujeon / bayes_by_backprop.py
Created October 27, 2017 18:28 — forked from vvanirudh/bayes_by_backprop.py
Bayes by Backprop in PyTorch (introduced in the paper "Weight Uncertainty in Neural Networks", Blundell et al., 2015)
# Drawn from https://gist.github.com/rocknrollnerd/c5af642cf217971d93f499e8f70fcb72 (in Theano)
# This is implemented in PyTorch
# Author : Anirudh Vemula
import torch
import torch.nn as nn
from torch.autograd import Variable
import numpy as np
from sklearn.datasets import fetch_mldata
import tensorflow as tf


def sample_gumbel(shape, eps=1e-20):
    """Sample from Gumbel(0, 1)."""
    # eps keeps both logarithms away from log(0).
    U = tf.random.uniform(shape, minval=0, maxval=1)
    return -tf.math.log(-tf.math.log(U + eps) + eps)


def gumbel_softmax_sample(logits, temperature):
    """Draw a sample from the Gumbel-Softmax distribution."""
    # Add Gumbel noise to the logits, then apply a temperature-scaled softmax.
    y = logits + sample_gumbel(tf.shape(logits))
    return tf.nn.softmax(y / temperature)