Jake Taylor yupferris

@1wErt3r
1wErt3r / SMBDIS.ASM
Created November 9, 2012 22:27
A Comprehensive Super Mario Bros. Disassembly
;SMBDIS.ASM - A COMPREHENSIVE SUPER MARIO BROS. DISASSEMBLY
;by doppelganger ([email protected])
;This file is provided for your own use as-is. It will require the character rom data
;and an iNES file header to get it to work.
;There are so many people I have to thank for this, that taking all the credit for
;myself would be an unforgivable act of arrogance. Without their help this would
;probably not be possible. So I thank all the peeps in the nesdev scene whose insight into
;the 6502 and the NES helped me learn how it works (you guys know who you are, there's no
@rygorous
rygorous / gist:4172889
Created November 30, 2012 00:28
SSE/AVX matrix multiply
#include <immintrin.h>
#include <intrin.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
union Mat44 {
float m[4][4];
__m128 row[4];
};
[core]
excludesfile = ~/.gitignore
[diff]
[color]
ui = auto
[alias]
st = status
ci = commit
co = checkout
di = diff
@kusma
kusma / atanlut.m
Last active January 31, 2017 13:26
ATAN lut generator
% ATAN lut-generator for Mesa3D
% Based on polyfitc, http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit
X = linspace(0, 1);
Y = atan(X)';
N = [1, 3, 5, 7, 9, 11];
lsqM = ones(numel(X), numel(N));
for n = 1:numel(N)
lsqM(:, n) = X.^N(n);
end
@KodrAus
KodrAus / Profile Rust on Linux.md
Last active August 12, 2024 12:37
Profiling Rust Applications

Profiling performance

Using perf:

$ perf record -g binary
$ perf script | stackcollapse-perf.pl | rust-unmangle | flamegraph.pl > flame.svg

NOTE: See @GabrielMajeri's comments below about the -g option.

There's a nice Xilinx application note from Ken Chapman:

Multiplexer Design Techniques for Datapath Performance with Minimized Routing Resources https://www.xilinx.com/support/documentation/application_notes/xapp522-mux-design-techniques.pdf

The most intriguing part is what he calls a data selector, but which might be more descriptively termed a one-hot mux, since it's a mux where the control signals are one-hot coded:

    onehotmux(data, sel) = (data[0] & sel[0]) | ... | (data[n] & sel[n])
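
As a rough software sketch of the expression above (not taken from the application note), each data word can be ANDed with a mask built from its select bit, and the results ORed together. The function name `onehot_mux` and the use of Python integers as data words are assumptions made for illustration.

```python
def onehot_mux(data, sel):
    """Return the OR of every data word whose select bit is set.

    data: list of non-negative integers (the mux inputs)
    sel:  list of 0/1 select bits, expected to be one-hot
    """
    if len(data) != len(sel):
        raise ValueError("data and sel must have the same length")
    out = 0
    for word, bit in zip(data, sel):
        # Replicate the select bit across the whole word: all ones or all zeros.
        mask = -1 if bit else 0
        out |= word & mask
    return out


inputs = [0x11, 0x22, 0x33, 0x44]
selected = onehot_mux(inputs, [0, 0, 1, 0])  # picks inputs[2]
```

If no select bit is set, the result is zero, and if several are set, the selected words are ORed together. A conventional binary-encoded mux has neither behaviour.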
//; "Super Keftendo" 256-byte SNES intro source code
//; by Revenant
//; http://www.pouet.net/prod.php?which=70163
//; This is an attempt at implementing the "Kefrens bars" effect on the SNES, using less than
//; 256 bytes of ROM. The technique used here is to set up a 256-color line buffer using
//; Mode 7, then rendering a few pixels directly to CGRAM every scanline and resetting the
//; Y-scroll position to display the same buffer on every visible scanline as it is repeatedly
//; rendered to. More information about specific size optimizations is detailed later.
Suppose we want to compute the frequency spectrum of an n-point sampled signal. That is,
we want to compute the signal's discrete Fourier transform. Taking a cue from multi-rate
signal processing, let's try a divide-and-conquer approach where we downsample the signal
by 2:1 and recursively compute the spectrum of that. There are two possible downsamplings,
corresponding to the even and odd phases.
By the Nyquist sampling theorem, assuming the signal has no upper half-band frequencies,
i.e. its top n/2 frequency bins are zero, the spectrum can be perfectly reconstructed
from the spectrum of _either_ the even subsignal or the odd subsignal, without any aliasing.
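
The claim can be checked numerically with a small sketch. It assumes that "top n/2 frequency bins" means DFT bins n/2 through n-1, which makes the test signal complex-valued. The naive `dft` and `idft` helpers and the example spectrum are illustrative choices, not part of the original note.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, including the 1/n scale factor."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


n = 8
half = n // 2

# Arbitrary example spectrum whose upper half band (bins n/2 .. n-1) is zero.
spectrum = [1 + 0j, 2 - 1j, -0.5j, 3 + 0j] + [0j] * half
signal = idft(spectrum)

# Spectra of the two 2:1 downsampled phases, each of length n/2.
even_spec = dft(signal[0::2])
odd_spec = dft(signal[1::2])

# Without aliasing, the even phase only needs a gain of 2. The odd phase
# also needs its one-sample delay undone by a per-bin phase rotation.
from_even = [2 * e for e in even_spec]
from_odd = [2 * o * cmath.exp(-2j * cmath.pi * k / n) for k, o in enumerate(odd_spec)]
```

Both `from_even` and `from_odd` should match the lower half of `spectrum` up to floating-point rounding. The factor of 2 compensates for discarding half of the samples.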
/**
* Simple UART module to explore basic HardwareC concepts.
*
* HardwareC is a working name for a new hardware description language. The
* goal is to make FPGAs easier for hobbyists to take advantage of. To achieve
* this goal, some design choices have been made:
*
* 1. Use familiar syntax. C/C++ syntax is borrowed everywhere, no reason to
* reinvent the wheel. Where C/C++ falls short, borrow from Verilog/SystemVerilog.
* 2. Interop with C/C++. A HardwareC module should be able to be used seamlessly