nxnfufunezn

Achieving warp speed with Rust

Using `<details>` in GitHub

Suppose you're opening an issue and there are a lot of noisy logs that may be useful.

Rather than wrecking readability, wrap them in a `<details>` tag!

<details>
 <summary>Summary Goes Here</summary>

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs

Persistent "pipes" in Linux

In a project I'm working on I ran into the requirement of having some sort of persistent FIFO buffer or pipe in Linux, i.e. something file-like that could accept writes from one process and persist them to disk until a second process reads (and acknowledges) them. The persistence should hold across both process restarts and OS restarts.

AFAICT unfortunately in the Linux world such a primitive does not exist (named pipes/FIFOs do not persist

Distributed Read-Write Mutex in Go

The default Go implementation of sync.RWMutex does not scale well to multiple cores, as all readers contend on the same memory location when they all try to atomically increment it. This gist explores an n-way RWMutex, also known as a "big reader" lock, which gives each CPU core its own RWMutex. Readers take only a read lock local to their core, whereas writers must take all locks in order.

# Create Bitbucket branch

## Create local branch

$ git checkout -b sync
Switched to a new branch 'sync'
$ git branch
  master
* sync

	Registers
	Caller-saved: RAX RCX RDX RSI RDI R8 R9 R10 R11
	Callee-saved: RBX RBP RSP R12 R13 R14 R15

	Args: RDI, RSI, RDX, RCX, R8, R9, XMM0–7
	Return: RAX

	Simple Compile
	yasm -f macho64 foo.asm && gcc foo.c foo.o -Wall -Wextra -g -O1

	Why do compilers even bother with exploiting the undefinedness of signed overflow? And what are those
	mysterious cases where it helps?

	A lot of people (myself included) are against transforms that aggressively exploit undefined behavior, but
	I think it's useful to know what compiler writers are accomplishing by this.

	TL;DR: C doesn't work very well if int!=register width, but (for backwards compat) int is 32-bit on all
	major 64-bit targets, and this causes quite hairy problems for code generation and optimization in some
	fairly common cases. The signed overflow UB exploitation is an attempt to work around this.

	#include <stdio.h>

	#define STR2(x) #x
	#define STR(x) STR2(x)

	#define INCBIN(name, file) \
	__asm__(".section .rodata\n" \
	".global incbin_" STR(name) "_start\n" \
	".type incbin_" STR(name) "_start, @object\n" \
	".balign 16\n" \

	#include <stdint.h>

	/*

	Fast 64bit integer log10
	WARNING: calling ilog10c(0) yields undefined behaviour!

	On x64 this compiles down to:

	pushq %rbp