Mārtiņš Možeiko mmozeiko
@vurtun
vurtun / _readme_quarks.md
Last active June 26, 2025 00:06
Quarks: Graphical user interface

gui

@publik-void
publik-void / sin-cos-approximations-gist.adoc
Last active June 21, 2025 22:52
Fast MiniMax Polynomial Approximations of Sine and Cosine

@vurtun
vurtun / x11_clipboard.c
Last active June 13, 2023 21:28
X11 clipboard
static char*
sys_clip_get(struct sys *s, Atom selection, Atom target)
{
assert(s);
struct sys_x11 *x11 = s->platform;
/* blocking wait for clipboard data */
XEvent notify;
XConvertSelection(x11->dpy, selection, target, selection, x11->helper, CurrentTime);
while (!XCheckTypedWindowEvent(x11->dpy, x11->helper, SelectionNotify, &notify)) {
@JarkkoPFC
JarkkoPFC / sphere_screen_extents.h
Last active May 28, 2025 13:51
Calculates view space 3D sphere extents on the screen
struct vec3f {float x, y, z;};
struct vec4f {float x, y, z, w;};
struct mat44f {vec4f x, y, z, w;};
//============================================================================
// sphere_screen_extents
//============================================================================
// Calculates the exact screen extents xyzw=[left, bottom, right, top] in
// normalized screen coordinates [-1, 1] for a sphere in view space. For
// performance, the projection matrix (v2p) is assumed to be set up so that
Perfect Quantization of DXT endpoints
-------------------------------------
One of the issues that affect the quality of most DXT compressors is the way floating point colors are rounded.
For example, stb_dxt does:
max16 = (unsigned short)(stb__sclamp((At1_r*yy - At2_r*xy)*frb+0.5f,0,31) << 11);
max16 |= (unsigned short)(stb__sclamp((At1_g*yy - At2_g*xy)*fg +0.5f,0,63) << 5);
max16 |= (unsigned short)(stb__sclamp((At1_b*yy - At2_b*xy)*frb+0.5f,0,31) << 0);
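
To illustrate the rounding issue, here is a minimal sketch for a single 5-bit channel. It assumes the decoder expands 5 bits to 8 by bit replication, which is a common model but not something the text above states, and the helper names (`expand5`, `quantize5_naive`, `quantize5_nearest`) are hypothetical. It is not the method the gist goes on to describe, only a demonstration that rounding on the 0..31 scale does not always pick the code closest to the target after expansion.

```c
#include <assert.h>

/* Expand a 5-bit channel to 8 bits by bit replication (assumed decoder model). */
static int expand5(int q)
{
    return (q << 3) | (q >> 2);
}

static float absf(float x)
{
    return x < 0.0f ? -x : x;
}

/* Round to nearest on the 0..31 scale, the same shape as the stb_dxt lines above. */
static int quantize5_naive(float v)
{
    assert(v >= 0.0f && v <= 1.0f);
    int q = (int)(v * 31.0f + 0.5f);
    return q > 31 ? 31 : q;
}

/* Pick the 5-bit code whose expanded 8-bit value is closest to v * 255.
   Only the naive code and its two neighbours are examined. */
static int quantize5_nearest(float v)
{
    int q = quantize5_naive(v);
    float target = v * 255.0f;
    int best = q;
    for (int c = q - 1; c <= q + 1; ++c) {
        if (c < 0 || c > 31)
            continue;
        if (absf((float)expand5(c) - target) < absf((float)expand5(best) - target))
            best = c;
    }
    return best;
}
```

For a target of 28.6/255, the naive rounding picks code 3 (expanded value 24), while code 4 (expanded value 33) is slightly closer under this expansion model.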
@animetosho
animetosho / gf2p8affineqb-articles.md
Last active July 4, 2025 10:44
A list of articles documenting uses of the GF2P8AFFINE instruction

Unexpected Uses for the Galois Field Affine Transformation Instruction

Intel added the Galois Field instruction set (GFNI) extensions to their Sunny Cove and Tremont cores. What’s particularly interesting is that GFNI is the only new SIMD extension that came with SSE and VEX/AVX encodings (in addition to EVEX/AVX512), to allow it to be supported on all future Intel cores, including those which don’t support AVX512 (such as the Atom line, as well as Celeron/Pentium branded “big” cores).

I suspect GFNI was aimed at accelerating SM4 encryption; however, one of its instructions can be used for many other purposes. The extension includes three instructions, but of particular interest here is the Affine Transformation instruction (GF2P8AFFINEQB), also known as bit-matrix multiply.
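
As a rough scalar model of what "bit-matrix multiply" means here, the sketch below emulates the per-byte operation following the pseudocode in Intel's instruction reference: result bit `i` is the parity of matrix byte `7 - i` ANDed with the source byte, XORed with bit `i` of the immediate. The function names are hypothetical, and the real instruction applies this to every byte of a vector at once.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if x has an odd number of set bits, otherwise 0. */
static uint8_t parity8(uint8_t x)
{
    x ^= (uint8_t)(x >> 4);
    x ^= (uint8_t)(x >> 2);
    x ^= (uint8_t)(x >> 1);
    return x & 1u;
}

/* Scalar emulation of GF2P8AFFINEQB for a single source byte.
   matrix holds eight row bytes; row 7 - i produces result bit i. */
static uint8_t gf2p8affine_byte(uint64_t matrix, uint8_t src, uint8_t imm8)
{
    uint8_t result = 0;
    for (int i = 0; i < 8; ++i) {
        uint8_t row = (uint8_t)(matrix >> (8 * (7 - i)));
        uint8_t bit = (uint8_t)(parity8(row & src) ^ ((imm8 >> i) & 1u));
        result |= (uint8_t)(bit << i);
    }
    return result;
}

/* Small self-check: the identity matrix leaves a byte unchanged,
   and the mirrored matrix reverses its bit order. */
static void gf2p8affine_selfcheck(void)
{
    assert(gf2p8affine_byte(0x0102040810204080ULL, 0xA5, 0x00) == 0xA5);
    assert(gf2p8affine_byte(0x8040201008040201ULL, 0x01, 0x00) == 0x80);
}
```

Choosing other matrices gives per-byte shifts, rotates, and bit permutations, which is what makes the instruction useful outside of cryptography.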

There have been various articles which discuss out-of-band

@dondragmer
dondragmer / PrefixSort.compute
Created January 20, 2021 23:32
An optimized GPU counting sort
#pragma use_dxc //enable SM 6.0 features, in Unity this is only supported on version 2020.2.0a8 or later with D3D12 enabled
#pragma kernel CountTotalsInBlock
#pragma kernel BlockCountPostfixSum
#pragma kernel CalculateOffsetsForEachKey
#pragma kernel FinalSort
uint _FirstBitToSort;
int _NumElements;
int _NumBlocks;
bool _ShouldSortPayload;
@christianparpart
christianparpart / terminal-synchronized-output.md
Last active August 28, 2025 03:44
Terminal Spec: Synchronized Output

Synchronized Output

Synchronized output implements the feature inspired by iTerm2's synchronized output, except that it uses the well-known SM ? and RM ? sequences rather than the rarely used DCS. iTerm2 has since adopted the new syntax in place of DCS.

Semantics

When rendering the screen of the terminal, the emulator usually iterates through each visible grid cell and renders its current state. When applications update the screen at a higher frequency, this can cause tearing.

This mode attempts to mitigate that.
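
From the application side, using the mode could look like the sketch below: a frame's output is bracketed by the set and reset sequences so the terminal can present it in one step. The private mode number 2026 comes from the published spec rather than from the excerpt above, and `wrap_synchronized` is a hypothetical helper, not part of any library.

```c
#include <assert.h>
#include <stdio.h>

/* DEC private mode 2026 (number taken from the spec, not from this excerpt). */
#define SYNC_BEGIN "\x1b[?2026h" /* SM ? : begin synchronized update */
#define SYNC_END   "\x1b[?2026l" /* RM ? : end synchronized update */

/* Writes SYNC_BEGIN + frame + SYNC_END into out.
   Returns the number of bytes written (excluding the terminating NUL),
   or -1 if the buffer is too small. */
static int wrap_synchronized(char *out, size_t cap, const char *frame)
{
    assert(out != NULL && frame != NULL);
    int n = snprintf(out, cap, "%s%s%s", SYNC_BEGIN, frame, SYNC_END);
    return (n >= 0 && (size_t)n < cap) ? n : -1;
}
```

An application would write the resulting buffer to the terminal in a single call. A terminal that does not recognize the mode should simply ignore the two sequences.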

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#if defined(__x86_64__)
#define BREAK asm("int3")
#else
#error Implement macros for your CPU.
#endif
@zingaburga
zingaburga / sve2.md
Last active July 23, 2025 06:47
ARM’s Scalable Vector Extensions: A Critical Look at SVE2 For Integer Workloads

Scalable Vector Extensions (SVE) is ARM’s latest SIMD extension to their instruction set, which was announced back in 2016. A follow-up SVE2 extension was announced in 2019, designed to incorporate all functionality from ARM’s current primary SIMD extension, NEON (aka ASIMD).

Despite being announced 5 years ago, there is currently no generally available CPU which supports any form of SVE (which excludes the [Fugaku supercomputer](https://www.fujitsu.com/global/about/innovation/