Raph Levien raphlinus

Kernel 2 processes all the segments in the fill and stroke items. Here we'll concentrate on fill (stroke is similar).

Its input is a list of fill items for this tilegroup, produced by kernel 1. It also has access to the scene, both for the items themselves and for their lists of points.

Its output is, for each item, a background fill and a list of segments. (There's potential complexity in that the segments can be either "fill" or "fill edge".)

This note refers to the piet-metal source extensively. For the most part, it covers the PietItem_Fill case (lines 248..365).

Some simplifications: we'll consider the item list a vec, with len and index operations. In practice, it is likely to be fragmented, to make dynamic allocation easier for kernel 1. We'll also write the code for output in pseudocode (it will have to do similar dynamic alloc tricks).
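To make the input/output contract above concrete, here is a minimal sketch in Python rather than shader code. It is not the piet-metal implementation: the names `TileOutput` and `process_fill`, the choice of a single square tile, and representing the background fill as a winding number sampled at the tile's top-left corner are all assumptions made for illustration. The point list stands in for the "vec with len and index operations" simplification.

```python
from dataclasses import dataclass, field


@dataclass
class TileOutput:
    # Assumed representation: winding number at the tile's top-left corner.
    background: int = 0
    # Segments that may touch the tile, stored as ((ax, ay), (bx, by)).
    segments: list = field(default_factory=list)


def process_fill(points, tile_x, tile_y, tile_size):
    """Hypothetical per-tile pass over one closed polyline (a fill item)."""
    x0, y0 = tile_x, tile_y
    x1, y1 = tile_x + tile_size, tile_y + tile_size
    out = TileOutput()
    n = len(points)
    for i in range(n):
        (ax, ay), (bx, by) = points[i], points[(i + 1) % n]
        # Keep the segment if its bounding box overlaps the tile.
        if (max(ax, bx) >= x0 and min(ax, bx) <= x1
                and max(ay, by) >= y0 and min(ay, by) <= y1):
            out.segments.append(((ax, ay), (bx, by)))
        # Winding number at (x0, y0): count signed crossings of a
        # horizontal ray from the corner toward +x.
        side = (bx - ax) * (y0 - ay) - (x0 - ax) * (by - ay)
        if ay <= y0 < by and side > 0:
            out.background += 1
        elif by <= y0 < ay and side < 0:
            out.background -= 1
    return out
```

With this sketch, a tile that lies entirely inside a shape gets a nonzero background and an empty segment list, a tile entirely outside gets zero and an empty list, and only tiles near the boundary carry segments.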

Winter status update & 0.5 Roadmap

Goals & Non-goals

Development of druid is currently driven by the needs of [Runebender], a font editor, and this will continue to be true for the scope of this roadmap. Runebender is a creative desktop application, supporting Windows, macOS, and Linux (via Gtk).

A major goal for Runebender, and thus druid, is to offer a polished user experience. There are many factors to this goal, including performance, a rich palette of interactions (thus a widget library to support them), and playing well with the native platform.

This last point deserves more explanation. The intent of druid is not that you can write a single program that will magically look and feel native on all supported platforms. It's questionable whether such a thing can be done, and chasing it leads to a "lowest common denominator" approach. Rather, the goal of druid is to make it possible to create an app which respects platform conventions and expectations around things like window management, menus

	backend: metal, device: Intel(R) Iris(TM) Plus Graphics 640
	metal-threadgroup-Intel(R) Iris(TM) Plus Graphics 640
	kernel type: threadgroup
	cpu_execs: 2, gpu_execs: 5001
	transpose-threadgroup-WGS=(1,32) kernel already compiled...
	num bms: 4096, num dispatch groups: 4096
	GPU results verified!
	task name:metal-threadgroup-WGS=(32, 32)
	TG size: 32
	timestamp stats (N = 2): 0.00 +/- 0.00 ms

	compiling kernel transpose-hybrid-shuffle-WGS=(32,1)...
	num bms: 4096, num dispatch groups: 4096
	GPU results verified!
	task name:Vk-HybridShuffle-TG=32
	device: Intel(R) Iris(TM) Plus Graphics 640
	num BMs: 4096, TG size: 32
	CPU loops: 101, GPU loops: 1001
	timestamp stats (N = 101): 0.00 +/- 0.00 ms
	instant stats (N = 101): 108.47 +/- 8.75 ms

	def ctz(x):
	    if x == 0: return 32
	    r = 0
	    while (x % 2) == 0:
	        r += 1
	        x >>= 1
	    return r
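As a quick sanity check, the loop-based `ctz` can be compared against a bit trick. The sketch below assumes non-negative inputs that fit in 32 bits; `ctz_bit_trick` is a name introduced here for illustration and is not part of the original snippet.

```python
def ctz(x):
    # Count trailing zeros; by convention returns 32 for x == 0.
    if x == 0:
        return 32
    r = 0
    while (x % 2) == 0:
        r += 1
        x >>= 1
    return r


def ctz_bit_trick(x):
    # x & -x isolates the lowest set bit; its bit_length minus one
    # is the index of that bit.
    if x == 0:
        return 32
    return (x & -x).bit_length() - 1


for value in [0, 1, 2, 12, 0x80000000, 0xFFFFFFFF]:
    assert ctz(value) == ctz_bit_trick(value)
```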

	def clz(x):
	    for k in range(31, -1, -1):

	struct StackElement {
	    PietGroupRef group;
	    uint index;
	    float2 offset; // Maybe pack as short2?
	};

	kernel1(Buf scene, PietGroupRef root) {
	    StackElement stack[MAX_STACK];
	    uint stack_ix = 0;
	    uint group = root;

	inline uint shuffle_round(uint a, uint m, ushort s) {
	    uint b = simd_shuffle_xor(a, s);
	    uint c;
	    if ((tix & s) == 0) {
	        c = b << s;
	    } else {
	        m = ~m;
	        c = b >> s;
	    }
	    return (a & m) | (c & ~m);
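The effect of `shuffle_round` is easier to see in a CPU simulation. The sketch below models a 32-lane SIMD group as a Python list, where `lanes[tix ^ s]` plays the role of `simd_shuffle_xor(a, s)`. The sequence of masks and shifts in `transpose32` is an assumption (it does not appear in the snippet above); with that sequence, five rounds transpose a 32x32 bit matrix, which matches the transpose kernels named in the benchmark logs.

```python
MASK32 = 0xFFFFFFFF


def shuffle_round(lanes, m, s):
    """Apply one round to every lane; lanes[i] holds row i as a 32-bit int."""
    out = []
    for tix, a in enumerate(lanes):
        b = lanes[tix ^ s]  # value held by the partner lane
        if (tix & s) == 0:
            mask = m
            c = (b << s) & MASK32  # emulate 32-bit wraparound
        else:
            mask = ~m & MASK32
            c = b >> s
        out.append((a & mask) | (c & ~mask & MASK32))
    return out


def transpose32(rows):
    # Assumed round schedule: swap off-diagonal blocks of size 16, 8, 4, 2, 1.
    schedule = [(0xFFFF, 16), (0xFF00FF, 8), (0xF0F0F0F, 4),
                (0x33333333, 2), (0x55555555, 1)]
    for m, s in schedule:
        rows = shuffle_round(rows, m, s)
    return rows
```

Each round swaps the two off-diagonal sub-blocks at one scale, so bit `j` of output row `i` ends up equal to bit `i` of input row `j`.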

	inline uint extract_8bit_value(uint bit_shift, uint package) {
	    uint mask = 255;
	    uint result = (package >> bit_shift) & mask;

	    return result;
	}
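A small round trip shows how the 8-bit extraction is meant to be used. This is a Python sketch of the same shift-and-mask logic; the packing helper `pack_4x8` is hypothetical and only exists here to build a test value.

```python
def extract_8bit_value(bit_shift, package):
    # Shift the wanted byte down to the low bits, then mask off the rest.
    return (package >> bit_shift) & 0xFF


def pack_4x8(values):
    # Hypothetical helper: pack four 8-bit values, lowest byte first.
    package = 0
    for i, v in enumerate(values):
        package |= (v & 0xFF) << (8 * i)
    return package


package = pack_4x8([0x78, 0x56, 0x34, 0x12])
unpacked = [extract_8bit_value(8 * i, package) for i in range(4)]
```

Note that `bit_shift` is a bit offset, not a byte index, so the caller passes 0, 8, 16, or 24.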

	inline uint extract_16bit_value(uint bit_shift, uint package) {
	    uint mask = 65535;
	    uint result = (package >> bit_shift) & mask;

	use std::path::Path;
	use std::fs::File;
	use std::io::BufWriter;

	#[derive(Clone, Copy)]
	struct LinearRGB {
	    r: f32,
	    g: f32,
	    b: f32,
	}