wolfwood · November 15, 2022 01:01
diff --git a/activations b/activations
 ============= Interrupt Primer =====================================

 Computer: a deterministic monolith crunching away until infinity, one
 instruction after the other, as preordained by The Programmer.

 If this is not your experience, it is because of interrupts (NB: it is
 NOT because we don't write infinite loops; they are hidden at the
 bottom of most OSes and of GUI and console applications).  The idea
 behind interrupts is that we can pause what the CPU is doing, handle
 some new information (like key presses or the printer finishing a
 document), and return to the initial task.  Interrupts are the Great
 Nondeterminism of the computing world (what they trigger is also
 predetermined, but when they fire is dynamic).

 what causes interrupts: I/O and timers --- outside state changes

 alternative: polling :(

 Current hardware and OSes have taken the idea of interrupts to an
 extreme: because they can restore the state of the CPU prior to the
 interrupt without the interrupted process noticing, they do so with
 extreme prejudice.

 virtual CPU: you never know when you aren't running

 alternative++: activations

 ============= How XOmB implements Activations ======================

 An activation is a piece of memory used to communicate between the
 kernel and userspace.  Currently all environments are created with a
 2MB segment at address 1GB - 2MB, the only memory accessible in the
 first 1GB.  XOmB currently manages allocating this memory, but if we
 move to userspace allocation this will become quite tricky unless we
 can guarantee that a correct program always has a free page available
 for allocating activations (and then anything that faults in this
 segment is labeled incorrect and killed :).

 The Activation, for simplicity, contains an InterruptStack struct (to
 store the CPU state that is suspended on an interrupt) along with some
 additional information for unwinding activations that occur during the
 restoration of an activation (chained activations are my primary worry
 regarding correctness, races, and allocation issues), and finally a bool
 to indicate whether the activation is valid, so that the kernel can
 find a free activation to use when needed.
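
 As a rough illustration of the layout described above, here is a
 minimal sketch in C (XOmB itself is not written in C).  Every field
 name, the register order, and the chain pointer are assumptions made
 for the example; only the three ingredients (saved CPU state,
 chaining information, a validity flag) and the segment placement come
 from the text.  Because this sketch saves every register, it comes
 out somewhat larger than the ~100 bytes mentioned later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Segment placement from the text: 2MB ending at the 1GB boundary. */
#define ACTIVATION_SEGMENT_SIZE (2ull * 1024 * 1024)
#define ACTIVATION_SEGMENT_BASE (1024ull * 1024 * 1024 - ACTIVATION_SEGMENT_SIZE)

/* Hypothetical stand-in for XOmB's InterruptStack: the registers pushed
 * by software, followed by the frame the CPU pushes on an interrupt in
 * 64-bit mode (RIP, CS, RFLAGS, RSP, SS). */
typedef struct {
    uint64_t gpr[15];     /* general-purpose registers except RSP (order assumed) */
    uint64_t vector;      /* interrupt number (assumed field) */
    uint64_t error_code;  /* assumed field */
    uint64_t rip, cs, rflags, rsp, ss;  /* hardware-saved frame */
} InterruptStack;

static_assert(sizeof(InterruptStack) == 22 * sizeof(uint64_t),
              "InterruptStack is expected to be 22 packed 64-bit words");

typedef struct Activation {
    InterruptStack saved;            /* CPU state suspended by the interrupt */
    struct Activation *interrupted;  /* hypothetical link for chained activations */
    bool valid;                      /* true while this slot holds live state */
} Activation;

/* How many activations fit in the segment under this layout. */
static inline size_t activation_slots(void)
{
    return (size_t)(ACTIVATION_SEGMENT_SIZE / sizeof(Activation));
}
```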

 The underlying XOmB interrupt mechanic is not changed by activations:
 the same templated code pushes registers to the interrupt stack, but
 instead of calling an interrupt handler, the activation dispatcher (an
 un-scheduler if you will) is called.  The dispatcher first finds a
 free activation (XXX: in a lock-free manner that marks it as no longer
 free) in the environment's activation segment.  Then the saved state of
 the InterruptStack is copied to the activation.  The interrupt is
 acknowledged to the local APIC (to prevent denial of service) and
 userspace is reentered using the same mechanism as initial entry and
 the yield system call, with 2 parameters: an entry index of 4 and the
 address of the activation used.
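
 The claim-and-copy step could be sketched as below, using C11
 atomics for the lock-free part.  This is an illustration under
 assumptions, not XOmB's dispatcher: the function names, the linear
 scan, and the placeholder InterruptStack are all invented here, and
 the APIC acknowledgement and the return to userspace are left as a
 comment because they cannot be shown in portable C.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_FROM_ACTIVATION 4  /* entry index named in the text */

/* Placeholder for the saved CPU state; the real layout is XOmB's. */
typedef struct { uint64_t regs[22]; } InterruptStack;

typedef struct {
    InterruptStack saved;
    atomic_bool valid;  /* false = free, true = claimed */
} Activation;

/* Mark every slot in the segment as free. */
static void activation_segment_init(Activation *seg, size_t n)
{
    for (size_t i = 0; i < n; i++)
        atomic_init(&seg[i].valid, false);
}

/* Claim a free slot.  The compare-and-swap flips valid from false to
 * true in one step, so two CPUs scanning at once cannot both win the
 * same slot.  Returns NULL when every slot is taken. */
static Activation *claim_activation(Activation *seg, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        bool expected = false;
        if (atomic_compare_exchange_strong(&seg[i].valid, &expected, true))
            return &seg[i];
    }
    return NULL;
}

/* Dispatcher body: claim a slot, then copy the interrupted state in. */
static Activation *dispatch(Activation *seg, size_t n,
                            const InterruptStack *stack)
{
    assert(stack != NULL);
    Activation *act = claim_activation(seg, n);
    if (act == NULL)
        return NULL;  /* out of activations; policy is left open here */
    memcpy(&act->saved, stack, sizeof *stack);
    /* A real kernel would now acknowledge the local APIC and re-enter
     * userspace with the pair (ENTRY_FROM_ACTIVATION, act). */
    return act;
}
```

 Since nothing in this sketch ever resets valid to false, running it
 repeatedly exhausts the segment, which is the same leak described
 further down.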

 It may seem like it would be possible to avoid this copy by using the
 interrupt stack AS the activation.  The downsides of this approach are
 a) the activation needs a whole page or more instead of ~100 bytes, b)
 the activation must be read-only so that userspace code running on an
 adjacent CPU cannot corrupt an in-use kernel stack, and c) either the
 activation must remain kernel allocated, or we must edit the interrupt
 stack pointer in the TSS on context switch and manage a race with any
 interrupts that occur after we enter an environment but before we've
 located a preallocated activation page that is free.

 Because the only interrupt at the moment is a timer, userspace
 currently uses the parameters passed from the kernel to call the
 _entry function, which restores the registers saved by the common
 interrupt handler and then uses iretq to restore the hardware-saved
 registers, including RSP and RIP, atomically.  At this point it is too
 late to mark the activation as free, so we currently leak activations.

 While it may seem like the way to cure leaks is to begin by not
 using iretq, restoring RSP and RIP without iretq is nearly impossible:
 all registers will be occupied with application data, and the
 application stack cannot be assumed to be free below the stack pointer
 (red zone optimization), yet an indirect jump through memory must be
 used to restore the RIP (a preallocated address would risk overwrite
 from other CPUs also restoring activations).  It is theoretically
 possible to work around this by storing the activation address in the
 FSbase register and doing FS-relative addressing, but this adds the
 FSbase register to the state that must be preserved for userspace and
 complicates chained activations.

 So what about interrupts that we actually want to handle? For
 throughput-oriented workloads it may be reasonable to simply note that
 the interrupt occurred, either by editing a bitmap that is checked
 periodically by a 'process interrupts' thread (what if we want more
 than one CPU to be able to process interrupts?)  or by enqueuing a
 preallocated thread to run the handler for the particular interrupt at
 hand (what if we get two interrupts before the thread is scheduled?).
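
 The bitmap variant might look like the sketch below, again in C with
 C11 atomics.  The single 64-bit word, the function names, and the
 one-bit-per-vector encoding are assumptions for illustration.  It
 also makes both open questions concrete: the atomic exchange lets
 several CPUs drain the bitmap without handling the same bit twice,
 but two interrupts on the same vector before a drain collapse into
 one bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* One bit per interrupt vector (hypothetical: at most 64 vectors). */
static _Atomic uint64_t pending_interrupts;

/* Called from the activation entry path: record the interrupt and
 * return to the interrupted work immediately. */
static void note_interrupt(unsigned vector)
{
    assert(vector < 64);
    atomic_fetch_or(&pending_interrupts, UINT64_C(1) << vector);
}

/* Called periodically by a 'process interrupts' thread.  The exchange
 * takes the whole set and clears it in one step, so concurrent callers
 * on different CPUs receive disjoint sets of bits. */
static uint64_t take_pending(void)
{
    return atomic_exchange(&pending_interrupts, 0);
}

/* Test whether a drained set contains a given vector. */
static int vector_is_set(uint64_t bits, unsigned vector)
{
    return (bits >> vector) & 1u;
}
```

 A handler that must see every occurrence would need a per-vector
 counter or queue instead of a single bit.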

 However, at least some interrupts will take priority over the
 currently running thread.  In this case we may need to allocate a new
 thread to handle the interrupt and another thread to restore the
 activation (as the activation may be in the stackless thread
 scheduler code, or in an interrupted activation recovery itself, we
 cannot assume there is an existing thread to be added to the
 scheduler, and in any case an alternate 'enter from activation' would
 need to be communicated).

 Either path is sticky, and complicated further by the fact that we
 would ultimately like to pass the interrupt initially to the init
 process, so that the interrupt may be routed to another environment
 entirely for quick handling without denial of service by the present
 environment.  But of course an activation that is not immediately
 communicated to the suspended environment is no better than a
 standard UNIX 'virtual CPU' that can be revoked without warning.