@o11c
o11c / every-vm-tutorial-you-ever-studied-is-wrong.md
Last active November 4, 2024 13:34
Every VM tutorial you ever studied is wrong (and other compiler/interpreter-related knowledge)

Note: this was originally several Reddit posts, chained and linked. But now that Reddit is dying I've finally moved them out. Sorry about the mess.


URL: https://www.reddit.com/r/ProgrammingLanguages/comments/up206c/stack_machines_for_compilers/i8ikupw/

Summary: stack-based vs register-based in general.

There is a wide variety of machines that can be described as "stack-based" or "register-based", but not all of them are practical. And there are a lot of other decisions that affect that practicality: do variables have names, or only addresses/indexes? Are instructions fixed-width or variable-width? Are you interpreting the bytecode (and if so, are you using machine stack frames?) or turning it into machine code? How many registers are there, and how many are special? How do you represent multiple types of variable? How many scopes are there (various kinds of global, local, member, ...)? How much effort/complexity can you afford to put into your machine? And so on.

  • a pure stack VM can only access the top elemen
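
As a rough illustration of the stack-based versus register-based distinction, here is a minimal sketch of two toy interpreters that both evaluate `1 + 2 * 3`. The opcode names (`push`, `loadi`, `add`, `mul`), the tuple encoding, and the function names are all assumptions made up for this example; they are not taken from the original posts or from any real VM.

```python
def run_stack(code):
    """Toy stack VM: operands are implicit, always taken from the top of the stack."""
    stack = []
    for op, *args in code:
        if op == "push":
            stack.append(args[0])
        elif op in ("add", "mul"):
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b if op == "add" else a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def run_register(code, nregs=4):
    """Toy register VM: every instruction names its destination and operands."""
    regs = [0] * nregs
    for op, dst, a, b in code:
        if op == "loadi":  # load immediate value `a` into register `dst`
            regs[dst] = a
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs[0]  # convention for this sketch: result is left in register 0


# 1 + 2 * 3 in both encodings
stack_code = [("push", 1), ("push", 2), ("push", 3), ("mul",), ("add",)]
register_code = [
    ("loadi", 0, 1, None),
    ("loadi", 1, 2, None),
    ("loadi", 2, 3, None),
    ("mul", 1, 1, 2),
    ("add", 0, 0, 1),
]

stack_result = run_stack(stack_code)
register_result = run_register(register_code)
```

Both programs have five instructions here, but the stack instructions carry no operand fields, while the register instructions are wider and can reuse any value without shuffling it to the top first.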
@o11c
o11c / division.py
Created February 7, 2021 05:23
The 3 flavors of division.
#!/usr/bin/env python3
import functools
import gmpy2
# assuming 8-bit math
# all functions are written to take any input, and produce signed output
def make_unsigned(v):
    return v & 0xff
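
The rest of this file is cut off in the preview, so what follows is only a guess at its direction: a minimal sketch that assumes the three flavors are floor, ceiling, and truncating division (the rounding modes gmpy2 exposes), written in plain Python without gmpy2. `make_signed`, `div_floor`, `div_ceil`, and `div_trunc` are hypothetical names, not the gist's own.

```python
def make_unsigned(v):
    return v & 0xff


def make_signed(v):
    # Hypothetical companion helper: reinterpret the low 8 bits as two's complement.
    v &= 0xff
    return v - 0x100 if v & 0x80 else v


def div_floor(a, b):
    # Rounds toward negative infinity (Python's own // behaviour).
    return make_signed(a // b)


def div_ceil(a, b):
    # Rounds toward positive infinity.
    return make_signed(-(-a // b))


def div_trunc(a, b):
    # Rounds toward zero (what C's / does for integers).
    q = abs(a) // abs(b)
    return make_signed(q if (a < 0) == (b < 0) else -q)
```

The flavors only disagree when the exact quotient is not an integer: for `-7 / 2` floor gives `-4`, while ceiling and truncation both give `-3`. In 8-bit math, `div_trunc(-128, -1)` wraps back to `-128`, since `128` does not fit in a signed byte.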

First, some notes:

  • Last checked for manpages-5.04 and linux-5.4, on Debian.
  • /proc/net is still used for documentation purposes, despite now being a symlink to /proc/self/net/
  • /proc/[pid]/task/[tid]/* is documented as including all of /proc/[pid]/* but this is not actually the case
  • Some files are actually documented in other man pages. But proc(5) still needs to mention them (possibly just the containing directory though).
  • Some files are actually documented in the kernel's Documentation/ tree (but that also is incomplete). Even if proc(5) mentions that, this is suboptimal. Further, many links to Documentation/ are broken since a recent reorganization.
  • The /proc/sys/net/ reference is quite vague, and incomplete besides.
  • - means the file exists but is not documented
  • + means either it is documented but does not exist, or it exists but its contents are not documented
  • I've removed some clutter by hand and added a few notes.
<ChancyValue:
0.0000000000% chance of: 104
0.0000000000% chance of: 105
0.0000000000% chance of: 106
0.0000000000% chance of: 107
0.0000000000% chance of: 108
0.0000000000% chance of: 109
0.0000000000% chance of: 110
0.0000000000% chance of: 111
0.0000000000% chance of: 112
# Significant care is taken to be sh-compatible; if bash or zsh could be
# required, it could be made simpler or more generic.
# Known source'rs:
# ~/.profile
# ~/.zshrc
# ~/.xprofile
# ~/.xsessionrc
# ~/.bashrc
# ~/.config/plasma-workspace/env/*.sh

List of intended memory-management policies (note: the actual policies are below, after both kinds of modifiers):

structure modifiers:

  • relative - own address is added to make effective pointer. Useful for realloc and memcpy, as well as shared memory.
  • middle_pointer - actually points to the middle of an object, with a known way to find the start
    • note particularly how this interacts with subclasses. Possibly that should be the only way to create such a pointer? It isn't too crazy to synthesize a subclass just for the CowString trick ...
  • (other arithmetic tricks possible: base+scale+offset, with each of these possibly hard-coded (which requires that there be multiple copies of some ownership policies! we can't just use an enum) or possibly embedded)
  • (but what about "object is stored in a file too large to mmap" and such? Or should those only satisfy "ChunkyRandomIterator" concept?)
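
To make the `relative` modifier concrete, here is a minimal sketch that simulates memory with a `bytearray`. It assumes the pointer is stored as a signed 64-bit offset from its own slot; `store_relative`, `load_relative`, and the object layout are hypothetical, chosen only for illustration.

```python
import struct

SLOT = struct.Struct("<q")  # assumption: one signed 64-bit offset per pointer


def store_relative(buf, slot_addr, target_addr):
    """Write a pointer at slot_addr that holds target_addr - slot_addr."""
    SLOT.pack_into(buf, slot_addr, target_addr - slot_addr)


def load_relative(buf, slot_addr):
    """Effective address = the slot's own address + the stored offset."""
    (offset,) = SLOT.unpack_from(buf, slot_addr)
    return slot_addr + offset


# A 16-byte "object": an 8-byte relative pointer followed by 8 bytes of payload.
obj = bytearray(16)
obj[8:16] = b"payload!"
store_relative(obj, 0, 8)

# Copy the object byte-for-byte to another position in a larger arena,
# standing in for memcpy or realloc moving it.
arena = bytearray(64)
arena[40:56] = obj
target = load_relative(arena, 40)
```

Because only the offset is stored, the copied pointer still resolves to the copied payload without any fix-up pass. An absolute pointer would keep pointing at the old location.
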
How to create privilegeless (only util-linux and shadow-utils) chroots.
FOR LINUX HOSTS ONLY!
On some machine with real root (or in a VM):
1. preferably, install an apt proxy, such as apt-cacher-ng
2. run debootstrap (or equivalent) as real root (or see if fakeroot works?)
3. delete the device files under /dev
4. create a tarball using --numeric-owner (VERY IMPORTANT)
#define REALLY_JOIN(a, b) a##b
#define JOIN(a, b) REALLY_JOIN(a, b)
#define ARG_16(a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15, ...) a15
#define COUNT(...) ARG_16(__VA_ARGS__, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
#define FOO(...) JOIN(FOO_, COUNT(__VA_ARGS__))(__VA_ARGS__)
#include <stdio.h>
struct Options
{
    bool debug_events;
    bool dump;
    bool hello;
    bool info;
    const char *output;
    Options()
    {
# Copyright © 2017 Ben Longbons
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the