@dims
dims / 2026-06-03-substrate-cross-vendor-contributors.md
Created June 4, 2026 13:35
Agent Substrate (agent-substrate/substrate) — Cross-Vendor Contributor & Affiliation Report (2026-06-03)

Agent Substrate — Cross-Vendor Contributor & Affiliation Report

Generated: 2026-06-03
Repo: agent-substrate/substrate, "Agent Substrate: the core system" (public, 468★, Apache-2.0)
What it is: a system on top of Kubernetes that manages agent-like workloads at higher scale/lower latency by taking the K8s control-plane out of the critical path: actors run in gVisor sandboxes (ateom), managed by a kubelet-like agent (atelet), with GCS checkpoint/restore (ategcs) and a router (atenet).
Window: 2026-05-13 → 2026-06-03 (~3 weeks; a brand-new seed project)
Volume analyzed: 95 commits · 117 PRs (all states) · 63 issues (all states)
Analysis basis: upstream agent-substrate/substrate@main (e26cfa22), cloned fresh; the local dims/substrate fork checkout (4cbac18) was a few commits behind.

Framing: Unlike the NVIDIA-owned reports (nvsentinel / dra-driver / aicr / OpenShell), this repo is not NVIDIA-owned — it's a

@dims
dims / external-contributor-report.md
Created June 3, 2026 14:40
OpenShell (NVIDIA/OpenShell) — External Contributor & DCO-Hygiene Report (2026-05-26)

External Contributor & DCO-Hygiene Report — nvidia/OpenShell

  • Generated: 2026-05-26
  • Repository: nvidia/OpenShell (working copy: /Users/dsrinivas/go/src/github.com/nvidia/OpenShell)
  • Total commits analyzed (full main history): 754
  • Total unique commit-author emails: 58
  • Total unique GitHub handles (resolved): 51 (excluding bots)

Methodology summary

@dims
dims / 2026-06-03-aicr-external-contributors.md
Created June 3, 2026 14:30
aicr (NVIDIA/aicr) — External Contributor & DCO-Hygiene Report (2026-06-03)

aicr — External Contributor & DCO-Hygiene Report

Generated: 2026-06-03
Repo: NVIDIA/aicr, "Tooling for optimized, validated, and reproducible GPU-accelerated AI runtime in Kubernetes" (323★)
History analyzed: 2026-01-30 → 2026-06-03 (~4 months), main @ f65d7b0
Total commits analyzed: 1,205 (44 unique author emails → 35 distinct GitHub handles + 3 bots)
Analysis basis: working copy is the dims/aicr2 fork; its main HEAD (f65d7b0eddcda…) is identical to upstream NVIDIA/aicr@main, so the local history faithfully represents upstream.

Methodology: Extracted every commit author via git log (email, name, date, and Signed-off-by trailer via %(trailers)) → resolved each email to a GitHub login through the upstream commit API (GET /repos/NVIDIA/aicr/commits/{sha}.author.login) → classified each handle by (1) Helios LDAP match, (2) @nvidia.com commit email, (3) NVIDIA GitHub-org membership (`GET /orgs/NVIDIA/member
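The sign-off extraction step above can be sketched as a small shell helper. This is a minimal illustration, not the report's actual tooling: the function names `signoff_email` and `dco_status` are hypothetical, and treating "sign-off email equals author email" as the hygiene check is an assumption about what the report measures.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and the match rule are assumptions for illustration.

# Pull the address out of a Signed-off-by trailer value such as
# "Jane Doe <jane@example.com>". Prints nothing if no <...> part is present.
signoff_email() {
  printf '%s\n' "$1" | sed -n 's/.*<\([^>]*\)>.*/\1/p'
}

# Compare the commit author email ($1) with the trailer value ($2),
# case-insensitively. Prints one of: missing, match, mismatch.
dco_status() {
  local author signoff
  author=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  signoff=$(signoff_email "$2" | tr '[:upper:]' '[:lower:]')
  if [ -z "$signoff" ]; then
    echo "missing"
  elif [ "$author" = "$signoff" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}
```

The inputs could come from something like `git log --format='%ae%x09%(trailers:key=Signed-off-by,valueonly)'`, which emits the author email and the trailer value for each commit; commits with several sign-offs would need extra handling that this sketch leaves out.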

@dims
dims / 1-2026-05-29-firecracker-ateom-poc-bigbox.md
Created May 29, 2026 19:18
Agent Substrate — pluggable ateom backend: Firecracker (microVM). [1] PoC on bigbox, [2] design proposal, [3] implementation log.

Firecracker ateom Backend — Working PoC on bigbox (counter demo)

Update (2026-05-29): this standalone PoC has since been turned into a full in-repo implementation (Phases 0–3) and a cluster e2e — a counter actor on a Firecracker worker driven through the real control plane (ate-api-server + atenet), state preserved across suspend/resume, on the existing kind cluster. Branch firecracker-backend (pushed to dims/substrate, commit bc533f5; worktree ~/go/src/github.com/agent-substrate/substrate-firecracker). Full journal: ~/notes/agent-substrate/2026-05-29-firecracker-backend-implementation-log.md. The PoC notes below are retained for the from-scratch microVM bring-up details (rootfs build, Firecracker API sequence, gotchas).

  • Date: 2026-05-29 · Host: bigbox (Ubuntu 24.04, AMD EPYC 7763, nested KVM) · Firecracker: v1.15.1 · Guest kernel: vmlinux-6.1.128
  • Goal: prove a Firecracker backend can satisfy substrate's ateom Run/Checkpoint/Restore contract, preserving
@dims
dims / host-managed-imex-design-v2.md
Last active May 29, 2026 17:04
Host-managed IMEX v2 design and operator guide

Design v2: Host-Managed IMEX, Minimal Alpha

| Field | Value |
| --- | --- |
| Status | Implementable minimal alpha |
| Feature gate | HostManagedIMEX |
| Scope | Install-wide, not per-ComputeDomain |
| Primary goal | Stop launching per-ComputeDomain IMEX DaemonSets when the host already runs nvidia-imex |
| Primary non-goal | Per-ComputeDomain channel isolation across an IMEX fabric |
# set PATH and check if cluster is present (all terminals)
export PATH="$HOME/go/bin:$PATH"
kubectl version
# ============================================================
# Terminal A — keep this running, watches and port-forwards.
# ============================================================
kubectl port-forward -n ate-system svc/atenet-router 8000:80 &
kubectl port-forward -n ate-openshell-m0 svc/openshell-gateway-substrate 50051:50051 &
@dims
dims / 2026-05-11-dra-driver-nvidia-gpu-external-contributors.md
Last active May 11, 2026 18:20
dra-driver-nvidia-gpu — External Contributor Report (2026-05-11)

dra-driver-nvidia-gpu — External Contributor Report

Generated: 2026-05-11 (rev. 2, Helios cross-check added)
Repo: kubernetes-sigs/dra-driver-nvidia-gpu
Repo history: 2022-07-14 → 2026-05-11 (~3.8 years)
Total commits analyzed: 1,853 (47 unique author emails)
Methodology: Extracted all unique commit authors via git log → classified by email domain (@nvidia.com = NVIDIA, all others = candidates) → mapped commits to GitHub logins via GET /repos/.../commits/{sha} → verified every candidate against GET /orgs/NVIDIA/members/{username} (HTTP 204 = confirmed member, 404 = not a member) → for ambiguous cases, additionally cross-referenced against NVIDIA Helios LDAP (helios-cli user search) to detect NVIDIA employees who contribute via personal GitHub accounts not registered in the NVIDIA org → cross-referenced GitHub profiles, DCO Signed-off-by trailers, LinkedIn, and corporate-email patterns → folded NVIDIA-personal-e
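The first two classification steps in the methodology (email domain, then org-membership status code) can be sketched as pure shell functions. This is an illustrative sketch only: the function names are hypothetical, no network calls are made, and the later steps (Helios LDAP, profile cross-referencing) are not modeled.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative and not taken from the report.

# First pass by commit email domain: @nvidia.com is NVIDIA,
# every other address is a candidate that still needs verification.
classify_email() {
  local email
  email=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$email" in
    *@nvidia.com) echo "nvidia" ;;
    *)            echo "candidate" ;;
  esac
}

# Interpret the HTTP status from GET /orgs/NVIDIA/members/{username}:
# 204 = confirmed member, 404 = not a member.
# Any other status is left undecided rather than guessed.
membership_from_status() {
  case "$1" in
    204) echo "member" ;;
    404) echo "non-member" ;;
    *)   echo "unknown" ;;
  esac
}
```

For example, `classify_email dev@example.org` yields `candidate`, and a subsequent `membership_from_status 204` for that author's handle would mark them as a confirmed org member. Keeping an explicit `unknown` outcome avoids silently counting API errors as non-members.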

@dims
dims / 2026-05-10-k8s-ci-failures-triage-v3.md
Created May 11, 2026 00:44
K8s CI triage runbook + v3 flakes report + v3 failures report (2026-05-10)

Kubernetes CI Failures — Triage Report (v3, independent)

Date: 2026-05-10 (PM)
Source: failures-latest.json (HTML view: failures-latest.html)
Snapshot: 231 jobs
Method: 10 parallel cluster-investigation agents → 1 independent cross-check verifier (8 claims: 6 CONFIRMED / 2 PARTIAL / 0 REFUTED) → live PR/issue state sweep on 56 references → drift detection against 2026-05-09 snapshot.
Truly independent: no prior triage markdown was read; every claim re-derived from raw artifacts.

⚠️ Status banner:

  • 7 fix PRs merged today: k/k#138934 (coverage), k/k#138851 (ContainerMetrics), k/k#138584 (compat-versions, INCOMPLETE; needs release-1.36 cherry-pick), k/k#137936 (storage-kind), kops#18296 (upgrade-gossip), provider-aws-test-infra#550 (AMI build), cloud-provider-kind#407 (Pattern A digest pin).
  • Drift recovery: `ci-kubernetes-e2e
@dims
dims / 2026-05-05-kubernetes-security-findings.md
Last active May 5, 2026 18:08
Kubernetes Security Findings — May 2026

Kubernetes Security Findings — May 2026

Repository: kubernetes/kubernetes
Commit: 47f990437458a2b171f51b5e97a0c28c81d949d1 (master, 2026-05-05)
Methods: Static multi-agent source review (87 files across 4 researchers) + dynamic execution harness (kubectl, 3 agents)
Subsystems: authentication, authorization/RBAC, admission control/webhooks, node authorization (NodeAuthorizer + DRA graph)


Table of Contents

@dims
dims / kube-openapi-pr590-risk-analysis.md
Last active April 27, 2026 13:13
kube-openapi PR #590 risk analysis: go-openapi/swag v0.23.0→v0.25.4 behavioral deep-dive

kube-openapi PR #590 — Deep-Dive Risk Analysis

Upgrading go-openapi/swag v0.23.0 → v0.25.4

Prepared: 2026-04-27
PR: kubernetes/kube-openapi#590
Reviewer question: "go-openapi has some reputation of changing semantics without notification by accident. As we use it in our CRD validation there is risk that we break our API (we have forked the go-openapi validator nowadays, so risk is lower than in the past, but worth a check anyway)."


Executive Summary