Lou Springer louspringer

The Apple Problem: Trust, Opacity, and Data Continuity

Over the last couple of years, our frustration with Apple has not come from one isolated failure. It has come from a recurring pattern: Apple’s ecosystem increasingly behaves like a black box at exactly the moments when clarity, reversibility, and trust matter most.

The maddening part is that Apple still presents itself as the safe, integrated, consumer-friendly alternative. The brand promise is: it just works. The lived experience, too often, is: it probably worked, until it didn’t, and now you need to make a consequential decision without enough information to understand the consequences.

That gap is the problem.

We are not complaining because we are unwilling to learn the system. Quite the opposite. We have done the work. We have preserved evidence. We have read prompts carefully, traced settings, investigated backup paths, considered data-loss scenarios, and tried to behave like responsible users. The issue is that Apple’s design often denie

Microsoft 365 Contact Intake Fragmentation: Evidence From `contact@energration.com`

Summary

This is a concrete example of Microsoft 365 product fragmentation observed while trying to implement a simple business requirement.

One public inbound address: contact@energration.com
Delivery to a collaboration surface in Microsoft 365 / Teams
Usable by operators in Outlook and Microsoft 365

System

AGENTS.md (Energration eudorus repository root) — follow this document for how to use the eudorus repo (agents, tools, GitHub policy, Kiro/Codex workflow, steering, ontology, credentials). Verbatim copy below.

Discovered Agents & Tools

Hand-maintained. Full list of agent guidance artifacts and how they are generated or maintained: docs/agent_guidance_inventory.md. Run make agent-guidance-inventory to display it. Quick-start operator contract: docs/how_to_work_with_eudorus_codex.md (one-page execution playbook).

Workspace: /Volumes/lemon/gemini

Requirements: Kamizawa footgun (page cache and false OOM)

Introduction

On unified-memory systems (e.g. DGX Spark, UMA) or when repeatedly swapping models (7B ↔ 120B), Linux page cache can be reported as "used" memory by PyTorch/Ray/vLLM and similar runtimes. That causes false OOM or "free memory on device is less than desired" on startup even though the memory is reclaimable. This spec captures the mitigation as a first-class feature: drop page cache before model launch in the deploy flow so the runtime sees accurate free memory.

Requirements

Requirement 1: Drop page cache before model start in deploy flow

120B benchmark: tokens/s and interpretation

Endpoint: gx10 120B (Qwen 122B) at http://gx10-83fb.tail3dac72.ts.net:8002
Script: benchmark_120b_tokens_per_second.py

Quick reference

| What we measure | How |

GX10-83FB Telemetry (Prometheus + Grafana)

Telemetry from the GX10-83FB host is exported to Prometheus and visualized in Grafana. Prometheus and Grafana run in the eudorus observatory stack in Docker on Zane. Access from any machine on the network (GX10, vonnegut, etc.) must use Zane’s Tailscale hostname, not localhost.

Observatory on Zane (Docker + Nginx)

The observatory stack runs in Docker on Zane. Nginx is the router in front; deployment path on Zane:
/Users/lou/migration/rootfs/home/lou/observatory-deployment
Config: observatory/nginx/nginx.conf; compose: docker-compose.yml.

Goose configuration: accessing the LLM on gx10

Interrogation date: 2026-03-13

How to check status and whether the model is coming up: See LLM_STATUS_AND_HEALTH.md.

Goose config locations

File	Purpose

120B model serve runbook (gx10-83fb)

Host: gx10-83fb (Tailscale: gx10-83fb.tail3dac72.ts.net)
Port for 120B: 8002 (single active 120B service at a time)
Recommended stack: Qwen3.5 122B A10B + llama.cpp (Option D)

Authoritative artifacts to read first: docs/GX10_PORT_ASSIGNMENT.md, docs/evidence/gx10_runtime_baseline.json, ontology/configuration_management.ttl.

Required first step: Refresh runtime evidence with python3 scripts/capture_gx10_runtime_baseline.py, then run ./scripts/gx10_config_guard.py [--live] before any change. If it fails, fix the reported issues first.

Smoke test: 120B endpoint on gx10-83fb

Date: 2026-03-13
Endpoint: http://gx10-83fb.tail3dac72.ts.net:8002
Service: Qwen 122B A10B (llama.cpp) via systemd user unit qwen-122b

	#!/usr/bin/env python3
	"""
	Benchmark tokens per second and TTFT for the gx10 120B LLM endpoint (Qwen 122B).

	Calls POST /v1/chat/completions (non-streaming for throughput; optional streaming
	for TTFT). Supports multiple runs (mean ± std) and optional concurrent requests.

	Usage:
	python3 scripts/benchmark_120b_tokens_per_second.py [BASE_URL]
	python3 scripts/benchmark_120b_tokens_per_second.py --runs 5 --ttft --concurrent 2