A pretty sight which reminds me of city landscapes.
I always knew my homelab would have to be a full-size rack, and not one of those telecom racks. I work with lots of hardware, including some unique hardware platforms, and I work with it up close, so I knew early on that the entire setup had to be as quiet as possible.
In hindsight, I'm surprised just how well it panned out. I didn't expect that I could work next to this thing with no physical barriers, which came in handy the few times I had to plug in over the console. (Pro-tip: invest in a USB-C-to-serial cable, or two cables in a trenchcoat on top of one another, such as USB-C to RS232 and RS232 to RJ45.) I installed a Noctua industrial fan on top of the rack out back, aimed at my face at a 30° angle, before I ever had a kilowatt to dissipate; surprisingly, it continues to be a tremendous QoL improvement.
Excuse the mess!
I had pulled the bottom server off its rails and brought it up top for physical measurements and testing. The StarTech.com 18U rack cabinet is really amazing like that: I've really enjoyed wheeling it around 360°, as well as being able to hook up switches up front and out back, and it all fit nicely, too. Like all good things in life, this is a work-in-progress; most things are in really good shape, but I will also go over what capabilities are still missing, and how I will get there eventually.
Somebody warned me that the choices I made, and the equipment I brought in, may seem arcane both to SRE folks and to gamers, for whom liquid cooling is perhaps a bigger thing (or at least it used to be) than in conventional datacenters. I didn't start 30 months ago by mapping it all out: I knew that I really needed a root-of-trust server, and something to manage my home and work networks. It has since evolved to support more projects, infrastructure, and capabilities.
I like to think this rack is packed with cool stuff, so allow me to start by listing the big-picture, notable features.
Top to bottom:
- Ubiquiti EdgeRouter 8 Pro is an eight-port, OpenWrt-compatible, dual-core Gigabit router with modest hardware offloading chops, which works great for my /56 network (IPv6-PD) over GPON. I always prefer OpenWrt to proprietary networking firmware, and regularly-updated snapshot builds thereof for anything exposed to the Internet. This router will remain viable while I'm stuck with Gigabit and unable to upgrade to a 10G uplink. In the meantime, I will be tracking progress on the upcoming rackmount version of a cool 10G router project by Tomaž Zaman. Shout-out to his YouTube channel, where he's documented the whole process, from the idea of creating an open source, high-end 10G router from scratch, through prototyping, manufacturing, and certification!
- MikroTik CRS354 is the access switch for the various router interfaces, whatever patches come through out back, and some downstream PoE switches, workstations, IP cameras, and other sandpit VLANs around my place. MikroTiks are really cool! This switch has two 40G and four 10G ports, plus sophisticated L3HW capabilities: inter-VLAN routing, VXLAN, IPv6-PD, and BGP. The 10G ports are nice for 10G-over-copper hardware that supports it, such as the Mac Studio. On a different note: the Mac Studio supports jumbo frames at MTU 10218 and beyond, which is what I use in most of my segments (there's a quick sanity-check sketch for this right after the list).
- FS.com GPON ONU 1.244G/2.488G SFP stick, based on a Lantiq chipset and flashed with special firmware that allows root access and traffic hardening at the border between your kingdom and your ISP's. The green GPON optic cable is a common fiber deployment in residential areas. Keep in mind that you do not have to do this hack; every ISP using GPON technology will install an ONU free of charge. However, exercising control over the ONU may be to your network's benefit or detriment alike. Let's leave it at that. Refer to the Hack GPON website for more details.
- MikroTik CRS504 (visible out back, opposite the rightmost 40G access port) is a tidy little four-way 100G switch, the proverbial heart of this rack, pumping the vast majority of bandwidth-intensive routes at line rate. MikroTiks are really amazing! It wasn't always the case, but these L3HW-capable switches now support RoCE, VXLAN, and BGP. I didn't want to learn BGP at first, but once I realised that these MikroTik/Marvell switches do not support VTEPs (see: VXLAN terminology) for IPv6 underlays in hardware, baby, it was time to BGP, hard. This warrants a blog post of its own, but suffice to say BGP eventually allowed me to mostly avoid the L2 jazz for cloud-agnostic deployments, without having to give up segmentation and regardless of the downstream peer's physical location.
- Blackbird is my designated zero-trust IBM POWER9 server, built from repurposed Supermicro parts, dual redundant PSUs, and an OpenPOWER motherboard with an 8-core SMT4 CPU, originally sold as the Blackbird™ Secure Desktop by Raptor Computing in the US. Blackbird™ is technically a watered-down, single-socket version of Talos™. The OpenPOWER platform is arguably the most secure and transparent server platform in the world, and POWER9 remains the most advanced CPU to not include any proprietary blobs whatsoever. The POWER architecture (ppc64el) is well supported by Debian maintainers: you would be surprised just how much is available. Oh, and it has a great virtualization story: the POWER IOMMU is really, really good. In my rack, it acts as the root of trust, and has some extra responsibilities, such as providing a 42 TB HDD RAID6 array (controller in HBA mode). It has dual 25 GbE networking, courtesy of a Mellanox ConnectX-5. Most notably, it acts as the internal CA and permission server, courtesy of OpenBao (the open source fork of HashiCorp Vault) and Keto, an open source implementation of Google's Zanzibar.
- Rosa Siena (pictured opened up top) is the rack's powerhouse, based on the ASUS S14NA-U12 motherboard: a liquid-cooled, 48-core AMD EPYC 8434PN CPU, 384 GB DDR5, a Broadcom NetXtreme-E dual 25 GbE NIC capable of RoCE and VXLAN, two M.2 NVMe keys, PCIe 5.0 x16x16 + x8, and five x8 MCIO ports for NVMe expansion up to 10 disks. I installed an AMD Virtex™ UltraScale+™ VCU1525 FPGA with a custom water block (blower fans are annoying at the full 225W draw) and dual 100G ports exposing host DMA for experimental networking, courtesy of Corundum, an open hardware NIC design. I'm very happy with the Siena (Zen 4c) cores, and the PN variant specifically. I like my CPU to have many cores and bottom out at 155W, to make room for higher-power peripherals. I will be installing Tenstorrent Blackhole cards (64 GB, four 800G NICs, the liquid-cooled variant) as soon as they hit the market, so it helps that the be quiet! Dark Power Pro 13 PSU is rated for 1600W and has two 12V-2x6 connectors.
- Gembird UPS-RACK-2000VA is only rated for 1200W, which has so far sufficed, but it will soon need to be complemented by a second, higher-rated UPS to accommodate the growing power requirements of storage, AI, and networking accelerators as my homelab continues to evolve.
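Since several segments depend on that jumbo MTU actually being configured end to end, a quick sanity check on the Linux hosts saves head-scratching later. Here's a minimal sketch (Linux-only; the expected MTU is the one I happen to use, adjust to taste) that reads every interface's MTU from sysfs and flags anything below the jumbo size:

```python
from pathlib import Path

EXPECTED_MTU = 10218  # the jumbo MTU used in most of my segments

# /sys/class/net/<iface>/mtu is the kernel's authoritative value on Linux.
for iface in sorted(Path("/sys/class/net").iterdir()):
    mtu = int((iface / "mtu").read_text())
    marker = "ok" if mtu >= EXPECTED_MTU else "BELOW JUMBO"
    print(f"{iface.name:12} mtu={mtu:<6} {marker}")
```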
The 2023 Mac Studio (96 GB) is not present in the rack, but it's a big part of how I interact with it; besides a powerful GPU and lots of unified memory, it also has 10 GbE copper, and supports VLANs and jumbo frames. They say it's good for LLM inference, and it's true, but honestly, the M2 Max doesn't get enough credit for how immensely useful it is for virtualization: UTM is a way to run Windows and Linux VMs natively, and Rosetta 2 still works inside them! This is how I'm able to run Vivado on Apple Silicon, even though it only supports Linux and Windows on x86 systems.
📚 How to enable Rosetta 2 in Linux VMs powered by UTM
The networking alone deserves a separate blog post; besides what you see in the rack, there are multiple WireGuard peers: a few off-site locations with MikroTik routers, friends' and family's networks, as well as occasional cloud-agnostic deployments in Hetzner, AWS, and others.
I chose to embrace IPv6-PD, access VLANs, and BGP for downstream network segmentation and WireGuard peers.
A lot of networking advice, it feels like, is either outdated or largely ignores the capabilities of modern NICs and switches. For example, the Broadcom NetXtreme-E (25 GbE) NIC in my Rosa Siena server has hardware support for networking protocols such as RDMA over Converged Ethernet (RoCE) and VXLAN, and virtualization capabilities like SR-IOV over the IOMMU. This kind of NIC lends itself nicely to a very particular style of networking.
However, just because it supports something like VXLAN doesn't mean that I will end up using it.
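To give an idea of how low the barrier is: on Linux, spawning SR-IOV virtual functions on a NIC like this is a couple of sysfs writes. A hedged sketch, where the PCI address is a placeholder rather than my actual topology:

```python
from pathlib import Path

def enable_vfs(pci_addr: str, count: int) -> None:
    """Spawn `count` SR-IOV virtual functions on a PCI device via Linux sysfs."""
    dev = Path("/sys/bus/pci/devices") / pci_addr
    total = int((dev / "sriov_totalvfs").read_text())
    if count > total:
        raise ValueError(f"{pci_addr} only supports {total} VFs")
    # The kernel requires writing 0 before changing to a new non-zero VF count.
    (dev / "sriov_numvfs").write_text("0")
    (dev / "sriov_numvfs").write_text(str(count))

# Example: carve four VFs out of one 25 GbE port (address is hypothetical).
enable_vfs("0000:41:00.0", 4)
```

Each VF then shows up as its own PCIe function that can be handed to a VM through the IOMMU.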
I must admit, I was tempted to Keep It Stupid Simple in the L2 Neverland, but common sense got the better of me. First of all, and I know I said MikroTiks are great, but their VTEPs cannot do IPv6 underlays in hardware, and I'm really committed to IPv6. Besides: none of that crap is as good over WireGuard as good old BGP.
I will leave this one as an exercise to the reader.
Let's say one of the Gigabit Ethernet ports in your favourite Supermicro-like server is shared between the BMC and a host interface. So: one physical port, two separate MACs, differing by one byte. How do you put them on separate VLANs? Bonus points: how do you make sure that BMC traffic always lands in the priority queue, so that devices on the management VLAN are always reachable? Maybe it's obvious, but I had fun doing this in RouterOS.
📚 Modern CPUs have a backstage cast by Hugo Landau is a great introduction to the arcane ways of CPU management.
Modern CPUs, and by extension motherboards, are very complicated and needlessly opaque designs (covered under NDAs, as with AMD EPYC), involving special coprocessors such as the BMC, and intricate bring-up routines. In fact, you couldn't design a working motherboard for any modern CPU from publicly available datasheets even if you wanted to! Contrary to the industry norms established by Intel and AMD, IBM's POWER9 processors chain-load into the Petitboot environment from exclusively open source firmware, and Raptor's motherboard uses a public FPGA design for bring-up, paired with OpenBMC for board management. You get to build everything from scratch, imprint your own keys into the CPU's write-protected memory, and configure the bootloader's kernel however you like.
This will get you Secure Boot backed by hardware from first principles.
Better yet, OpenPOWER firmware has a more advanced security mode.
Trusted Boot will elevate your boot-time security model from passive (aka: verify and enforce) to proactive (aka: measure and record) operation, facilitated by the device in the Trusted Platform Module header. "When the system is booting in trusted mode, Trusted Boot MUST create artifacts during system boot to prove that a particular chain of events have happened during boot. Interested parties can subsequently assess the artifacts to check whether or not only trusted events happened and then make security decisions. These artifacts comprise a log of measurements and the digests extended into the TPM PCRs. Platform Configuration Registers (PCRs) are registers in the Trusted Platform Module (TPM) that are shielded from direct access by the CPU."
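The "measure and record" part boils down to a single primitive: a PCR can never be set directly, only extended. Conceptually, for a SHA-256 PCR bank it looks like the sketch below; the stage names are just illustrative labels for an OpenPOWER-style boot chain, not literal log entries:

```python
import hashlib

def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """TPM-style extend: new_PCR = H(old_PCR || H(measurement))."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()

# PCRs start zeroed; each boot stage measures the next one before handing over.
pcr = bytes(32)
for stage in (b"hostboot", b"skiboot", b"petitboot kernel"):
    pcr = pcr_extend(pcr, stage)

print(pcr.hex())  # reproducible only if the exact same stages ran in the same order
```

An attesting party replays the event log and checks that it reproduces the PCR values reported by the TPM.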
"Big time innovation! ISO 11889 (TPM standard) came out in 2003," says the skeptic.
So now your security rests with a proprietary TPM, right?
Well, that would have been the situation had NLnet not existed, but it does, so it's not! TwPM is an open hardware TPM based on a curious little ECP5 FPGA board called OrangeCrab. The project was partially funded through the NGI Assure Fund under EU research grant 957073. I happened to have one lying around, having previously experimented with another FPGA-based security device (visibly plugged in on one of my photos) called TKey, by Mullvad's hardware division Tillitis.
You should watch the FOSDEM 2023 presentation about TwPM if you wish to learn more!
It's very tempting to extend VLANs all over the place, or to do the opposite and erect overlays everywhere, or to just stick with whatever bad habit you already have. Zero trust methodology, instead, creates interesting constraints, and if you see them as exactly that, constraints, you will have a much better time figuring out your security model.
For example, all my management firmware and software requires TLS certificates.
There are established protocols for something like a MikroTik switch to pick up TLS certificates over the network. One way to accomplish this is to set up a CA (Certificate Authority) and have some software manage it for you. This is something that HashiCorp Vault makes easy, so I could pick up where it left off with OpenBao and get a bunch of things (identity, policy-making, audit) effectively for free.
Vault, and by extension OpenBao, deals with something called RBAC (Role-Based Access Control) policy.
It means that there are identities, like users or service accounts, and roles that these accounts may assume. Policies then determine which roles get to do what on which resources. Most zero trust-style permission servers rely on RBAC for policy-making. However, RBAC is far from perfect. Google's Zanzibar paper describes a more capable system based on relationships instead of roles; this approach is known as ReBAC, and among other things it succeeds in agentic AI environments where RBAC policies are ill-equipped to represent warrant-like behaviour, such as warranting my agent with some scope of my own permissions.
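OpenBao keeps Vault's HTTP API, so issuing a short-lived certificate for a switch or a BMC is a single authenticated call against the PKI secrets engine. A sketch with assumed values: the address, the `pki` mount, the role name, and the hostname are all placeholders for whatever your setup uses:

```python
import requests

BAO_ADDR = "https://blackbird.internal:8200"  # assumed OpenBao API address
TOKEN = "..."                                  # a token obtained via your auth method of choice

resp = requests.post(
    f"{BAO_ADDR}/v1/pki/issue/mgmt-devices",   # 'pki' mount and 'mgmt-devices' role are assumptions
    headers={"X-Vault-Token": TOKEN},
    json={"common_name": "crs354.mgmt.example.net", "ttl": "720h"},
    timeout=10,
)
resp.raise_for_status()
data = resp.json()["data"]
certificate, private_key, issuing_ca = (
    data["certificate"], data["private_key"], data["issuing_ca"],
)
# From here, push the bundle to the device (e.g. RouterOS /certificate import).
```

Short TTLs plus automated renewal are what make the "everything requires TLS" rule bearable.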
One successful open source implementation of Zanzibar is Keto.
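In Keto, the agent-delegation example from above is just a relation tuple plus a check. A rough sketch against Keto's REST API as shipped in recent versions, assuming the default read and write ports (4466 and 4467); the host, namespace, object, and subject names are all made up, and the endpoint paths may differ between Keto releases:

```python
import requests

WRITE_API = "http://keto.internal:4467"  # write (admin) API, assumed address
READ_API = "http://keto.internal:4466"   # read API, assumed address

# "ci-agent acts as my delegate on the nas-archive object."
relation = {
    "namespace": "storage",
    "object": "nas-archive",
    "relation": "delegate",
    "subject_id": "ci-agent",
}
requests.put(f"{WRITE_API}/admin/relation-tuples", json=relation, timeout=5).raise_for_status()

# Later, a service asks: is ci-agent allowed to act as a delegate here?
check = requests.get(f"{READ_API}/relation-tuples/check", params=relation, timeout=5)
print(check.json().get("allowed"))  # True once the tuple above exists
```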
Have you ever heard a Supermicro server in action? The mighty fan walls, the sweet whine of the little bastard counter-rotating fans in Supermicro's calling-card 1U redundant PSUs. This amount of noise in working quarters is simply untenable! I replaced the stock fans in the CPU cooler with slightly larger Noctua fans and dialed the fan wall back a little, but the little roaring PSU whiners in my Blackbird were a deal-breaker. I still have them as backups, but I have since replaced them with a similar Supermicro PSU model, the SQ-series (Super Quiet) redundant PSU. I was sad when they arrived from eBay and I realised that the edge-connector tabs in my PSU power-delivery backplane are a little too short compared to the SQ modules, so the newly-ordered units didn't fit. Half an hour of dremeling later, and voilà, fits like a glove! The 80 PLUS Titanium rating really shows: the SQ-series units are lost in the surrounding noise.
Let this be a lesson to all who have embarked on working in the presence of Supermicros.
I do actually own a second, development Blackbird—for experimenting with chassis security and Arctic Tern.
IBM SecuFirm socket mounting was never supposed to be liquid-cooled; at the time it was designed, it never occurred to anyone that liquid cooling would be cool. However, we live in Carl Woffenden's world, and in this world he has manufactured an AM5 adapter bracket, and was kind enough to share one with me!
I technically have the water block and the ability to install it, but I don't wish to see any more downtime on the server, and frankly, I currently don't see a good way to get liquid in and out of the Supermicro chassis. Maybe one day I will weld up a poor man's CDU and design brackets for liquid passthrough with some fancy quick-disconnect fittings, but today is not that day, so I think I will keep my 3U HSF assembly. What's more likely is that this testing kit will eventually be retired into a 2U server of some kind, which would require liquid cooling to meet the chassis clearance.
I believe this server build is a perfect, if unusual, case of bringing scrap and quality components together, crude and sophisticated approaches alike, to end up with something better than whatever you could find on the off-the-shelf market.
The essence of hacking!
This chassis was created from the body of a very old Chieftec 4U server. I bought it almost by accident, to sweeten the deal on the Supermicro, but I knew immediately that I would be throwing out the outdated disk boxes out front and replacing them with an internal radiator of some kind. I was determined to get into liquid cooling! To make the decision even easier, I just happened to find a listing for a VCU1525 FPGA with the exact specs I was looking for, AND a custom water block, so I wouldn't need to kill myself over the unbearable whining of the devilish little blower fans.
Unfortunately, the boxes were structural. I am now left with a radiator that just dangles there, and some fans zip-tied together into a wall, dangling behind the radiator. Thankfully, I have the perfect solution: I will use additive manufacturing, which is a fancy term for 3D printing, to create a load-bearing structure incorporating the radiator, appropriately-sized holes for the outgoing cooling lines, and most importantly, a 10-disk M.2 NVMe backplane above the radiator that will use up all five MCIO x8 ports on the motherboard (each x8 port splitting into two x4 NVMe drives):
EPYC 8004, also known as the Siena class of CPUs, first came out exactly two years ago, in September 2023.
You could make an argument that it's been surpassed by, say, the EPYC 9005 "Turin" generation of Zen 5 processors, which use the more conventional SP5 socket. Well, let's take a look at a 48-core Turin, the AMD EPYC 9475F. It's a 400W CPU clocked at 3.65-4.8 GHz. It has double the memory bandwidth, double the channels, double the L3 cache, and a true 512-bit AVX datapath, compared to the double-pumped 256-bit one in my modest 8434PN.
To realise my Turin vision, I would have to buy a 4,400 euro processor and at least six channels' worth of DDR5-6000 memory to match the 384 GB in Rosa Siena. All things considered, a fully-equipped Turin system with a motherboard reasonably able to use up the lanes will set me back 9,000 euros minimum.
For reference: in reality, the motherboard, CPU, and six 64 GB DDR5-4800 sticks for this server cost me 4,000 euros. Yes, it doesn't have a true 512-bit vector datapath, and it doesn't have as much memory bandwidth as the most recent systems, but it doesn't need to, as long as it can DMA packets to accelerators at line rate. Most importantly, it really doesn't need to dissipate 400W of heat; in the era of 300W and 450W accelerators, you don't want a CPU putting you dangerously close to 1.4 kW total dissipation. Common sense. Simple common sense.
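The back-of-the-envelope version of that common sense, with every figure being a rough assumption rather than a measurement:

```python
fpga = 225                 # VCU1525 at full tilt
future_accelerators = 600  # two ~300 W cards, some day
platform = 150             # RAM, NVMe, fans, NICs: a generous allowance

for name, cpu_tdp in (("EPYC 8434PN", 155), ("EPYC 9475F", 400)):
    total = cpu_tdp + fpga + future_accelerators + platform
    print(f"{name}: ~{total} W")  # ~1130 W vs ~1375 W against a 1600 W PSU
```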
PCIe 5.0 slots represent valuable real estate; this motherboard includes three such slots: x16x16x8.
I was very impressed by this motherboard early on because I knew it had a very capable dual 25 GbE Broadcom NIC to boot, and I also knew that the built-in M.2 keys and MCIO ports would cover all my NVMe needs without requiring a RAID card (RAID cards are so 2016!), which is great.
It's a shame the FPGA would have to occupy the dual-width slot, right?
Wrong! As it turns out, it's not only PCIe 3.0 x16 but also PCIe 4.0 x8-compatible; i.e., it's actually an x8x8 card with a physical x16 edge connector, but x8x8 nonetheless. The donor Chieftec came with a preinstalled crossbar, and it would be a shame not to use it!
One pleasant side-effect of liquid cooling is the vertical real estate it recoups by not requiring a tall CPU heatsink, so the crossbar comes in handy. The only issue with this ad-hoc PCIe card placement is having to deal with brackets: the FPGA in question exposes two 100G QSFP28 ports, which in the normal configuration are supposed to just stick out of the chassis. With the card suspended from the crossbar, the cables have to enter the chassis instead, which would be a big deal in a datacenter, but doesn't matter in the homecenter.
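Whether the card actually trained at the expected width and generation is easy to confirm from sysfs on Linux; a small sketch, with the PCI address being a hypothetical placeholder (find yours with lspci):

```python
from pathlib import Path

dev = Path("/sys/bus/pci/devices/0000:c1:00.0")  # hypothetical FPGA address

# The kernel exposes the negotiated and maximum PCIe link parameters per device.
print("speed:", (dev / "current_link_speed").read_text().strip())  # e.g. "16.0 GT/s PCIe"
print("width:", (dev / "current_link_width").read_text().strip())  # e.g. "8"
print("max:  ", (dev / "max_link_speed").read_text().strip(),
      "x" + (dev / "max_link_width").read_text().strip())
```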