I was looking for mini PCs with SFP+ and found a lot of fairly expensive small servers that were tempting. Then I got lucky and saw a new product coming out from Minisforum, the MS-01, which had everything I needed at a much lower price point.
I went with the Intel Core i9-13900H (14 cores / 20 threads) but I think any of the three CPU options would have been fine for my needs.
This thread has a lot of great recommendations on what is compatible.
I got six of these from eBay to run my Ceph OSDs against.
Went with 96GB, which isn't officially supported but works. There are many other options if you want to save some money. I ended up buying these on Newegg because Amazon had a 15-day wait.
Went with this for the boot drive because it was fast and cheap.
https://www.amazon.com/dp/B0CK39YR9V?psc=1&ref=ppx_yo2ov_dt_b_product_details
https://www.amazon.com/dp/B094STPLX3?ref=ppx_yo2ov_dt_b_product_details&th=1
For the Ceph ring network
Currently going with one per device; not sure if separating networks or link aggregation would be a good idea.
Navigate to Datacenter -> your server, then go to the next menu and select Updates -> Repositories
Now that the repositories are up to date we can pull packages and update the microcode, which is critical for this device to function (per [insert YouTube video here])
First we need to add something to one of the package sources:
In /etc/apt/sources.list, add non-free-firmware to the end of the line deb http://ftp.us.debian.org/debian bookworm main contrib
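The edited line should end up looking like this:
deb http://ftp.us.debian.org/debian bookworm main contrib non-free-firmware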
Save the file and take a snapshot of the microcode version with the following command:
grep 'stepping\|model\|microcode' /proc/cpuinfo
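If you'd rather diff it later than eyeball it, redirect the output to a file first (the filename is my own):
grep 'stepping\|model\|microcode' /proc/cpuinfo > ~/microcode-before.txt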
Now update the microcode by running:
$ apt clean
$ apt update
$ apt install intel-microcode
Reboot to apply the new microcode:
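Nothing fancy here:
reboot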
Once it comes back up run the grep again to see the new version of microcode:
grep 'stepping\|model\|microcode' /proc/cpuinfo
Confirm the version changed; you probably only need to check the first hex value, but I diffed the entire output:
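If you saved the before output to a file, the check is just (filenames are my own):
grep 'stepping\|model\|microcode' /proc/cpuinfo > ~/microcode-after.txt
diff ~/microcode-before.txt ~/microcode-after.txt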
This thread has some good benchmarks to compare against.
This file was edited
Then I did this
root@pve01:~# udevadm monitor
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent
KERNEL[732628.801886] remove /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_non_active0 (nvmem)
KERNEL[732628.801908] remove /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_active0 (nvmem)
KERNEL[732628.801923] remove /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1 (thunderbolt)
UDEV [732628.804386] remove /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_non_active0 (nvmem)
UDEV [732628.804544] remove /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_active0 (nvmem)
UDEV [732628.804706] remove /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1 (thunderbolt)
KERNEL[732628.805618] remove /devices/pci0000:00/0000:00:0d.3/domain1/1-0/1-1 (thunderbolt)
UDEV [732628.805780] remove /devices/pci0000:00/0000:00:0d.3/domain1/1-0/1-1 (thunderbolt)
KERNEL[732633.651350] add /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1 (thunderbolt)
UDEV [732633.654275] add /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1 (thunderbolt)
KERNEL[732633.662074] add /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_active0 (nvmem)
KERNEL[732633.662088] add /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_non_active0 (nvmem)
UDEV [732633.662430] add /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_active0 (nvmem)
UDEV [732633.662937] add /devices/pci0000:00/0000:00:0d.3/domain1/1-0/usb4_port1/1-0:1.1/nvm_non_active0 (nvmem)
KERNEL[732637.971167] change /1-1 (thunderbolt)
UDEV [732637.973575] change /1-1 (thunderbolt)
KERNEL[732638.991250] add /devices/pci0000:00/0000:00:0d.3/domain1/1-0/1-1 (thunderbolt)
UDEV [732638.991928] add /devices/pci0000:00/0000:00:0d.3/domain1/1-0/1-1 (thunderbolt)
Then I copy-pasted these:
nano /etc/systemd/network/00-thunderbolt0.link
[Match]
Path=pci-0000:00:0d.3
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en05
And for #2
nano /etc/systemd/network/00-thunderbolt1.link
[Match]
Path=pci-0000:00:0d.2
Driver=thunderbolt-net
[Link]
MACAddressPolicy=none
Name=en06
This part is still broken, but to work around it I added a few suggestions I found in the good gist to my interfaces file...
auto lo
iface lo inet loopback
# Begin thunderbolt edits
auto lo:0
iface lo:0 inet static
address 10.0.0.81/32
auto lo:6
iface lo:6 inet static
address fc00::81/128
# End thunderbolt edits
iface enp2s0f0np0 inet manual
auto vmbr0
iface vmbr0 inet static
address 192.168.0.34/24
gateway 192.168.0.1
bridge-ports enp2s0f0np0
bridge-stp off
bridge-fd 0
iface enp87s0 inet manual
iface enp90s0 inet manual
iface enp2s0f1np1 inet manual
iface wlp91s0 inet manual
# Begin thunderbolt edits
auto en05
allow-hotplug en05
iface en05 inet manual
mtu 65520
iface en05 inet6 manual
mtu 65520
auto en06
allow-hotplug en06
iface en06 inet manual
mtu 65520
iface en06 inet6 manual
mtu 65520
# End thunderbolt edits
source /etc/network/interfaces.d/*
# TB last line
post-up /usr/bin/systemctl reset-failed frr.service
post-up /usr/bin/systemctl restart frr.service
And then I added a super hacky cronjob at boot which fixed everything:
@reboot sleep 60 && /usr/bin/systemctl restart frr.service
HAOS dark magic - https://community.home-assistant.io/t/installing-home-assistant-os-using-proxmox-8/201835
We are going to have three workers, one fixed to each MS-01 using the host M.2, and one HA control plane (master) node which will not run pods but will make the control plane highly available, so if one of the fixed workers goes down it will be able to migrate pods.
Start with Debian. This is a good guide: https://i12bretro.github.io/tutorials/0191.html
Increasing storage is easy but decreasing can lead to catastrophic failure. Start small with storage!
Each MS-01 gets a beefy one:
And the MS-01 with only one VM (since we have Windows and HAOS) can get the control plane. Make sure to use the Ceph VM Disks volume created in the good gist.
Add the control plane node to Datacenter -> HA -> Add and select the group created with the gist
First add yourself to sudo so you can do anything:
su root
nano /etc/sudoers
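For reference, a full-sudo line in that file looks like this (the username is mine; the careful way to edit this file is with visudo):
thaynes ALL=(ALL:ALL) ALL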
I read that you need to install the QEMU guest agent, but it seems to already be there; to make sure, you can run:
sudo apt update
sudo apt install qemu-guest-agent -y
sudo systemctl enable qemu-guest-agent --now
The last command doesn't seem to do anything but I'm not worried yet...
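If you want to double check it, look at the service status (it may not come up fully until the QEMU Guest Agent option is enabled on the VM in Proxmox and the VM is rebooted):
sudo systemctl status qemu-guest-agent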
Enable qemu guest agent
Eject the ISO while you're at it:
Install NoMachine so we can copy-paste!
Since I couldn't copy-paste yet, I just opened Firefox in the VM and downloaded https://downloads.nomachine.com/download/?id=1
Then run the command on the page:
sudo dpkg -i nomachine_8.11.3_4_amd64.deb
At this point you might as well jump ahead and disable swap, since we should reboot and switch to NoMachine.
This is also a good time to fix the IP (or configure it as static in the VM) and add a DNS record so we don't need the IP:
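If you go the in-VM static route, a minimal sketch for /etc/network/interfaces on the Debian guest (the interface name and addresses are assumptions for my network; adjust for yours):
auto ens18
iface ens18 inet static
    address 192.168.0.40/24
    gateway 192.168.0.1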
Install docker here: https://docs.docker.com/engine/install/debian/
They say to remove conflicts but we should be clean:
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done
Then add the libraries:
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
Now everything is ready to install:
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
And test!
sudo docker run hello-world
Read about K8s and get ready: https://kubernetes.io/docs/setup/
Get kubectl going: https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/
I recommend following the official document but the commands I ran are documented here for comparison...
Make sure to use the amd64 packages for the Debian VMs. I ran these in ~/ but you may want to run them in Downloads.
Get installer:
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
Validate:
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256) kubectl" | sha256sum --check
OK:
Install it since the checksum is OK:
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
Validate
kubectl version --client
It's a bit confusing here because it says to run kubectl cluster-info, but we have not even begun to join a cluster. However, scroll further and we will find some useful things to run.
The autocomplete script should dump output if you run type _init_completion. If so, then run:
echo 'source <(kubectl completion bash)' >>~/.bashrc
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl > /dev/null
sudo chmod a+r /etc/bash_completion.d/kubectl
Then I installed the kubectl convert plugin just for kicks.
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl-convert"
Validate
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl-convert.sha256"
echo "$(cat kubectl-convert.sha256) kubectl-convert" | sha256sum --check
Install
sudo install -o root -g root -m 0755 kubectl-convert /usr/local/bin/kubectl-convert
Validate it installed:
kubectl convert --help
Cleanup all the stuff curl pulled:
rm kubectl-convert kubectl-convert.sha256
rm kubectl kubectl.sha256
Install weird docker shit: https://mirantis.github.io/cri-dockerd/
Modify wget below for latest release from here: https://github.com/Mirantis/cri-dockerd/releases
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.13/cri-dockerd-0.3.13.amd64.tgz
tar -xvf cri-dockerd-0.3.13.amd64.tgz
sudo mv cri-dockerd/cri-dockerd /usr/local/bin/
#Clean Up
rm -R cri-dockerd
rm cri-dockerd-0.3.13.amd64.tgz
Check it:
cri-dockerd --version
Now get it running:
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.service
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.socket
sudo mv cri-docker.socket cri-docker.service /etc/systemd/system/
sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
sudo systemctl daemon-reload
sudo systemctl enable cri-docker.service
sudo systemctl enable --now cri-docker.socket
Check
systemctl status cri-docker.socket
Get ready for Cluster!
Using this doc: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm
Don't worry about the port yet. Disable swap first:
Run lsblk and see swap.
Run free -h and see space for swap.
sudo nano /etc/fstab
Comment out the line with swap in it:
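The line you comment out will look something like this (the device path is an assumption; yours may be a UUID or a different LVM name):
# /dev/mapper/debian--vg-swap_1 none            swap    sw              0       0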
reboot
Run lsblk and see no more swap.
Run free -h and see 0B for swap.
Install all this shit: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
Get confused by cgroup drivers, hope all is well
Now onto this shit that is like the first shit but more confusing: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
Install kubelet and kubeadm:
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
sudo systemctl enable --now kubelet
More about cgroup drivers, just skip until we see if we need it...
Before we can init the cluster we need to install a pod network add-on.
Reference (general): https://k8s-docs.netlify.app/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
Another: https://theitbros.com/set-up-kubernetes-on-proxmox/
It gets a bit confusing here. I think flannel will be needed but that seems to come after the big one:
sudo kubeadm init
Something bad:
thaynes@kubevip:~$ sudo kubeadm init
Found multiple CRI endpoints on the host. Please define which one do you wish to use by setting the 'criSocket' field in the kubeadm configuration file: unix:///var/run/containerd/containerd.sock, unix:///var/run/cri-dockerd.sock
To see the stack trace of this error execute with --v=5 or higher
Try 2
sudo kubeadm init --cri-socket /var/run/cri-dockerd.sock
VICTORY
MISSED --pod-network-cidr=10.244.0.0/16
FIX:
Seeing nothing from kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}'? Do this:
Edit /etc/kubernetes/manifests/kube-controller-manager.yaml and add --allocate-node-cidrs=true and --cluster-cidr=10.244.0.0/16 under the command: section.
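For orientation, the relevant piece of that manifest ends up looking roughly like this (all the other flags in the real file are left out of this sketch):
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --cluster-cidr=10.244.0.0/16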
Then restart to take the config:
sudo systemctl restart kubelet
Verify all good:
sudo systemctl status kubelet
But back to the action:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join REDACTED
Guess it's flannel time!
https://github.com/flannel-io/flannel
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
That worked, now add the nodes. Just use the command it spits out but append:
--cri-socket /var/run/cri-dockerd.sock
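So each worker's join ends up shaped like this (the token and hash here are placeholders, not real values; use exactly what kubeadm init printed):
sudo kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash> --cri-socket /var/run/cri-dockerd.sock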
After some flannel troubleshooting we got healthy pods!
First off is the official dashboard:
https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/
# Add kubernetes-dashboard repository
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
# Deploy a Helm Release named "kubernetes-dashboard" using the kubernetes-dashboard chart
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard
It dumps out the goods:
thaynes@kubevip:~$ helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard
Release "kubernetes-dashboard" has been upgraded. Happy Helming!
NAME: kubernetes-dashboard
LAST DEPLOYED: Wed May 8 22:25:51 2024
NAMESPACE: kubernetes-dashboard
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
*************************************************************************************************
*** PLEASE BE PATIENT: Kubernetes Dashboard may need a few minutes to get up and become ready ***
*************************************************************************************************
Congratulations! You have just installed Kubernetes Dashboard in your cluster.
To access Dashboard run:
kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443
NOTE: In case port-forward command does not work, make sure that kong service name is correct.
Check the services in Kubernetes Dashboard namespace using:
kubectl -n kubernetes-dashboard get svc
Dashboard will be available at:
https://localhost:8443
First forward:
kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443
Then run:
kubectl -n kubernetes-dashboard get svc
It wants a token:
Grab one for the default account under the dashboard namespace:
kubectl -n kubernetes-dashboard create token default
And we're in!
But you can't see anything good with it, so make a user!
You need a file per code block from the dashboard docs:
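Roughly, the first two blocks are a ServiceAccount and a ClusterRoleBinding like this (there is also an optional Secret block for a long-lived token; double-check the official doc, this sketch is from memory):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
kubectl apply -f each file and the admin-user token command below should work.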
The secret didn't work but this did:
kubectl -n kubernetes-dashboard create token admin-user
Will figure out the secret later. Got CPU and memory working by adding some args that are hard to find in the docs:
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard --set=service.externalPort=8080,resources.limits.cpu=200m,metricsScraper.enabled=true
https://github.com/ceph/ceph-csi
Steps look alright: https://github.com/ceph/ceph-csi/blob/devel/docs/deploy-cephfs.md
Prereqs will take some time to figure out though...
Your Kubernetes cluster must allow privileged pods (i.e. --allow-privileged flag must be set to true for both the API server and the kubelet). Moreover, as stated in the mount propagation docs, the Docker daemon of the cluster nodes must allow shared mounts.
OK THIS IS IT! https://devopstales.github.io/kubernetes/k8s-cephfs-storage-with-csi-driver/
- It says allow-privileged is now the default (after 1.1 release) so we're gonna go create the CephFS
We did something like that for ISOs here https://gist.github.com/scyto/941b24efd1ac0bf9b3cd30c3fb1e5341
Add a new CephFS from a node
Name it and add as storage:
You should see it hook to each VM:
Now let's see if we can use it.
But first, some cool tools!
https://github.com/ahmetb/kubectx
sudo apt install kubectx
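A couple of quick usage examples (on Debian the kubectx package should also ship kubens):
kubectx                      # list contexts, or switch with: kubectx <context>
kubens kubernetes-dashboard  # make that namespace the default for kubectl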
Then install ceph
sudo apt-get install ceph
1 - give access to the public ceph network to the VM. Try jumbo frames if you use them.
2 - create ceph.conf and admin key file on vm.
3 - try to see ceph -s
4 - mount RBD or CephFS inside the vm
OK #1,
And it's fucked
Quick reload:
ifreload -a
First this:
1 - give access to the public ceph network to the VM. Try jumbo frames if you use them.
2 - create ceph.conf and admin key file on vm.
3 - try to see ceph -s
4 - mount RBD or CephFS inside the vm
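For steps 2 and 3, a rough sketch assuming you can scp from a Proxmox node (pve01 and the default Proxmox Ceph paths are assumptions; adjust for your cluster):
sudo mkdir -p /etc/ceph
sudo scp root@pve01:/etc/ceph/ceph.conf /etc/ceph/ceph.conf
sudo scp root@pve01:/etc/pve/priv/ceph.client.admin.keyring /etc/ceph/ceph.client.admin.keyring
sudo ceph -s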
Then this:
https://devopstales.github.io/kubernetes/k8s-cephfs-storage-with-csi-driver/
On flipping Ceph to private/public: https://www.reddit.com/r/Proxmox/comments/p7s8ne/change_ceph_network/