@mgerdts
Last active September 17, 2024 13:05
bhyve on SmartOS

Introduction

The following options that aren't in the kvm brand should work:

  • com1, com2
    • Can be set to tty-like devices or socket,/some/path.
    • If both are unset, com1 defaults to /dev/zconsole and com2 defaults to /tmp/vm.ttyb.
  • bootrom
    • Should be set to /usr/share/bhyve/BHYVE_UEFI.fd or /usr/share/bhyve/BHYVE_UEFI_CSM.fd
    • Defaults to /usr/share/bhyve/BHYVE_UEFI_CSM.fd
  • bhyve-opts
    • Any extra options that should be passed to bhyve

Other things that are different:

  • bhyve only supports 16 vcpus.
  • There is no automatic configuration of networking, unless you are using an image with a couple fixes to cloud-init. Details on one such image are found below. If not using the fixed cloud-init, either configure the guest to have networking statically configured or use an external DHCP server.
  • VNC is not yet supported. You must use a text console. zlogin -C is your friend. vmadm console works too, but ^] seems not to work.

Bits

Source code

Code is in the dev-bhyve branches of illumos-joyent and smartos-live.

Platform Images

Pre-built platform images can be found at /mgerdts/public/bhyve-20180208.

Images

CentOS 7

You can install a CentOS 7 image that should work well with this with:

img=462d1d03-8457-e134-a408-cf9ea2b9be96
url=https://us-east.manta.joyent.com/mgerdts/public/bhyve/images/$img
for file in manifest.json disk0.zfs.gz; do
    curl -o $file $url/$file
done
imgadm install -m manifest.json -f disk0.zfs.gz
rm manifest.json disk0.zfs.gz

Image creation process below

Ubuntu 17.10

You can install an Ubuntu 17.10 image that should work well with this with:

img=38396fc7-2472-416b-e61b-d833b32bd088
url=https://us-east.manta.joyent.com/mgerdts/public/bhyve/images/$img
for file in manifest.json disk1.zfs.gz; do
    curl -o $file $url/$file
done
imgadm install -m manifest.json -f disk1.zfs.gz
rm manifest.json disk1.zfs.gz

Image creation process below

See b6.json below for an example of a file to use with vmadm create.

Ubuntu 14.04

img=209ec332-16e1-e47b-8e94-b2c57ec497e7
url=https://us-east.manta.joyent.com/mgerdts/public/bhyve/images/$img
for file in manifest.json disk0.zfs.gz; do
    curl -o $file $url/$file
done
imgadm install -m manifest.json -f disk0.zfs.gz
rm manifest.json disk0.zfs.gz

Image creation process below
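
The three download recipes above differ only in the image UUID and the disk file name, so they can be wrapped in one helper. A minimal sketch: fetch_bhyve_image and image_url are hypothetical names, not part of SmartOS, and the URL layout is copied from the commands above.

```shell
# Hypothetical helpers; the base URL is the one used in the recipes above.
base_url=https://us-east.manta.joyent.com/mgerdts/public/bhyve/images

# Print the URL of one file belonging to an image.
# usage: image_url <image-uuid> <file>
image_url() {
    printf '%s/%s/%s\n' "$base_url" "$1" "$2"
}

# Download the manifest and disk stream, install the image, then clean up.
# usage: fetch_bhyve_image <image-uuid> <disk-file>
fetch_bhyve_image() {
    img=$1
    disk=$2
    for file in manifest.json "$disk"; do
        # -f makes curl fail on HTTP errors instead of saving an error page
        curl -f -o "$file" "$(image_url "$img" "$file")" || return 1
    done
    imgadm install -m manifest.json -f "$disk" && rm manifest.json "$disk"
}
```

For example, fetch_bhyve_image 38396fc7-2472-416b-e61b-d833b32bd088 disk1.zfs.gz would correspond to the Ubuntu 17.10 recipe.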

It broke. Why?

Known problems

  • Most images won't configure networking automatically. See above.
  • VNC not yet supported
  • Larger memory allocations don't seem to set resource caps high enough to account for overhead. You may need to manually increase (or remove) values in the capped-memory resource with zonecfg.

It booted then immediately halted

See /zones/$uuid/root/tmp/zhyve.log.

vm_create: No such device or address

Have you run a kvm instance since your last reboot? To verify that you haven't, be sure you don't see this:

echo hvm_excl_holder::print | mdb -k
0xfffffffff83b6185 "SmartOS KVM"
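
If you want to script that check, something like the following could work. It is only a sketch: kvm_holds_hvm is a hypothetical name, and it matches nothing more than the "SmartOS KVM" string shown above.

```shell
# Succeeds (exit 0) when the text on stdin contains the "SmartOS KVM" holder
# string, i.e. KVM has claimed the hardware virtualization support.
kvm_holds_hvm() {
    grep -q '"SmartOS KVM"'
}

# On the host (assumption: run as root in the global zone):
#   echo hvm_excl_holder::print | mdb -k | kvm_holds_hvm &&
#       echo "kvm has been used since boot; bhyve will not start"
```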

Unable to setup memory (11)

Currently vmadm does not calculate the memory overhead required for bhyve properly. Work around with:

# zonecfg -z $u1
zonecfg:1d5b8e7c-c004-4f69-bf0c-c98918e35bd5> select capped-memory
zonecfg:1d5b8e7c-c004-4f69-bf0c-c98918e35bd5:capped-memory> info
capped-memory:
        [physical: 33G]
        [swap: 33G]
        [locked: 33G]
zonecfg:1d5b8e7c-c004-4f69-bf0c-c98918e35bd5:capped-memory> set physical=64g
zonecfg:1d5b8e7c-c004-4f69-bf0c-c98918e35bd5:capped-memory> set swap=64g
zonecfg:1d5b8e7c-c004-4f69-bf0c-c98918e35bd5:capped-memory> set locked=64g
zonecfg:1d5b8e7c-c004-4f69-bf0c-c98918e35bd5:capped-memory> end
zonecfg:1d5b8e7c-c004-4f69-bf0c-c98918e35bd5> exit
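
The same change can be made without the interactive session, because zonecfg accepts semicolon-separated subcommands as a single argument. A minimal sketch: raise_caps_cmd is a hypothetical helper that only prints the command so you can review it first. The size is whatever headroom you pick, not a computed overhead.

```shell
# Print a one-shot zonecfg command that sets all three memory caps.
# usage: raise_caps_cmd <zone-uuid> <size, e.g. 64g>
raise_caps_cmd() {
    printf 'zonecfg -z %s "select capped-memory; set physical=%s; set swap=%s; set locked=%s; end"\n' \
        "$1" "$2" "$2" "$2"
}
```

Running raise_caps_cmd $u1 64g prints the command for the zone used above; paste it into a root shell on the host once it looks right.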

32 vCPUs requested but only 16 available

bhyve is hard coded to support at most 16 vcpus. Change your configuration.
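
A pre-flight check before vmadm create can catch this earlier. A small sketch, assuming the limit of 16 stated above; check_vcpus is a hypothetical name.

```shell
# Limit taken from the note above; adjust if your platform image differs.
max_vcpus=16

# Fail with a message when the requested vcpu count exceeds the limit.
# usage: check_vcpus <count>
check_vcpus() {
    if [ "$1" -gt "$max_vcpus" ]; then
        echo "$1 vCPUs requested but only $max_vcpus available" >&2
        return 1
    fi
}
```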

virtual machine cannot be booted

This has been seen when no bootrom has been specified. Perhaps you converted a kvm zone to a bhyve zone and forgot to add this:

# zonecfg -z $u1 info attr name=bootrom
attr:
	name: bootrom
	type: string
	value: /usr/share/bhyve/BHYVE_UEFI_CSM.fd

The example above is for BIOS support. For UEFI, drop the _CSM.
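
The output above shows what the attribute looks like but not how to add it. A sketch of one way to do that with standard zonecfg attr syntax: bootrom_cmd is a hypothetical helper that prints the command rather than running it, and the two paths are the ones listed in the introduction.

```shell
# Print a zonecfg command that adds the bootrom attr to a zone.
# usage: bootrom_cmd <zone-uuid> <bios|uefi>
bootrom_cmd() {
    case $2 in
        bios) rom=/usr/share/bhyve/BHYVE_UEFI_CSM.fd ;;
        uefi) rom=/usr/share/bhyve/BHYVE_UEFI.fd ;;
        *) echo "usage: bootrom_cmd <zone-uuid> <bios|uefi>" >&2; return 1 ;;
    esac
    printf 'zonecfg -z %s "add attr; set name=bootrom; set type=string; set value=%s; end"\n' \
        "$1" "$rom"
}
```

If the zone already has a bootrom attr, the add will conflict with it; in that case select the existing attr and change its value instead.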

How do I mdb or truss this?

Inside the zone, there's one process (zhyve) started by the kernel. The easiest way to debug the process is to intercept the first system call with dtrace, then attach truss or mdb to the process.

Verbosely, with truss.

#! /usr/sbin/dtrace -ws

/*
 * Watch for the first system call performed by the zhyve program and
 * then trace that program with truss.
 */
syscall:::entry
/execname == "zhyve"/
{ 
        /* Stop the program */
        stop();
        /* Use truss to start tracing the program.  This causes the program to continue. */
        system("truss -t all -v all -w all -p %d", pid);
        /* Tell dtrace to exit once truss starts */
        exit(0);
}

Or a one-liner with mdb:

# dtrace -wn 'syscall:::entry / execname == "zhyve" / {stop(); system("mdb -p %d", pid); exit(0);}'
dtrace: description 'syscall:::entry ' matched 235 probes
dtrace: allowing destructive actions
CPU     ID                    FUNCTION:NAME
  0    795                 systeminfo:entry Loading modules: [ ld.so.1 ]
> $C
ffffbf7fffdff970 ld.so.1`sysinfo+0xa()
ffffbf7fffdffcb0 ld.so.1`setup+0xebd(ffffbf7fffdffe78, ffffbf7fffdffe80, 0, ffffbf7fffdfffd0, 1000, ffffbf7fef3a99c8)
ffffbf7fffdffdd0 ld.so.1`_setup+0x282(ffffbf7fffdffde0, 190)
ffffbf7fffdffe60 ld.so.1`_rt_boot+0x6c()
0000000000000001 0xffffbf7fffdfffc0()

Notice that this catches a system call that happens before main starts.

b6.json

{
  "alias": "b6",
  "brand": "bhyve",
  "resolvers": [
    "8.8.8.8",
    "8.8.4.4"
  ],
  "ram": "1024",
  "vcpus": "2",
  "nics": [
    {
      "nic_tag": "external",
      "ip": "172.26.17.206",
      "netmask": "255.255.255.0",
      "gateway": "172.26.17.1",
      "model": "virtio",
      "vlan_id": 3317,
      "primary": true
    }
  ],
  "disks": [
    {
      "image_uuid": "38396fc7-2472-416b-e61b-d833b32bd088",
      "boot": true,
      "model": "virtio"
    }
  ]
}

Creation of centos-7.4 image from centos-7.2 image

In host

Clone zones/centos-7.2@final to zones/centos-7.4

Present new zvol to existing bhyve guest b8

In b8

Change uuids

[root@7180e700-3cba-cb89-eb82-ff14a51a62b2 ~]# xfs_admin -U generate /dev/vde1
Clearing log and setting UUID
writing all SBs
new UUID = d0df243d-38c7-4e22-bbc8-cac5d785afd2

[root@7180e700-3cba-cb89-eb82-ff14a51a62b2 ~]# xfs_admin -U generate /dev/vde3
Clearing log and setting UUID
writing all SBs
new UUID = 019dcd15-1823-46b4-8035-91665c3144a8

[root@7180e700-3cba-cb89-eb82-ff14a51a62b2 ~]# swaplabel -U $uuid /dev/vde2
[root@7180e700-3cba-cb89-eb82-ff14a51a62b2 ~]# echo $uuid
458c8cd1-940d-4f6c-8d4c-c5118b885d11

Create altroot, then chroot

mount /dev/vde3 /mnt
mount /dev/vde1 /mnt/boot
for i in dev sys proc tmp; do mount --bind /$i /mnt/$i; done
chroot /mnt

Update OS (in altroot)

vi /etc/yum.repos.d/CentOS-Base.repo
  enabled=0 on updates
yum update
vi /etc/yum.repos.d/CentOS-Base.repo

Fix grub

  • vi /etc/default/grub, fix serial
  • grub2-mkconfig -o /boot/grub2/grub.cfg
  • vi /boot/grub2/grub.cfg - clear out 30_os-proper section

Fix fstab

  • using uuids above
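
If the fstab entries are of the UUID=... form, the edit can be done with sed instead of by hand. A sketch only: swap_fstab_uuid is a hypothetical helper, and you have to supply the old UUIDs yourself (they are not shown in this write-up).

```shell
# Replace one filesystem UUID with another in an fstab-style file.
# The file is rewritten through a temporary copy so that its permissions
# and ownership are kept.
# usage: swap_fstab_uuid <fstab> <old-uuid> <new-uuid>
swap_fstab_uuid() {
    sed "s/UUID=$2/UUID=$3/" "$1" > "$1.new" &&
        cat "$1.new" > "$1" &&
        rm "$1.new"
}
```

In the altroot this would be run once per filesystem, for example swap_fstab_uuid /etc/fstab <old-root-uuid> 019dcd15-1823-46b4-8035-91665c3144a8 for /dev/vde3.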

Fix /etc/issue

Centos 7.4

Get out of chroot, umount

awk '$2 ~ /^\/mnt/ { print "umount", $2}' /proc/mounts | sort -r | sh -ex

In host

snapshot

zfs snapshot zones/centos-7.4@final

Test

[root@emy-17 ~]# zonecfg -z $(vm test72)
zonecfg:7476c02c-662a-47b5-a9a8-b2995cb2e805> info device
device:
        match: /dev/zvol/rdsk/zones/centos-7.2
        property: (name=boot,value="true")
        property: (name=model,value="virtio")
        property: (name=media,value="disk")
        property: (name=image-size,value="10240")
        property: (name=image-uuid,value="209ec332-16e1-e47b-8e94-b2c57ec497e7")
zonecfg:7476c02c-662a-47b5-a9a8-b2995cb2e805> select device match=/dev/zvol/rdsk/zones/centos-7.2
zonecfg:7476c02c-662a-47b5-a9a8-b2995cb2e805:device> set match=/dev/zvol/rdsk/zones/centos-7.4
zonecfg:7476c02c-662a-47b5-a9a8-b2995cb2e805:device> end
zonecfg:7476c02c-662a-47b5-a9a8-b2995cb2e805> exit

Grab the console in one window, then in another:

[root@emy-17 ~]# vmadm start $vm
Successfully started VM 7476c02c-662a-47b5-a9a8-b2995cb2e805

In the console window, be sure that keyboard input is accepted by grub

Once booted, verify network and routes are configured:

7476c02c-662a-47b5-a9a8-b2995cb2e805 login: root
   __        .                   .
 _|  |_      | .-. .  . .-. :--. |-
|_    _|     ;|   ||  |(.-' |  | |
  |__|   `--'  `-' `;-| `-' '  ' `-'
                   /  ;  Instance (CentOS 7.4 20180322)
                   `-'   https://docs.joyent.com/images/linux/centos

[root@7476c02c-662a-47b5-a9a8-b2995cb2e805 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: net0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether d2:ce:d1:c1:9e:ae brd ff:ff:ff:ff:ff:ff
    inet 172.26.17.72/24 brd 172.26.17.255 scope global net0
       valid_lft forever preferred_lft forever
    inet6 fe80::d0ce:d1ff:fec1:9eae/64 scope link 
       valid_lft forever preferred_lft forever
[root@7476c02c-662a-47b5-a9a8-b2995cb2e805 ~]# ip r
default via 172.26.17.1 dev net0 
2.0.0.0/8 via 172.26.17.2 dev net0 
169.254.0.0/16 dev net0 scope link metric 1002 
172.26.17.0/24 dev net0 proto kernel scope link src 172.26.17.72 

[root@7476c02c-662a-47b5-a9a8-b2995cb2e805 ~]# mdata-get sdc:nics
[{"interface":"net0","mac":"d2:ce:d1:c1:9e:ae","vlan_id":3317,"nic_tag":"external","gateway":"172.26.17.1","gateways":["172.26.17.1"],"netmask":"255.255.255.0","ip":"172.26.17.72","ips":["172.26.17.72/24"],"model":"virtio","primary":true}]

[root@7476c02c-662a-47b5-a9a8-b2995cb2e805 ~]# mdata-get sdc:routes
[{"linklocal":false,"dst":"2.0.0.0/8","gateway":"172.26.17.2"}]

Generate image

[root@emy-17 /zones/images]# uuid=`uuidgen`
[root@emy-17 /zones/images]# mkdir $uuid
[root@emy-17 /zones/images]# cd $uuid
[root@emy-17 /zones/images/f5bbee50-ae32-c5f6-8018-9699f276c637]# imgadm list name=centos-bhyve-7.2
UUID                                  NAME              VERSION   OS     TYPE  PUB
07a95c22-a229-4825-feca-d1e813292904  centos-bhyve-7.2  20180321  linux  zvol  2018-03-22

[root@emy-17 /zones/images/f5bbee50-ae32-c5f6-8018-9699f276c637]# cp ../38396fc7-2472-416b-e61b-d833b32bd088/manifest.json .
[root@emy-17 /zones/images/f5bbee50-ae32-c5f6-8018-9699f276c637]# zfs send zones/cents-7.4@final | pigz > disk0.zfs.gz
cannot open 'zones/cents-7.4': dataset does not exist
[root@emy-17 /zones/images/f5bbee50-ae32-c5f6-8018-9699f276c637]# zfs send zones/centos-7.4@final | pigz > disk0.zfs.gz
[root@emy-17 /zones/images/f5bbee50-ae32-c5f6-8018-9699f276c637]# digest -a sha1 disk0.zfs.gz
d547549426d31d4a5d697045e7a0bd480e1b48eb
[root@emy-17 /zones/images/f5bbee50-ae32-c5f6-8018-9699f276c637]# ls -l disk0.zfs.gz
-rw-r--r--   1 root     root     974060316 Mar 22 19:37 disk0.zfs.gz
[root@emy-17 /zones/images/f5bbee50-ae32-c5f6-8018-9699f276c637]# vi manifest.json
Update:
- uuid
- name
- version
- sha1
- size
- description
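
The sha1 and size values for the manifest come from the digest and ls output above. A small sketch that prints both in one step: image_facts is a hypothetical helper, and the sha1sum fallback is an assumption for hosts without digest(1).

```shell
# Print the sha1 and the size in bytes of an image file, one per line.
# usage: image_facts <file>
image_facts() {
    if command -v digest >/dev/null 2>&1; then
        # illumos / SmartOS
        sha1=$(digest -a sha1 "$1")
    else
        # assumption: GNU coreutils sha1sum is available elsewhere
        sha1=$(sha1sum "$1" | awk '{print $1}')
    fi
    size=$(wc -c < "$1" | tr -d ' ')
    printf 'sha1 %s\nsize %s\n' "$sha1" "$size"
}
```

Running image_facts disk0.zfs.gz in the image directory gives the two values to paste into manifest.json.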

Install the image

imgadm install -m manifest.json -f disk0.zfs.gz

Test the image:

[root@emy-17 /zones/mg]# vmadm create -f test74.json
Successfully created VM 1f3adba3-4baf-c04b-a5ad-b99ee1790fe2
[root@emy-17 /zones/mg]# vmadm start $(vm test74)
Successfully started VM 1f3adba3-4baf-c04b-a5ad-b99ee1790fe2
[root@emy-17 /zones/mg]# cat test74.json
{
  "uuid": "1f3adba3-4baf-c04b-a5ad-b99ee1790fe2",
  "autoboot": false,
  "alias": "test74",
  "hostname": "test74",
  "vnc_port": 11074,
  "brand": "bhyve",
  "resolvers": [
    "8.8.8.8",
    "8.8.4.4"
  ],
  "ram": "2048",
  "vcpus": "2",
  "nics": [
    {
      "nic_tag": "external",
      "ip": "172.26.17.74",
      "netmask": "255.255.255.0",
      "gateway": "172.26.17.1",
      "model": "virtio",
      "vlan_id": 3317,
      "primary": true
    }
  ],
  "disks": [
    {
      "image_uuid": "f5bbee50-ae32-c5f6-8018-9699f276c637",
      "boot": true,
      "model": "virtio"
    }
  ]
}

Log in at the console, check networking.

I have an existing CentOS 7 bhyve instance and I will be adding another disk from the same image. There will be UUID collisions in /, /boot, and swap.

First boot the existing image (b8). Update grub to not use the FS UUID for finding /.

b8# echo GRUB_DISABLE_LINUX_UUID=true >> /etc/default/grub
b8# grub2-mkconfig -o /boot/grub2/grub.cfg
b8# halt

Add /dev/vdb as an ephemeral disk and /dev/vdc as a clone of the source image.

host# c7=66d919a8-132a-11e7-a7b8-5b99fa122880
host# zfs clone zones/$c7@final zones/centos7
host# zfs snapshot zones/centos7@pristine
host# vmadm update $b8 -f add_disk.json
host# vmadm update $b8 -f add_disk_centos_7.json
host# vmadm start $b8

Update UUIDs. /boot and swap may be on /dev/vdc because of duplicate UUIDs.

b8# umount /boot
b8# swapoff -a
b8# xfs_admin -U generate /dev/vdc1
b8# xfs_admin -U generate /dev/vdc3
b8# swaplabel -U `uuidgen` /dev/vdc2
b8# mount /boot
b8# swapon -a

Mount /dev/vdc at the alternate root so that it can be fixed. Copy an RPM that will be needed.

b8# mount /dev/vdc3 /mnt && mount /dev/vdc1 /mnt/boot && for i in dev sys proc; do mount --bind /$i /mnt/$i; done
b8# cp ~mgerdts/cloud-init/cloud-init-17.2+33.gd40a3dc0-1.el7.centos.noarch.rpm  /mnt/root

Fix up things in the alternate root.

b8# chroot /mnt
altroot# xfs_admin -u /dev/vdc1
UUID = a2655899-9dbb-492e-8b88-449f25aa177f

altroot# xfs_admin -u /dev/vdc3
UUID = edaf5ff0-ab10-4c87-b2bb-90c952b6cc36

altroot# swaplabel /dev/vdc2
UUID:  b58cb6ad-3309-4c12-af53-f04aa38602cd

altroot# vi /etc/fstab
    (set uuids to the new uuids)

altroot# cat >/etc/default/grub <<EOF
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="serial"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"
GRUB_CMDLINE_LINUX="tsc=reliable divider=10 plymouth.enable=0 console=ttyS0 crashkernel=auto"
GRUB_DISABLE_RECOVERY="false"
EOF

altroot# grub2-mkconfig -o /boot/grub2/grub.cfg

altroot# yum install -y /root/cloud-init-17.2+33.gd40a3dc0-1.el7.centos.noarch.rpm pyserial
altroot# echo 'datasource_list: [ SmartOS ]' > /etc/cloud/cloud.cfg.d/90_smartos.cfg

altroot# cd /etc/rc.d

altroot# rm -f rc.local
altroot# mv rc.local-backup rc.local

Unmount the alternate root.

b8# awk '$2 ~ /^\/mnt/ { print "umount", $2}' /proc/mounts | sort -r | sh -ex
+ umount /mnt/sys
+ umount /mnt/proc
+ umount /mnt/dev
+ umount /mnt/boot
+ umount /mnt

Back in the host, clone this disk to a test vm. Create an @final snapshot so that if all goes well, we archive what was tested rather than the test result.

host# zfs snapshot zones/centos7@try1
host# t=79062669-e229-e55d-960d-9b18d0fed8d0
host# zfs clone zones/centos7@try1 zones/$t-disk0
host# zfs snapshot zones/$t-disk0@final

Start the vm, be sure everything looks ok.

host# vmadm start $t; zlogin -C $t

Create the image.

host# vmadm stop $t
host# zfs send zones/$t-disk0@final | pigz > disk0.zfs.gz
host# ls -l disk0.zfs.gz
host# digest -a sha1 disk0.zfs.gz
host# imgadm info $c7 > manifest.json
host# vi manifest.json
host# imgadm create -m manifest.json -f disk0.zfs.gz

Image creation

This was created based on the Ubuntu Certified 17.10 image using roughly the following procedure. Note that my initial guest ($u1) has two disks already. The second disk is important, as cloud-init thinks that /dev/vdb is an ephemeral disk. It leaves /dev/vdc alone.

# u1=# uuid of an Ubuntu 17.10 kvm disk that I converted to bhyve
# uc=$(imgadm list -Ho uuid  name=ubuntu-certified-17.10)
# zfs clone zones/$uc@final zones/ubuntu1710

# vmadm update $u1 -f - <<EOF
{
  "add_disks": [
    {
      "path": "/dev/zvol/rdsk/zones/ubuntu1710",
      "nocreate": true,
      "boot": false,
      "model": "virtio",
      "block_size": 8192,
      "size": 1024
   }
 ]
}
EOF

# vmadm start $u1
# zlogin -C $u1

Now, in the guest

u1# mount /dev/vdc1 /mnt
u1# mount --bind /proc /mnt/proc
u1# mount --bind /sys /mnt/sys
u1# mount --bind /dev /mnt/dev
u1# cp /etc/default/grub /mnt/etc/default/grub
u1# cp ~mgerdts/cloud-init/cloud-init_all.deb /mnt/root/
u1# chroot /mnt
u1-chroot# update-grub
u1-chroot# dpkg -i /root/cloud-init_all.deb
u1-chroot# rm /root/cloud-init_all.deb
u1-chroot# exit
u1# umount /mnt/proc
u1# umount /mnt/sys
u1# umount /mnt/dev
u1# umount /mnt

In the host

# zfs snapshot zones/ubuntu1710@final
# zfs send zones/ubuntu1710@final | pigz > disk1.zfs.gz

mgerdts commented Sep 17, 2024

I no longer work for Joyent and have had no involvement with Bhyve and related images for a few years. You will probably find helpful community members at one of the places listed in https://docs.smartos.org/mailing-lists-and-irc/.

@matyi-szabolcs

Thanks, I'll check it out.
