@NiklasGollenstede
Last active October 18, 2023 04:37
Ubuntu 20.04 on ZFS with LUKS/ZFS enc. + master PW/key and SSH, installation + cloning

Fully automated setup of Ubuntu 20.04 on ZFS as root file system, optionally with LUKS and/or ZFS native encryption via single master password/keyfile and SSH unlock.

Please read the content for more information.

Install Ubuntu 20.04 on ZFS

This is essentially:

  • an automated variant of the "official" OpenZFS guides for Ubuntu 20.04 on ZFS,
  • with master password and/or keyfile unlock for all disks and encryption modes (LUKS and/or ZFS), including root FS unlock via SSH (dropbear),
  • and with support for multiple bootable installations connected to one computer, i.e. dual boot or moving disks around.

In the ideal case, all that is required is some up-front configuration, and then running a single script to install and configure the entire bootable system.

The installation can be run from any Ubuntu 20.04 system, to which the target disks are connected. The most failsafe and therefore generally preferred option is to have only the necessary disks connected, boot a live USB/CD, and initiate the installation via SSH from a different machine. Advanced installation methods, like installing from or cloning an existing system, or using physical disks from within VMs, can be found below.

Status: ARCHIVED

I have been running this on my primary laptop and home server for about 1 to 1.5 years. Having ZFS as the base for your system is pretty nice. There is the perceived peace of mind of knowing that your data is safe, and the practical advantage when actually restoring something from a snapshot. Ubuntu's zsys integration unfortunately does not work well (for me). It is super slow and spammy (when using apt), I haven't used its abilities once, and it has caused headaches on multiple occasions. With Ubuntu, this seems to be as good as it gets; Ubuntu 21.10 hasn't seen much development towards ZFS since 20.04.

I am moving to NixOS (also with ZFS) now, and have migrated all but my desktop system(s). So far, NixOS seems to be much easier to customize heavily and still maintain long term.

Step 1: Prepare the Live System (example) {#preparation}

To install from an Ubuntu Desktop live USB/CD, boot it into "Try Ubuntu", press Ctrl+Alt+T, sudo su, passwd ubuntu, enter some password twice, apt-get install openssh-server -y, echo IP: $(hostname -I).
Alternatively, with an Ubuntu Server live USB/CD, boot up to the first prompt (and maybe look out for the SSH server keys printed at some point), press Ctrl+Alt+F2 to switch to tty2, sudo su, passwd ubuntu-server, enter some password twice, echo IP: $(hostname -I). The choice of the live system does not affect the resulting installation.

Now from the main system ssh-copy-id ubuntu(-server)@<host> (delete any stale known_hosts entry, then verify and accept the server key if prompted and acceptable), then ssh ubuntu(-server)@<host> -t sudo su. When doing this on a trusted, local network, the SSH server verification can also be skipped with ssh -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null ... (do not do this for remote systems, or once the system is installed and should have a constant key).

Make sure the date is reasonably accurate and that no ZFS pools are imported (or software installations may fail).
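
Condensed, the preparation and checks amount to something like this (a sketch of the commands above; the first block runs in the live session, the second on the installing machine):

# on the live system (desktop variant)
sudo su
passwd ubuntu                      # set a password for the SSH login
apt-get install -y openssh-server
echo IP: $(hostname -I)            # note the address to connect to

# on the installing machine
ssh-copy-id ubuntu@<host>
ssh ubuntu@<host> -t sudo su
date                               # verify the clock is reasonably accurate
zpool list                         # should report »no pools available«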

Step 2: Configuration {#configuration}

Before the actual installation, review these variables, then paste the content in a root (SSH) shell. Note that any disks specified will be reformatted and overwritten, potentially even if they are still in use.

# { . <(cat << "#EOF" # copy from after the first #

## Disks
dataDisks=( # disk(s) for the main pool; »ls -al /dev/disk/by-id« may be helpful
    /dev/disk/by-id/ata-...
    /dev/disk/by-id/ata-...
    /dev/disk/by-id/ata-...
);
raidzMode=raidz # optional raidz mode to put the 'dataDisks' in
cacheDisk=/dev/disk/by-id/nvme-... # optional (SSD) disk for swap and cache
swapSize=24G # size of the swap partition, on the cacheDisk (if given) *or* on every dataDisk
bpoolSize= # optional, bpool partition size on every dataDisk, defaults to 1G <= diskSize * 5% <= 4G
zpoolOptions=( # options for the creation of the root pool
    # -o ashift=12 # log2 of physical sector size (can be auto-detected)
    -O acltype=posixacl
    -O compression=lz4
    -O relatime=on -O atime=off
    -O dnodesize=auto -O normalization=formD -O xattr=sa
);

## System
# Name of the ZFS pool on the boot and main/root partition, respectively. The defaults (here and of the Ubuntu 20.04 installer) are »bpool« and »rpool«, but consider that it is quite a bad idea to ever have disks with two different pools of the same name connected to a single system. Even having a single non-booted »rpool«/»bpool« pair connected can cause issues. So if that situation is ever foreseeable, unique names should be chosen.
bpool=bp_one; rpool=rp_one # any two different names
# Random ID used to name the system datasets. Predefining this means knowing the value, but is otherwise not useful.
UUID= # optional: six digit lowercase hex
# Install via »debootstrap« by default. For more options, see section "Advanced Installation / Cloning" and the usage in the script.
osInstallScript='debootstrap' # optional: string constant or a bash script
newHostName=server # hostname of the new OS
installLocale=en_US # system locale to set
installTimeZone=Europe/Berlin # time zone to set

## Encryption
# Password for the master LUKS keystore used to store the keys for the encryption below.
#masterPassword= # optional: any string (should be set directly in the shell)
# Additional keyfile used as an alternative key for the master LUKS keystore.
masterKeyfile= # optional: »<devicePath>:<filePath>«, e.g.: »/dev/disk/by-label/KEYS:/keyfile«
# Whether to use block-level LUKS encryption on the data (and cache) partitions. The only unencrypted data is the bootloader, and the boot partition(s)/pool (with the kernel and initrd).
encryptLuks=yes # enum: non-empty => encrypt, empty => don't encrypt
# Whether to use ZFS native encryption on the root pool (and its caches). Compared to LUKS, the dataset descriptions (names, properties) are stored in plain text, but raw (i.e. still encrypted) ZFS send/receive for encrypted backups (in transfer and at rest) becomes possible.
encryptZfs=yes # enum: non-empty => encrypt, empty => don't encrypt
# Encryption mode for any swap selected above.
encryptSwap=yes # enum: »temp« => use volatile key (disabling hibernation), other non-empty => use persistent LUKS, empty => default to »encryptLuks || encryptZfs«
# Options for keystore, LUKS data (and cache), and persistent swap encryption.
luksCryptoOptions=( -c aes-xts-plain64 -s 512 -h sha256 )
# Options for ZFS native encryption.
zfsCryptoOptions=( -O encryption=aes-256-gcm )
#EOF
); }

Step 3: Installation {#installation}

To format and encrypt the disks and then install everything, including the single-step + SSH unlock (if any encryption is requested), run this script in the root shell that has the variables set:

# { (. <(tee /tmp/ubuntu-on-zfs.sh << "#EOF" # copy from after the first #
#!/usr/bin/env bash
set -eux

## (sanity) check and save configuration (except password)

bpool=${bpool:-bpool}; rpool=${rpool:-rpool}
if ( zpool list -H $bpool || zpool list -H $rpool ) &>/dev/null; then : "Pool name $bpool or $rpool already in use"; false; fi
UUID=${UUID:-$(dd if=/dev/urandom of=/dev/stdout bs=1 count=100 2>/dev/null | tr -dc 'a-z0-9' | cut -c-6)} # $(zsysctl list | grep -Po "rpool/ROOT/ubuntu_\K\S+")
osInstallScript=${osInstallScript:-debootstrap}

if [[ ( ! "$masterPassword" ) && ( "$masterKeyfile" || "$encryptLuks" || "$encryptZfs" || ( "$encryptSwap" && "$encryptSwap" != 'temp' ) ) ]]; then
    : 'Specifying any persistent encryption option requires the "$masterPassword" to be set'; false
fi
encryptLuks=${encryptLuks:+yes}
encryptZfs=${encryptZfs:+yes}
if [[ "$encryptSwap" && ! "$swapSize" ]]; then : 'No swap that could be encrypted'; false; fi
if [[ "$swapSize" ]]; then if [[ "$encryptSwap" != 'temp' ]]; then if [[ "$encryptLuks" || "$encryptZfs" ]]; then encryptSwap=yes; else encryptSwap=${encryptSwap:+yes}; fi; fi; fi
if [[ ! "$bpoolSize" ]]; then bpoolSize=$(($(blockdev --getsize64 $dataDisks) * 5 / 100 / 1024 / 1024)); bpoolSize=$((bpoolSize < 1024 ? 1024 : bpoolSize < 4096 ? bpoolSize : 4096))M; fi

mkdir -p /root/.config/ubuntu-on-zfs/

# "export" env vars for use below and documentation
printf '%s' "
## Disks
$(declare -p dataDisks)
$(declare -p raidzMode)
$(declare -p cacheDisk)
$(declare -p swapSize)
$(declare -p bpoolSize)
$(declare -p zpoolOptions)

## System
$(declare -p bpool)
$(declare -p rpool)
$(declare -p UUID)
$(declare -p osInstallScript)
$(declare -p newHostName)
$(declare -p installLocale)
$(declare -p installTimeZone)

## Encryption
declare -- masterPassword=${masterPassword:+'"<removed>"'}
$(declare -p masterKeyfile)
$(declare -p encryptLuks)
$(declare -p encryptZfs)
$(declare -p encryptSwap)
$(declare -p luksCryptoOptions)
$(declare -p zfsCryptoOptions)
" > /root/.config/ubuntu-on-zfs/ubuntu-on-zfs.env

# save this script
if [[ -e /tmp/ubuntu-on-zfs.sh ]]; then cp /tmp/ubuntu-on-zfs.sh /root/.config/ubuntu-on-zfs/ubuntu-on-zfs.sh; rm /tmp/ubuntu-on-zfs.sh; fi


## Install ZFS

apt-get update #&& apt-get upgrade -y # too many updates for the live system to keep in memory
export DEBIAN_FRONTEND=noninteractive
apt-get --yes install zfs-initramfs debootstrap gdisk mdadm
#service zed stop # TODO: Why exactly? This will cause issues when installing from an existing system.


## Partitioning

# wipe (RAID?) headers
for r in "${dataDisks[@]}"; do mdadm --zero-superblock --force "$r"; done
# clear partition tables
for r in "${dataDisks[@]}"; do sgdisk --zap-all "$r"; done
if [[ "$cacheDisk" ]]; then sgdisk --zap-all "$cacheDisk"; fi

if [[ "$encryptLuks" ]]; then rpoolPartType=8309; else rpoolPartType=BF00; fi

# for UEFI booting (data-part1)
for r in "${dataDisks[@]}"; do sgdisk -n1:1M:+512M           -t1:EF00 "$r"; done
# for keystore (data-part2)
if [[ "$masterPassword" ]]; then for r in "${dataDisks[@]}"; do
                               sgdisk -n2:0:+64M             -t2:8309 "$r";
done; fi
# for the boot pool (data-part3)
for r in "${dataDisks[@]}"; do sgdisk -n3:0:+${bpoolSize}    -t3:BE00 "$r"; done

if [[ "$cacheDisk" ]]; then
    # for rpool with LUKS (data-part4)
    for r in "${dataDisks[@]}"; do sgdisk -n4:0:0            -t4:$rpoolPartType "$r"; done
    # for encrypted swap (cache-part1)
    if [[ "$swapSize" ]]; then sgdisk -n1:0:+"$swapSize"     -t1:8200 "$cacheDisk"; fi
    # luks encrypted ZIL (cache-part2)
                               sgdisk -n2:0:+1G              -t2:8300 "$cacheDisk"
    # luks encrypted L2ARC (cache-part3)
                               sgdisk -n3:0:0                -t3:8300 "$cacheDisk"
else
    if [[ "$swapSize" ]]; then
        # for rpool with LUKS (data-part4)
        for r in "${dataDisks[@]}"; do sgdisk -n4:0:-"$swapSize" -t4:$rpoolPartType "$r"; done
        # for encrypted swap (data-part5)
        for r in "${dataDisks[@]}"; do sgdisk -n5:0:0            -t5:8200 -A5:set:63 "$r"; done
    else
        # for rpool with LUKS (data-part4)
        for r in "${dataDisks[@]}"; do sgdisk -n4:0:0            -t4:$rpoolPartType "$r"; done
    fi
fi
partprobe || true # there seems to be a race condition in partition scanning between the previous and the following commands; this seems to help


## LUKS encryption (or not)

keyfileDevice=$(printf '%s' "${masterKeyfile}" | cut -d: -f1); keyfilePath=$(printf '%s' "${masterKeyfile}" | cut -d: -f2)
mappedDataParts=(); mappedZilPart=''; mappedL2arcPart='';
if [[ "$cacheDisk" ]]; then mappedZilPart="${cacheDisk}-part2"; mappedL2arcPart="${cacheDisk}-part3"; fi

function luksSetup { # 1: rawDisk, 2: luksName
    </dev/urandom tr -dc 0-9a-f | head -c 64 > "$keystoreMount"/luks/"$2"
    cryptsetup --batch-mode luksFormat --key-file "$keystoreMount"/luks/"$2" --pbkdf=pbkdf2 --pbkdf-force-iterations=1000 "${luksCryptoOptions[@]}" "$1"
    cryptsetup --batch-mode luksOpen --key-file "$keystoreMount"/luks/"$2" "$1" "$2"
}
function mkmnt { # Makes a mount point that is read-only when not mounted. 1: path
    if mountpoint -q $1; then return 1; fi
    mkdir -p $1; chmod 000 $1; chattr +i $1
}

# create keystore (only on ${dataDisks[0]}-part2 for now)
if [[ "$masterPassword" ]]; then
    if [[ "$masterKeyfile" ]]; then mkdir /tmp/auto_unlocker; mount "$keyfileDevice" /tmp/auto_unlocker; fi
    printf '%s' "$masterPassword" | cryptsetup --batch-mode luksFormat --key-file - --pbkdf=pbkdf2 --pbkdf-force-iterations=1000 "${luksCryptoOptions[@]}" "${dataDisks[0]}-part2"
    if [[ "$masterKeyfile" ]]; then printf '%s' "$masterPassword" | cryptsetup --batch-mode luksAddKey --key-file - --pbkdf=pbkdf2 --pbkdf-force-iterations=1000 "${dataDisks[0]}-part2" "/tmp/auto_unlocker$keyfilePath"; fi
    printf '%s' "$masterPassword" | cryptsetup --batch-mode luksOpen --key-file - "${dataDisks[0]}-part2" "$rpool-keystore"
    if [[ "$masterKeyfile" ]]; then umount /tmp/auto_unlocker; rmdir /tmp/auto_unlocker; fi

    mkfs.fat "/dev/mapper/$rpool-keystore"
    keystoreMount=$(mktemp -d); mount -o nodev,umask=0077,fmask=0077,dmask=0077,rw /dev/mapper/$rpool-keystore "$keystoreMount"
    mkdir -p "$keystoreMount"/luks "$keystoreMount"/zfs
fi

if [[ "$encryptLuks" ]]; then
    for index in "${!dataDisks[@]}"; do disk="${dataDisks[$index]}";
        if [[ "$encryptLuks" ]]; then
            luksSetup "${disk}-part4" "$rpool-z${index}"
            mappedDataParts=("${mappedDataParts[@]}" "/dev/mapper/$rpool-z${index}")
        fi
    done
    if [[ "$cacheDisk" ]]; then
        luksSetup "${cacheDisk}-part2" "$rpool-zil";     mappedZilPart="/dev/mapper/$rpool-zil"
        luksSetup "${cacheDisk}-part3" "$rpool-l2arc"; mappedL2arcPart="/dev/mapper/$rpool-l2arc"
    fi
fi

if [[ "$encryptSwap" && "$encryptSwap" != 'temp' ]]; then
    if [[ "$cacheDisk" ]]; then
        luksSetup "${cacheDisk}-part1" "$rpool-swap"
        mkswap "/dev/mapper/$rpool-swap"
    else for index in "${!dataDisks[@]}"; do disk="${dataDisks[$index]}";
        luksSetup "${disk}-part5" "$rpool-swap${index}"
        mkswap "/dev/mapper/$rpool-swap${index}"
    done; fi
fi
if [[ ! "$encryptSwap" && "$swapSize" ]]; then
    if [[ "$cacheDisk" ]]; then
        mkswap "${cacheDisk}-part1"
    else for index in "${!dataDisks[@]}"; do disk="${dataDisks[$index]}";
        mkswap "${disk}-part5"
    done; fi
fi

if [[ "${#mappedDataParts[@]}" == '0' ]]; then mappedDataParts=("${dataDisks[@]/%/-part4}"); fi


## Create the boot pool (with only the features GRUB supports)

bootPoolCmd=(
    zpool create -o ashift=12 -d
    -o feature@async_destroy=enabled
    -o feature@bookmarks=enabled
    -o feature@embedded_data=enabled
    -o feature@empty_bpobj=enabled
    -o feature@enabled_txg=enabled
    -o feature@extensible_dataset=enabled
    -o feature@filesystem_limits=enabled
    -o feature@hole_birth=enabled
    -o feature@large_blocks=enabled
    -o feature@lz4_compress=enabled
    -o feature@spacemap_histogram=enabled
    #-o feature@userobj_accounting=enabled # this _was_ on the guide for (and works with) 18.04
    -o feature@zpool_checkpoint=enabled # and this is on the guide for 20.04
    -O acltype=posixacl -O canmount=off -O compression=lz4 -O devices=off
    -O normalization=formD -O relatime=on -O xattr=sa
    -O mountpoint=/boot -R /mnt
);
if [[ "$raidzMode" ]]; then bootPoolCmd=("${bootPoolCmd[@]}"
    $bpool "$raidzMode" "${dataDisks[@]/%/-part3}" -f
); else bootPoolCmd=("${bootPoolCmd[@]}"
    $bpool "${dataDisks[@]/%/-part3}" -f
); fi
"${bootPoolCmd[@]}"


## Create the root pool

dataPoolCmdPrefix=(
    zpool create "${zpoolOptions[@]}"
    -O canmount=off -O mountpoint=/ -R /mnt
);
if [[ "$raidzMode" ]]; then dataPoolCmdSuffix=(
    $rpool "$raidzMode" "${mappedDataParts[@]}" -f
); else dataPoolCmdSuffix=(
    $rpool "${mappedDataParts[@]}" -f
); fi
if [[ "$encryptZfs" ]]; then
    </dev/urandom tr -dc 0-9a-f | head -c 64 > "$keystoreMount"/zfs/rpool
    rpoolRawKey=$(cat "$keystoreMount"/zfs/rpool)
    "${dataPoolCmdPrefix[@]}" "${zfsCryptoOptions[@]}" -O keylocation=file://"$keystoreMount"/zfs/rpool -O keyformat=hex "${dataPoolCmdSuffix[@]}"
else
    "${dataPoolCmdPrefix[@]}" "${dataPoolCmdSuffix[@]}"
fi

if [[ "$masterPassword" ]]; then # create backups of (encrypted) keystore
    umount "/dev/mapper/$rpool-keystore"
    zfs create -V "$(blockdev --getsize64 "${dataDisks[0]}-part2")" $bpool/keystore
    sleep 1 # wait for device to exist
    dd status=none if="${dataDisks[0]}-part2" of=/dev/zvol/$bpool/keystore
    zfs set volmode=none $bpool/keystore # hide the zvol (and its snapshots) from /dev/
    for i in "${!dataDisks[@]}"; do if [[ "$i" != '0' ]]; then dd status=none if="${dataDisks[0]}-part2" of="${dataDisks[$i]}-part2"; fi; done
fi

if [[ "$cacheDisk" ]]; then # add caches
    zpool add $rpool log   "$mappedZilPart" -f
    zpool add $rpool cache "$mappedL2arcPart" -f
fi
zpool status


## Create Datasets

zfs create -o canmount=off -o mountpoint=none $rpool/ROOT
zfs create -o canmount=off -o mountpoint=none $bpool/BOOT

zfs create -o canmount=noauto -o mountpoint=/ \
    -o com.ubuntu.zsys:bootfs=yes \
    -o com.ubuntu.zsys:last-used=$(date +%s) \
    $rpool/ROOT/ubuntu_$UUID; zfs mount $rpool/ROOT/ubuntu_$UUID

# mkmnt /mnt/boot; # breaks booting (?)
zfs create -o canmount=noauto -o mountpoint=/boot \
    $bpool/BOOT/ubuntu_$UUID; zfs mount $bpool/BOOT/ubuntu_$UUID
mkmnt /mnt/boot/efi # mount defined later

mkmnt /mnt/etc                      ; zfs create $rpool/ROOT/ubuntu_$UUID/etc
mkmnt /mnt/home                     ; zfs create $rpool/ROOT/ubuntu_$UUID/home
mkmnt /mnt/opt                      ; zfs create $rpool/ROOT/ubuntu_$UUID/opt
mkmnt /mnt/snap                     ; zfs create $rpool/ROOT/ubuntu_$UUID/snap

mkmnt /mnt/srv; zfs create -o com.ubuntu.zsys:bootfs=no \
    $rpool/ROOT/ubuntu_$UUID/srv

#:     /mnt/usr; zfs create -o canmount=off -o com.ubuntu.zsys:bootfs=no \
#                                                 $rpool/ROOT/ubuntu_$UUID/usr
mkmnt /mnt/usr                      ; zfs create $rpool/ROOT/ubuntu_$UUID/usr
mkmnt /mnt/usr/local                ; zfs create $rpool/ROOT/ubuntu_$UUID/usr/local

#:     /mnt/var; zfs create -o canmount=off -o com.ubuntu.zsys:bootfs=no \
#                                                 $rpool/ROOT/ubuntu_$UUID/var
mkmnt /mnt/var                      ; zfs create $rpool/ROOT/ubuntu_$UUID/var
mkmnt /mnt/var/games                ; zfs create $rpool/ROOT/ubuntu_$UUID/var/games
mkmnt /mnt/var/lib                  ; zfs create $rpool/ROOT/ubuntu_$UUID/var/lib
mkmnt /mnt/var/lib/AccountsService  ; zfs create $rpool/ROOT/ubuntu_$UUID/var/lib/AccountsService
mkmnt /mnt/var/lib/apt              ; zfs create $rpool/ROOT/ubuntu_$UUID/var/lib/apt
mkmnt /mnt/var/lib/dpkg             ; zfs create $rpool/ROOT/ubuntu_$UUID/var/lib/dpkg
mkmnt /mnt/var/lib/NetworkManager   ; zfs create $rpool/ROOT/ubuntu_$UUID/var/lib/NetworkManager
mkmnt /mnt/var/log                  ; zfs create $rpool/ROOT/ubuntu_$UUID/var/log
mkmnt /mnt/var/mail                 ; zfs create $rpool/ROOT/ubuntu_$UUID/var/mail
mkmnt /mnt/var/snap                 ; zfs create $rpool/ROOT/ubuntu_$UUID/var/snap
mkmnt /mnt/var/spool                ; zfs create $rpool/ROOT/ubuntu_$UUID/var/spool
mkmnt /mnt/var/www                  ; zfs create $rpool/ROOT/ubuntu_$UUID/var/www

zfs create -o canmount=off $rpool/var; zfs create -o canmount=off $rpool/var/lib
mkmnt /mnt/var/lib/docker           ; zfs create $rpool/var/lib/docker

zfs create -o canmount=off -o mountpoint=/ \
    $rpool/USERDATA
mkmnt /mnt/root; zfs create -o canmount=on -o mountpoint=/root \
    -o com.ubuntu.zsys:bootfs-datasets=$rpool/ROOT/ubuntu_$UUID \
    $rpool/USERDATA/root_$UUID

# mkmnt /mnt/tmp # breaks the automatic import of bpool
zfs create -o sync=disabled $rpool/tmp; chmod 1777 /mnt/tmp

mkmnt /mnt/boot/grub
if [[ "${#dataDisks[@]}" -gt 1 ]]; then # for multiple disks, put /boot/grub on (in grub read only) zpool
    zfs create $bpool/BOOT/ubuntu_$UUID/grub
fi # otherwise, see bind mount below


## Install System

case "$osInstallScript" in
    interactive )
        : "ZFS file systems are ready and mounted on /mnt/. Copy/install a base system on it, then exit to resume with the the crypto/filesystem/bootloader configuration."
        : "Run »source /root/.config/ubuntu-on-zfs/ubuntu-on-zfs.env« to import the setup environment:"
        bash --login # --init-file <(printf '%s' '. /etc/bash.bashrc && . /root/.bashrc && . /root/.config/ubuntu-on-zfs/ubuntu-on-zfs.env')
    ;;
    debootstrap )
        debootstrap $(lsb_release --short --codename) /mnt
    ;;
    rsync-brpool|rsync-novirt )
        osSourceDir=${osSourceDir:-/}
        bold=$(mount | grep -oP '^\w+(?=/\S* on '$(<<< $osSourceDir grep -Po '^.*?(?=\/?$)')'/boot type zfs )')
        rold=$(mount | grep -oP '^\w+(?=/\S* on '$(<<< $osSourceDir grep -Po '^\/$|^.*?(?=\/?$)')' type zfs )')

        if [[ "$osInstallScript" == rsync-brpool ]]; then
            mount | grep -Po '^(?!('$bold'|'$rold')(?:\/\S+)?)\S+ on \K(\S+)' > /tmp/osclone-exclude
        else
            mount | grep -Po '\S+(?= type (?:proc|sysfs|devtmpfs|tmpfs|squashfs))' > /tmp/osclone-exclude
        fi
        ( e=0; rsync --archive --info=progress2 --exclude={/mnt/,/home/,/boot/efi/} $osSourceDir/ /mnt/ --exclude-from=/tmp/osclone-exclude || e=$? ; if [[ $e == 24 ]]; then exit 0; else exit $e; fi )

        snap=osclone-$(dd if=/dev/urandom of=/dev/stdout bs=1 count=100 2>/dev/null | tr -dc 'a-z0-9' | cut -c-6)
        zfs snapshot -r $rold/USERDATA@$snap
        ( set -ex ; for dataset in $(zfs list -rH -d 1 -t filesystem -o name $rold/USERDATA | grep -P '^(?!'$rold'/USERDATA(?:$|/root_))'); do
            zfs send --replicate --raw $dataset@$snap | zfs receive -F -e $rpool/USERDATA
        done )
        zfs destroy -r $rold/USERDATA@$snap
    ;;
    * )
        (. <(cat <<< "$osInstallScript")) # should copy system files from somewhere
    ;;
esac

zfs set devices=off $rpool # don't allow character or block devices on the file systems, where they might be owned by non-root users


## Link / copy some stuff

# copy SSH authorization
mkdir -pm0700 /mnt/root/.ssh
if [[ "${SUDO_USER:-}" ]]; then sudoHome=/home/"$SUDO_USER"; else sudoHome=/root; fi
if [[ -e "$sudoHome"/.ssh/authorized_keys ]]; then
    cp {"$sudoHome",/mnt/root}/.ssh/authorized_keys
    chown root:root /mnt/root/.ssh/authorized_keys
fi

# copy env vars for import
mkdir -p /mnt/root/.config/; rsync -a {,/mnt}/root/.config/ubuntu-on-zfs/

# prepare chroot
mkdir -p /mnt/dev  ; mountpoint -q /mnt/dev  || mount --rbind /dev  /mnt/dev
mkdir -p /mnt/proc ; mountpoint -q /mnt/proc || mount --rbind /proc /mnt/proc
mkdir -p /mnt/sys  ; mountpoint -q /mnt/sys  || mount --rbind /sys  /mnt/sys

# some diagnostics
lsblk
zpool status
zfs list
zpool version


## Enter chroot

#read -p "Check the above, then press Enter to continue or Ctrl+C to abort"
: chroot /mnt /bin/bash --login
 { set +x; chroot /mnt /bin/bash <(cat << '#EOS'
set -eux

. /root/.config/ubuntu-on-zfs/ubuntu-on-zfs.env # import env


## Network & login

# set hostname and DHCP
printf '%s\n' "$newHostName" > /etc/hostname
printf '%s\n' "127.0.1.1 $newHostName" >> /etc/hosts
printf '%s' "
network:
  version: 2
  ethernets:
    default:
      match: { name: '*' }
      dhcp4: true
" > /etc/netplan/01-eth-dhcp.yaml

# TODO: fixing this is required during the installation, but should be reverted afterwards (maybe rather create a tmpfs in /run/ and create the file at the target of the symlink in /run/; or bind mount the /run/ from the host?)
if [[ ! -e /etc/resolv.conf ]]; then # »-e« is also false for a dangling symlink into the (not mounted) /run/
    rm -f /etc/resolv.conf # remove the possibly dangling symlink
    printf '%s\n' 'nameserver 127.0.0.53' > /etc/resolv.conf
fi

# enable automatic login as root on the local console
mkdir -p /etc/systemd/system/getty@tty1.service.d/
printf '%s' '
[Service]
ExecStart=
ExecStart=-/sbin/agetty --noissue --autologin root %I $TERM
Type=idle
TTYVTDisallocate=no
' > /etc/systemd/system/getty@tty1.service.d/override.conf


## Install software

printf '%s' "
deb     http://archive.ubuntu.com/ubuntu $(lsb_release -sc) main restricted universe
deb-src http://archive.ubuntu.com/ubuntu $(lsb_release -sc) main restricted universe

deb     http://security.ubuntu.com/ubuntu $(lsb_release -sc)-security main restricted universe
deb-src http://security.ubuntu.com/ubuntu $(lsb_release -sc)-security main restricted universe

deb     http://archive.ubuntu.com/ubuntu $(lsb_release -sc)-updates main restricted universe
deb-src http://archive.ubuntu.com/ubuntu $(lsb_release -sc)-updates main restricted universe

deb     http://archive.ubuntu.com/ubuntu $(lsb_release -sc)-backports main restricted universe
deb-src http://archive.ubuntu.com/ubuntu $(lsb_release -sc)-backports main restricted universe
" > /etc/apt/sources.list
ln -s /proc/self/mounts /etc/mtab || true # link mount table for (ancient) backwards compatibility
export DEBIAN_FRONTEND=noninteractive
apt-get update

# set locale
printf '%s\n' "locales locales/default_environment_locale select ${installLocale}.UTF-8" | debconf-set-selections
printf '%s\n' "locales locales/locales_to_be_generated multiselect ${installLocale}.UTF-8 UTF-8" | debconf-set-selections
rm "/etc/locale.gen"
dpkg-reconfigure --frontend noninteractive locales

# set time zone
ln -fs "/usr/share/zoneinfo/${installTimeZone}" /etc/localtime
dpkg-reconfigure --frontend noninteractive tzdata

apt-get --yes install linux-image-generic zsys
dpkg --purge os-prober # only used for dual boot, but creates warnings in »update-grub«
apt-get --yes dist-upgrade
apt-get --yes install ubuntu-standard ${masterPassword:+dropbear-initramfs} nano curl patch zfs-initramfs cryptsetup dosfstools openssh-server grub-efi-amd64 grub-efi-amd64-signed shim-signed
apt-get --yes autoremove
apt-get update # for bash 'command not found' hints

addgroup --system lpadmin; addgroup --system lxd; addgroup --system sambashare


## Scripts for disk unlock on boot

# create unlock script: this is a bit hackish, but works reliably (so far)
keyfileDevice=$(printf '%s' "$masterKeyfile" | cut -d: -f1); keyfilePath=$(printf '%s' "$masterKeyfile" | cut -d: -f2)
keystoreDevice="${dataDisks[0]}-part2"
keystoreDevice=/dev/disk/by-uuid/$(blkid -s UUID -o value ${keystoreDevice})

unlockKeystoreScript=$(cat << '##EOS' | sed -r 's/^ {4}//' | sed -r "s|\\\$_keystoreDevice|$keystoreDevice|" | sed -r "s|\\\$_keyfileDevice|$keyfileDevice|" | sed -r "s|\\\$_keyfilePath|$keyfilePath|"
    #!/bin/sh
    set -eu

    ## Opens $keystoreDevice as /run/keystore to be used to unlock additional disks.
    ## To unlock $keystoreDevice, it first attempts to load $keyfilePath from $keyfileDevice
    ## as keyfile, then falls back to prompting for a master password.
    ## If set up and present, the entered password will be sent through a YubiKey.
    ## When opening the keystore during boot from initramfs, consider also
    ## including /scripts/init-bottom/lock-keystore.sh, which closes the keystore
    ## before leaving the initramfs phase.

    keystoreDevice=$_keystoreDevice
    keyfileDevice=$_keyfileDevice
    keyfilePath=$_keyfilePath
    timeout=0.1; retry=200

    promptPassword () { # 1: message
        local message="Unlocking keystore $keystoreDevice.\n${1:-}Enter passphrase: "
        local prompt; if [ -x /bin/plymouth ] && plymouth --ping; then
            prompt="plymouth ask-for-password --prompt"
            message=$(printf "$message")
        else
            prompt="/lib/cryptsetup/askpass"
        fi
        $prompt "$message"
    }

    message() {
        if [ -x /bin/plymouth ] && plymouth --ping; then
            plymouth message --text="$*"
        else
            echo "$@" >&2
        fi
    }

    tryYubiKey () { # 1: key
        local key="$1" ; local slot='' # initialize, so that »set -u« can't trip over an unset »$slot« below
        if command -v ykinfo >/dev/null && command -v ykchalresp >/dev/null ; then
            if   [ "$(ykinfo -q -2 2>/dev/null)" = '1' ] ; then slot=2 ;
            elif [ "$(ykinfo -q -1 2>/dev/null)" = '1' ] ; then slot=1 ; fi
            if [ "$slot" ] ; then
                message "Unsing slot $slot of detected YubiKey ..."
                key="$(ykchalresp -$slot "$1" 2>/dev/null || true)"
                if [ "$key" ] ; then message "Got response from Yubikey" ; fi
            fi
        fi
        printf '%s' "$key"
    }

    mountKeystore () { # 1: key
        key="$(tryYubiKey "$1")"
        if [ -z "$key" ]; then message "No/empty key for keystore"; exit 1; fi
        printf '%s' "$key" | /sbin/cryptsetup --batch-mode luksOpen --key-file - "$keystoreDevice" "keystore"
        mkdir -p /run/keystore; mount -o nodev,umask=0077,fmask=0077,dmask=0077,ro /dev/mapper/keystore /run/keystore
    }

    # allow using the script without actually having a master keyfile
    if [ -z "$keyfileDevice" ]; then
        key=$(promptPassword) # "No keyfile configured.\n"
        mountKeystore "$key"; exit
    fi

    # wait for device to show up
    c=0; while [ ! -b $keyfileDevice ]; do
        sleep $timeout; c=$(expr $c + 1); if [ $c -gt $retry ]; then c=0; break; fi;
    done

    # ask for password if device still doesn't exist
    if [ ! -b $keyfileDevice ]; then
        key=$(promptPassword "Unable to locate device '$keyfileDevice'.\n")
        mountKeystore "$key"; exit
    fi

    mkdir -p /tmp/keymount; mount $keyfileDevice /tmp/keymount

    # ask for password if key file doesn't exist
    if [ ! -e /tmp/keymount$keyfilePath ]; then
        umount /tmp/keymount; rmdir /tmp/keymount
        key=$(promptPassword "Could not find file '$keyfilePath' on mounted '$keyfileDevice'.\n")
    else
        key=$(cat /tmp/keymount$keyfilePath)
        umount /tmp/keymount; rmdir /tmp/keymount
    fi

    mountKeystore "$key"
##EOS
)
if [[ "$masterPassword" ]]; then

    # include the above script in initramfs, to be called by
    # the crypttab/LUKS and init-premount/ZFS unlock scripts
    printf '%s' "
#!/bin/sh
if [ \"\$1\" = prereqs ]; then echo ''; exit 0; fi

cat << '###EOS' > \"\$DESTDIR\"/usr/sbin/unlock-keystore.sh
$unlockKeystoreScript
###EOS
chmod +x \"\$DESTDIR\"/usr/sbin/unlock-keystore.sh
" > /etc/initramfs-tools/hooks/unlock-keystore.sh
    chmod +x /etc/initramfs-tools/hooks/unlock-keystore.sh

    # to be included as keyscript= in crypttab
    cat << '##EOS' | sed -r 's/^ {8}//' > /usr/local/sbin/crypttab-unlock-with-keystore.sh
        #!/bin/sh

        if [ ! -e /run/keystore/luks ]; then
            printf '%s\n' "Unlocking keystore for $CRYPTTAB_NAME ($CRYPTTAB_SOURCE)." 1>&2
            /usr/sbin/unlock-keystore.sh
        fi

        cat /run/keystore/luks/"$CRYPTTAB_NAME"
##EOS
    chmod +x /usr/local/sbin/crypttab-unlock-with-keystore.sh

    # auto load keystore when using (only) ZFS encryption
    if [[ "$encryptZfs" ]]; then
        cat << '##EOS' | sed -r 's/^ {12}//' > /etc/initramfs-tools/scripts/init-premount/zfs-unlock-with-keystore.sh
            #!/bin/sh
            if [ "$1" = prereqs ]; then echo 'dropbear'; exit 0; fi

            tries=3; c=1; while [ ! -e /run/keystore/zfs ]; do
                printf '%s\n' "Unlocking keystore for ZFS, attempt $c of $tries." 1>&2
                /usr/sbin/unlock-keystore.sh
                c=$(expr $c + 1); if [ $c -gt $tries ]; then break; fi;
            done
##EOS
        chmod +x /etc/initramfs-tools/scripts/init-premount/zfs-unlock-with-keystore.sh
    fi

    # close keystore before leaving initramfs
    cat << '##EOS' | sed -r 's/^ {8}//' > /etc/initramfs-tools/scripts/init-bottom/lock-keystore.sh
        #!/bin/sh
        if [ "$1" = prereqs ]; then echo ''; exit 0; fi

        printf "%s\n" "Unmounting and closing /dev/mapper/keystore post boot unlock." 1>&2
        umount /run/keystore || true
        umount /dev/mapper/keystore || true
        /sbin/cryptsetup close /dev/mapper/keystore || true
##EOS
    chmod +x /etc/initramfs-tools/scripts/init-bottom/lock-keystore.sh

    # re-configure SSH access to unlock root encryption
    rm /etc/dropbear-initramfs/* # will all be replaced
    printf '%s\n' 'DROPBEAR_OPTIONS="-p 22 -s -j -k -I 300"' > /etc/dropbear-initramfs/config
    # -p Listen on specified address and TCP port.
    # -s Disable password logins.
    # -j Disable local port forwarding.
    # -k Disable remote port forwarding.
    # -I Idle timeout in seconds.
    dropbearkey -t rsa -s 4096 -f /etc/dropbear-initramfs/dropbear_rsa_host_key # generate stronger key
    cp -a /root/.ssh/authorized_keys /etc/dropbear-initramfs/authorized_keys

    # replace cryptroot-unlock with a version that works with the keystore
    cat << '##EOS' | sed -r 's/^ {8}//' > /etc/initramfs-tools/hooks/replace-cryptroot-unlock
#!/bin/sh
if [ "$1" = prereqs ]; then echo ''; exit 0; fi

cat << '###EOS' | sed -r 's/^ {4}//' > "$DESTDIR/usr/bin/cryptroot-unlock"
    #!/bin/sh
    set -e

    if [ ! -e /run/keystore/luks ]; then
        echo 'Manually unlocking keystore.'
        /usr/sbin/unlock-keystore.sh
    else
        echo 'Keystore already unlocked.'
    fi

    printf '%s' 'Killing password prompts ...'
    kill -KILL $(ps | grep -m 1 '[p]lymouth ask-for-password --prompt' | awk '{print $1}') \
               $(ps | grep -m 1 '[/]lib/cryptsetup/askpass' | awk '{print $1}')
    printf '%s\n' ' done. Boot should resume now.'
###EOS
chmod +x "$DESTDIR/usr/bin/cryptroot-unlock"
##EOS
    chmod +x /etc/initramfs-tools/hooks/replace-cryptroot-unlock

fi


## crypttab & fstab

if [[ -e /etc/crypttab ]]; then mv /etc/crypttab{,.old}; fi
if [[ -e /etc/fstab ]]; then mv /etc/fstab{,.old}; fi
printf '%s\n' '# <target name> <source device> <key file> <options>' > /etc/crypttab
printf '%s\n' '# <block-dev> <mountpoint> <type> <options> <dump> <pass>' > /etc/fstab
if [[ "$masterPassword" ]]; then
    printf '%s\n' "#keystore /dev/zvol/bpool/keystore  none  luks,discard,noauto" >> /etc/crypttab
    printf '%s\n' "#/dev/mapper/keystore /run/keystore vfat nodev,umask=0077,fmask=0077,dmask=0077,ro,noauto 0 0" >> /etc/fstab
fi

crypttabOptions="luks,discard,initramfs,keyscript=/usr/local/sbin/crypttab-unlock-with-keystore.sh"

if [[ "$encryptLuks" ]]; then
    for index in "${!dataDisks[@]}"; do disk="${dataDisks[$index]}";
        printf '%s\n' "$rpool-z${index}     UUID=$(blkid -s UUID -o value ${disk}-part4)  none  $crypttabOptions" >> /etc/crypttab
    done
    if [[ "$cacheDisk" ]]; then
        printf '%s\n' "$rpool-zil    UUID=$(blkid -s UUID -o value ${cacheDisk}-part2)  none  $crypttabOptions" >> /etc/crypttab
        printf '%s\n' "$rpool-l2arc  UUID=$(blkid -s UUID -o value ${cacheDisk}-part3)  none  $crypttabOptions" >> /etc/crypttab
    fi
fi

if [[ "$swapSize" && "$masterPassword" ]]; then
    if [[ "$cacheDisk" ]]; then
        if [[ "$encryptSwap" != 'temp' ]]; then
            printf '%s\n' "$rpool-swap         UUID=$(blkid -s UUID -o value ${cacheDisk}-part1)  none  $crypttabOptions" >> /etc/crypttab
        else
            printf '%s\n' "$rpool-swap ${cacheDisk}-part1  /dev/urandom  swap,cipher=aes-xts-plain64,size=256" >> /etc/crypttab
        fi
        printf '%s\n' "/dev/mapper/$rpool-swap none swap sw 0 0" >> /etc/fstab
    else for index in "${!dataDisks[@]}"; do disk="${dataDisks[$index]}"
        if [[ "$encryptSwap" != 'temp' ]]; then
            printf '%s\n' "$rpool-swap${index}        UUID=$(blkid -s UUID -o value ${disk}-part5)  none  $crypttabOptions" >> /etc/crypttab
        else
            printf '%s\n' "$rpool-swap${index} ${disk}-part5  /dev/urandom  swap,cipher=aes-xts-plain64,size=256" >> /etc/crypttab
        fi
        printf '%s\n' "/dev/mapper/$rpool-swap${index} none swap sw 0 0" >> /etc/fstab
    done; fi
else if [[ "$swapSize" ]]; then
    if [[ "$cacheDisk" ]]; then
        printf '%s\n' "${cacheDisk}-part1 none swap sw 0 0" >> /etc/fstab
    else for index in "${!dataDisks[@]}"; do disk="${dataDisks[$index]}";
        printf '%s\n' "${disk}-part5 none swap sw 0 0" >> /etc/fstab
    done; fi
fi; fi

if [[ "$encryptZfs" ]]; then # set actual key file path
    zfs set keylocation=file:///run/keystore/zfs/rpool $rpool
fi


## GRUB Installation
# this only installs GRUB on the first disk, others come later

# install GRUB for UEFI booting
mkdosfs -F 32 -s 1 -n EFI ${dataDisks[0]}-part1
printf '%s\n' "PARTUUID=$(blkid -s PARTUUID -o value ${dataDisks[0]}-part1) /boot/efi vfat nodev,umask=0022,fmask=0022,dmask=0022 0 1" >> /etc/fstab
mkdir -p /boot/efi; mount /boot/efi
if [[ "${#dataDisks[@]}" -eq 1 ]]; then # for single disk, put /boot/grub on (in grub writable) efi partition
    mkdir -p /boot/efi/grub /boot/grub
    printf '%s\n' '/boot/efi/grub /boot/grub none defaults,bind 0 0' >> /etc/fstab
    mount /boot/grub
fi

# enable importing bpool
wget -qO- https://launchpadlibrarian.net/478315221/2150-fix-systemd-dependency-loops.patch \
| tee >(sha256sum -c <(echo 'e7b489260af7837bafb0a5ce10e22e758a0ac9dada8624b4318bb9eb1c579deb  -') || kill $$) \
| sed "s|/etc|/lib|;s|\.in$||" | (cd / ; patch -N -p1 || true)

[[ $(grub-probe /boot) == 'zfs' ]] && : 'GRUB recognized bpool'

update-initramfs -u -k all
# **Note:** When using LUKS, this will print "WARNING could not determine root device from /etc/fstab". This is because [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906).
# --> update-initramfs: Generating /boot/initrd.img-<version>-generic
# ... this is where things stop working with 19.10 ...

# grub config (the GRUB_CMDLINE_LINUX is relevant for ZFS)
GRUB_CMDLINE_LINUX_DEFAULT=""; GRUB_CMDLINE_LINUX="";
if [[ "$swapSize" && "$encryptSwap" != 'temp' ]]; then if [[ "$cacheDisk" ]]; then
    GRUB_CMDLINE_LINUX_DEFAULT+=" resume=/dev/mapper/$rpool-swap"
else
    GRUB_CMDLINE_LINUX_DEFAULT+=" resume=/dev/mapper/$rpool-swap0"
fi; fi
cat << '#EOC' | sed -r 's/^ {4}//' | sed -r "s|\\\$GRUB_CMDLINE_LINUX_DEFAULT|$GRUB_CMDLINE_LINUX_DEFAULT|" | sed -r "s|\\\$GRUB_CMDLINE_LINUX|$GRUB_CMDLINE_LINUX|" > /etc/default/grub
    # If you change this file, run 'update-grub' afterwards to update /boot/grub/grub.cfg.
    # For full documentation of the options in this file, see:
    #   info -f grub -n 'Simple configuration'

    GRUB_DEFAULT=0
    #GRUB_TIMEOUT_STYLE=hidden
    GRUB_TIMEOUT=5
    GRUB_RECORDFAIL_TIMEOUT=5
    GRUB_INIT_TUNE="480 440 1" # beep at grub start
    GRUB_TERMINAL=console
    #GRUB_GFXMODE=640x480 # resolution

    GRUB_DISTRIBUTOR=$(lsb_release -i -s 2> /dev/null || echo Debian)
    GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT"
    GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX"

    # Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
    #GRUB_DISABLE_LINUX_UUID=true
#EOC

# comment out the line »imported_pools=$(import_pools)«, because it can cause unexpected issues to just import (or fail at importing) any pool that happens to be connected
sed -e '/imported_pools=$(import_pools)/ s/^#*/#/g' -i /etc/grub.d/10_linux_zfs
#perl -i -pe 's/^(\s*imported_pools=[$][(]import_pools[)])/#${1}/' /etc/grub.d/10_linux_zfs

update-grub
# --> Generating grub configuration file ...
# --> Found linux image: /boot/vmlinuz-<version>-generic
# --> Found initrd image: /boot/initrd.img-<version>-generic
# --> done

grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=ubuntu --recheck --no-floppy

systemctl mask grub-initrd-fallback.service || true; # a temporary fix for some bug

# replicate GRUB
umount /boot/efi
for i in "${!dataDisks[@]}"; do if [[ $i != '0' ]]; then r=${dataDisks[$i]};
    dd if=${dataDisks[0]}-part1 of=${r}-part1
fi; done
mount /boot/efi


## Other ZFS stuff

# prepare automatic ZFS import/mount (this may fail to do anything, in which case manual mount on first boot is required)
if [[ -e /etc/zfs/zpool.cache ]]; then mv /etc/zfs/zpool.cache{,.old}; fi
if [[ -e /etc/zfs/zfs-list.cache ]]; then mv /etc/zfs/zfs-list.cache{,.old}; fi
mkdir -p /etc/zfs/zfs-list.cache
touch /etc/zfs/zfs-list.cache/$bpool
touch /etc/zfs/zfs-list.cache/$rpool
ln -sfT /usr/lib/zfs-linux/zed.d/history_event-zfs-list-cacher.sh /etc/zfs/zed.d/history_event-zfs-list-cacher.sh
( zed -F -p /tmp/zed.pid -s /tmp/zed.state -vf & pid=$! && zfs set canmount=noauto $rpool/ROOT/ubuntu_$UUID && sleep 30 && kill $pid ) && sleep 5
sed -Ei "s|/mnt/?|/|" /etc/zfs/zfs-list.cache/*

# avoid failing zfs-load-key-$rpool.service
mkdir -p /etc/systemd/system/zfs-load-key-$rpool.service.d/
cat << '##EOS' | sed -r 's/^ {4}//' | sed -r 's|\$rpool|'$rpool'|' > /etc/systemd/system/zfs-load-key-$rpool.service.d/override.conf
    [Service]
    ExecStart=
    ExecStart=/etc/systemd/system/zfs-load-key-$rpool.service.d/import-optional.sh
##EOS
cat << '##EOS' | sed -r 's/^ {4}//' | sed -r 's|\$rpool|'$rpool'|' > /etc/systemd/system/zfs-load-key-$rpool.service.d/import-optional.sh
    #!/bin/sh
    if [[ "$(/sbin/zfs get -H -o value keystatus $rpool)" -ne 'available' ]]; then
        /sbin/zfs load-key $rpool
    fi
##EOS
chmod +x /etc/systemd/system/zfs-load-key-$rpool.service.d/import-optional.sh
systemctl daemon-reload

# disable log compression
for file in "/etc/logrotate.d/"*; do
    perl -i -pe 's/(^\s*compress)/#$1/' "$file"
done

chmod 0600 /etc/zfs/zed.d/zed.rc # says it wants chmod 0600

# create first snapshot
update-initramfs -u -k all
zfs list -Ho name $bpool@install &>/dev/null || zfs snapshot -r $bpool@install
zfs list -Ho name $rpool@install &>/dev/null || zfs snapshot -r $rpool@install
update-grub

# some final diagnostics, then exit from the 'chroot' environment back to the host environment
cat /etc/default/grub
ls /boot/grub/*/zfs.mod
zfs list
cat /etc/crypttab
cat /etc/fstab
cat /etc/zfs/zfs-list.cache/$bpool
cat /etc/zfs/zfs-list.cache/$rpool
efibootmgr -v || true
: "Installation done, but still in the new system-s chroot. Make any additional adjustments, then exit:"
bash -l
#EOS
); set -x; }; # end chroot
: "Left chroot, but file systems are still mounted in the host system. Make any additional adjustments, then exit:"
bash -l

# cleanly unmount all filesystems
mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {}
zpool export $bpool $rpool

: "Installation done. Can reboot to new system now!"
#EOF
)); }

Now reboot and log in.

Notes

  • zpool status may report that $bpool (esp. when it's not called "bpool") can be upgraded. Don't do that, or GRUB won't be able to open it any more.
  • Ubuntu already periodically runs zpool scrub on the pools to check and fix stale data.
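
To check on pool health, or to run a scrub manually, the standard OpenZFS commands apply (using the root pool name as configured above):

zpool status -v $rpool   # health, last scrub result, and per-device errors
zpool scrub $rpool       # start a scrub in the background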

Troubleshooting

Common Issues {#common-issues}

  • Issue: After being used from a different system (e.g. the live system during installation), the pools fail to be imported on the next boot; the boot drops to the (initramfs) shell.
    Problem: The pools were not exported, or were last used with a different name.
    Solution: Import the pools manually, e.g. zpool import -a or zpool import rpone rptwo, then reset.
  • Issue: Network Time Synchronization fails, boot offers the "emergency mode" shell.
    Problem: The /etc/zfs/zfs-list.cache/ is broken, ZFS file systems apart from the / one were not mounted.
    Solution: Use the emergency shell and run rm -rf /tmp/, zfs mount -a, zpool import bpool, and zfs mount bpool/BOOT/ubuntu_a1b2c3 (see zfs list). Finally check df to list the correctly mounted filesystems. Then exit to resume booting, and afterwards update-initramfs -u && update-grub to persistently fix the issue.
  • Issue: Boot gets stuck waiting for /dev/mapper/$rpool-swapX.
    Problem: [[no idea]]
    Solution: Just booting again may help.
    Problem: The decrypted swap device was not correctly formatted with mkswap.
    Solution: Format the opened swap device with mkswap.
  • Issue: The only option offered by GRUB is to enter the firmware settings.
    Problem: Apparently the GRUB settings are broken or gone.
    Solution: Press Esc before the timer runs out, then in the GRUB> shell, run linux (hd?,3)/BOOT/ubuntu_??????/@/vmlinuz root=ZFS=rpool/ROOT/ubuntu_??????, initrd (hd?,3)/BOOT/ubuntu_??????/@/initrd.img, boot (collected as a block after this list). The boot should resume as normal. Now fix the menu with update-grub.
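
For readability, the same GRUB> shell rescue commands as a block (the hd? index and the ubuntu_?????? suffix have to be filled in, e.g. via tab completion, and rpool is the configured root pool name):

linux (hd?,3)/BOOT/ubuntu_??????/@/vmlinuz root=ZFS=rpool/ROOT/ubuntu_??????
initrd (hd?,3)/BOOT/ubuntu_??????/@/initrd.img
boot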

Manual Mounting {#manual-mounting}

If the system doesn't boot (anymore), or to manipulate the non-booted system for other reasons, prepare a live system, and set (at least) the variables dataDisks, cacheDisk, bpool, rpool, UUID, encryptLuks, and encryptZfs from Configuration.

Now the following can be run to import the bpool+rpool, and to mount the file system (at $rootMount=/mnt/):

# { (. <(cat << "#EOF"
set -eux

: ${dataDisks} ${cacheDisk}
: ${bpool} ${rpool} ${UUID}
: ${encryptLuks} ${encryptZfs}

add-apt-repository universe && apt-get update
apt-get install --yes zfs-initramfs gdisk yubikey-personalization
# reload ZFS module without restart (of the live system)
#service zed stop; modprobe -r zfs; modprobe zfs; service zed start # (this should not be required on 20.04?)

# bpool import/rename
zpool import -N -R ${rootMount:-/mnt} $bpool -f

# unlock and mount keystore (from bpool)
( set -ex; if [[ "$encryptLuks" || "$encryptZfs" ]]; then
    zfs set volmode=geom $bpool/keystore; partprobe || true # "mount" the zvol; should revert the volmode= later, see below!
    cryptsetup luksOpen /dev/zvol/$bpool/keystore $rpool-keystore # when using a YubiKey with the unlock script, use the output of "ykchalresp -2 <passwd>" as password here
    mkdir -p /run/$rpool-keystore; mount -o nodev,umask=0077,fmask=0077,dmask=0077,rw /dev/mapper/$rpool-keystore /run/$rpool-keystore
fi )

# LUKS unlock
( set -ex; if [[ "$encryptLuks" ]]; then
    function luksSetup { # 1: rawDisk, 2: luksName
        cryptsetup --batch-mode luksOpen --key-file /run/$rpool-keystore/luks/"$2" "$1" "$2_$rpool"
    }
    for index in "${!dataDisks[@]}"; do disk=${dataDisks[$index]};
        if [[ "$encryptLuks" ]]; then
            luksSetup ${disk}-part4 $rpool-z${index}
        fi
    done
    if [[ "$cacheDisk" ]]; then
        luksSetup ${cacheDisk}-part2 $rpool-zil
        luksSetup ${cacheDisk}-part3 $rpool-l2arc
    fi
fi )

# rpool import/rename/unlock
zpool import -N -R ${rootMount:-/mnt} $rpool -f
if [[ "$encryptZfs" ]]; then
    zfs load-key -L file:///run/$rpool-keystore/zfs/rpool $rpool
fi

# mount root and boot; these have canmount=noauto, but very much need to be mounted
zfs mount $rpool/ROOT/ubuntu_$UUID && zfs mount $bpool/BOOT/ubuntu_$UUID
# mount all descendants of rpool/ROOT/ubuntu_$UUID and any datasets directly in rpool/, if they have rpool as encroot and are auto-mounted
zfs list -rH -o encryptionroot,canmount,name $rpool | grep -Po '^(?:-|'$rpool')\ton\t\K'$rpool'/(ROOT/'ubuntu_$UUID'/|(?!ROOT/)).*' | xargs -i{} zfs mount {}
zfs list -rH -o encryptionroot,canmount,name $bpool | grep -Po '^(?:-|'$bpool')\ton\t\K'$bpool'/(BOOT/'ubuntu_$UUID'/|(?!BOOT/)).*' | xargs -i{} zfs mount {}
# could also do »zfs mount -o ro«

#EOF
)); }

Manual Fixing

After the manual import above, run this to enter a chroot to do manual fixing of the full system boot:

# { (. <(cat << "#EOF"
set -eux

# enter chroot
mount --rbind /dev  ${rootMount:-/mnt}/dev
mount --rbind /proc ${rootMount:-/mnt}/proc
mount --rbind /sys  ${rootMount:-/mnt}/sys
chroot ${rootMount:-/mnt} /bin/bash -c 'mount -a || mount -a || mount -a' # mount from (${rootMount:-/mnt})/etc/fstab
zfs mount -a # mount remaining ZFSs (TODO: this should not be necessary with 20.04+)
#chroot ${rootMount:-/mnt} /bin/bash --login
chroot ${rootMount:-/mnt} /bin/bash << '#EOS'

## do fixing
bash --login

update-initramfs -u -k all; update-grub

exit # from chroot
#EOS

# unmount and export everything
mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {}
umount /dev/mapper/$rpool-keystore; zfs set volmode=none $bpool/keystore
fuser -ki /mnt/
zpool export $bpool $rpool

#EOF
)); }

Advanced Installation / Cloning {#advanced}

The installer allows for the base installation of the new system to come from essentially any source, via the osInstallScript variable. This can be used to clone an existing installation to a different disk/pool/encryption/LUKS/key/swap setup, where the new topology can be chosen completely independently of the previous one, since all the /etc/{fs,crypt}tab and zpool importing will be re/written.

If the clone is to use a different set of names (zpools, LUKS devices), which in many situations is advisable anyway, the cloning could be done from within the system that is to be cloned. However, to limit the potential for naming conflicts and other oddities, the cloning should usually rather be done from a system other than the one being cloned.
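
As a minimal sketch of a custom osInstallScript: the string is sourced by the installer (with the configuration variables in scope) and must leave a base system below /mnt/. The tarball path here is a hypothetical example:

osInstallScript='
set -eux
# unpack a previously captured base system (assumption: the archive exists and contains a complete root FS)
tar --xattrs -xpzf /root/base-system.tar.gz -C /mnt/
'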

rsync into VirtualBox {#via-vbox}

This method uses a VM on the clone source to write to a disk passed through to it, copying via SSH. The advantage is that only the actual target disk will be available for writing, and that the clone source can remain operational, but copying a running system can lead to file system inconsistencies in the clone.

To do this with VirtualBox, chown the target block device (e.g. /dev/sdX, also works for a multi-disk setup) to the user running VBox, create a raw disk (.vmdk) wrapping that device, attach it to an empty VM, do not create any snapshots (even setting the disk to passthrough does not (always?) exclude it from snapshots), and boot into the live environment. Now, configure the installation with the VM-internal names of the disks as target, and osInstallScript=interactive.
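
The raw disk wrapping might look like this (a sketch; the device path, file name, VM name, and storage controller name are hypothetical and need to match the actual setup):

sudo chown $USER /dev/sdX # hand the whole device to the user running VirtualBox
VBoxManage internalcommands createrawvmdk -filename ~/target-raw.vmdk -rawdisk /dev/sdX
VBoxManage storageattach cloneVM --storagectl SATA --port 1 --device 0 --type hdd --medium ~/target-raw.vmdk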

Start the installation and wait for the prompt to copy over the OS, do so via rsync, e.g.: (as root) rsync --archive --info=progress2 --exclude={/mnt/,/home/} / root@localhost:/mnt/ --exclude-from=<(mount | grep -Po '^(?!(bpool|rpool)(?:\/\S+)?)\S+ on \K(\S+)') -e 'ssh -p 12322 -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -i /home/user/.ssh/id_ed25519'.

Then exit and let the installation resume. Rebooting in VBox after the installation should now work, possibly with some solvable issues on first boot. Finally, shut down the VM and host, or connect the target disk to a different computer, and boot into the clone!

rsync from Imported Pool

To get a more filesystem-consistent clone, shut down the source system, prepare a live system, and manually mount the source pools at $rootMount=/os-source/. Now, configure the installation with the appropriate target disk names as target, and osInstallScript=interactive.

Start the installation and wait for the prompt to copy over the OS, do so via rsync, e.g.: (as root) rsync --archive --info=progress2 /os-source/ /mnt/. Note that while this will copy the /home/ directory (unless --exclude=/home/ is added), it does not create any additional datasets there, and it will also never copy the data of any separately encrypted datasets (unless they were explicitly unlocked). This should be addressed later (TODO: elaborate).

ZFS's update-grub plugin gets confused when other system pools are imported, so unmount and export the source pool (see the sketch below) before exiting and letting the installation resume. Rebooting into the clone after the installation should now work, possibly with some solvable issues on first boot.
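
Unmounting and exporting the source before resuming could look like this (a sketch; the source pool names are whatever was chosen during the manual import):

mount | grep /os-source | tac | awk '{print $3}' | xargs -i{} umount -lf {}
zpool export <source-bpool> <source-rpool>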

Misc

Here are some useful snippets to run after the first reboot, or later on:

## add fallback EFI entries
# TODO: for 20.04, see 4.7 and 5.6 at <https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubuntu/Ubuntu%2020.04%20Root%20on%20ZFS.html>
( set -eux; for index in "${!dataDisks[@]}"; do if [[ $index != '0' ]]; then r="${dataDisks[$index]}";
    efibootmgr -c -g -d "${r}" -p 1 -L "ubuntu-z${index}" -l '\EFI\ubuntu\grubx64.efi' # the EFI system partition is »part1« in this layout
fi; done )

## backup the encrypted LUKS headers
( set -eux; for r in "${dataDisks[@]}"; do
    cryptsetup luksHeaderBackup "${r}-part4" --header-backup-file "luks-$(basename "$r")"-header.dat
done )
# now move those files somewhere else

Temporarily Remove the Cache Drive

## disable
zpool offline $rpool $rpool-zil
zpool offline $rpool $rpool-l2arc
swapoff -a
nano /etc/fstab # comment/remove swap entry
nano /etc/crypttab # comment/remove swap, zil, and l2arc entries

## check
cat /proc/swaps # should have no entries
zpool status $rpool # should show $rpool-zil and $rpool-l2arc as OFFLINE

## reboot
update-initramfs -u; update-grub; reboot # then check again

# NOTE: this may drop into initramfs recovery, because the ZIL isn't actually OFFLINE, but UNAVAIL
# fix: /sbin/zpool import $rpool -m; /sbin/zpool offline $rpool $rpool-zil; reboot

# to enable, reverse the disable steps and do the reboot step again