- create a new pool by first creating an empty partition space using diskutil or Disk Utility
- if needed, use cgdisk from homebrew to reset the zpool partition id to a504, then reboot
zpool create -f \
-o ashift=12 \
-o failmode=continue \
-O atime=off \
-O compression=lz4 \
-O checksum=sha256 \
-R /zroot \
zroot /dev/disk0s4
diskutil list disk0
NB: for FreeBSD there is no ashift support in zfs; it is handled at the lower GEOM layer, so check the device capabilities, then lock the lower bound before creating the zpool:
diskinfo -v /dev/da0
sysctl vfs.zfs.min_auto_ashift=12
echo "vfs.zfs.min_auto_ashift=12" >> /etc/sysctl.conf
zpool status -v
root@akai / # zpool status -v
pool: tub
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
tub ONLINE 0 0 0
disk0s2 ONLINE 0 0 0
errors: No known data errors
root@akai / # diskutil list
/dev/disk0
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *512.1 GB disk0
1: EFI 209.7 MB disk0s1
2: ZFS tub 511.8 GB disk0s2
/dev/disk1
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *121.3 GB disk1
1: EFI 209.7 MB disk1s1
2: Apple_HFS akai 120.5 GB disk1s2
3: Apple_Boot Recovery HD 650.0 MB disk1s3
/dev/disk2
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *3.0 TB disk2
1: EFI 209.7 MB disk2s1
2: Apple_CoreStorage 385.4 GB disk2s2
3: Apple_Boot Recovery HD 650.0 MB disk2s3
4: ZFS pond 2.6 TB disk2s4
/dev/disk3
#: TYPE NAME SIZE IDENTIFIER
0: Apple_HFS continuity *385.1 GB disk3
root@akai / # zpool attach tub disk0s2 disk2s4
root@akai / # zpool status -v
pool: tub
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scrub: resilver in progress, 5.80% done, 1h17m to go
config:
NAME STATE READ WRITE CKSUM
tub ONLINE 0 0 0
mirror ONLINE 0 0 0
disk0s2 ONLINE 0 0 0
disk2s4 ONLINE 0 0 0
errors: No known data errors
root@akai / # zpool status -v
pool: tub
state: ONLINE
scrub: resilver completed with 0 errors on Sun Jan 13 15:52:21 2013
config:
NAME STATE READ WRITE CKSUM
tub ONLINE 0 0 0
mirror ONLINE 0 0 0
disk0s2 ONLINE 0 0 0
disk2s4 ONLINE 0 0 0
errors: No known data errors
root@akai / # sync
root@akai / # zpool offline tub disk2s4
root@akai / # zpool status -v
pool: tub
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scrub: resilver completed with 0 errors on Sun Jan 13 15:52:21 2013
config:
NAME STATE READ WRITE CKSUM
tub DEGRADED 0 0 0
mirror DEGRADED 0 0 0
disk0s2 ONLINE 0 0 0
disk2s4 OFFLINE 0 0 0
errors: No known data errors
root@akai / # zpool detach tub disk2s4
root@akai / # zpool status -v
pool: tub
state: ONLINE
scrub: resilver completed with 0 errors on Sun Jan 13 15:52:21 2013
config:
NAME STATE READ WRITE CKSUM
tub ONLINE 0 0 0
disk0s2 ONLINE 0 0 0
errors: No known data errors
root@akai / # zpool scrub tub
ZFS works successfully with >= 32 GiB SDXC cards in a Feb 2011 MacBook Pro, and likely similar models.
- use Finder to eject disks
- if required, use zfs unmount -f <pool> and zpool export <pool> to force it
- if the physical ReadOnly switch is enabled on the media, zfs will fail to import it, reporting insufficient replicas as the error (see the sketch after the example output):
$ zpool import
pool: builds
id: 11121869171413038388
state: FAULTED
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://www.sun.com/msg/ZFS-8000-3C
config:
builds UNAVAIL insufficient replicas
disk2s2 UNAVAIL cannot open
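Flipping the write-protect switch back is the real fix, but if you need the data while the card is still locked, a read-only import is worth a try; a sketch I haven't verified against locked media, using the pool name from the output above:
zpool import -o readonly=on builds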
I have had 3 kernel panics during busy testing, none during data writing but all later after ejection in Finder without a subsequent pool export.
The oracle suggests an alternate approach for removable media usage.
If you've used zfs elsewhere, or are referring to the manpages, a few minor things are missing:
- zfs sharing and exporting don't work (iscsi, smb, nfs, afp via apple sharing)
- set up the receiving (listening) end first
mbuffer -I 192.168.1.1:10000 -q -s128k -m1G -P10 | zfs recv storage/foo
- now set up the sending end:
zfs send foo@052209 | mbuffer -q -s128k -m1G -O 192.168.1.2:10000
# duplicate everything for recovery purposes
zfs send -RLve zsource@snapshot | zfs recv -Fduv zroot
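For later runs, an incremental send transfers only the delta between two snapshots; a sketch with hypothetical snapshot names:
# send only the changes between @monday and @tuesday (hypothetical snapshot names)
zfs send -RLve -i @monday zsource@tuesday | zfs recv -Fduv zroot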
ZFS has an internal namespace (hierarchy) for filesystems, using a simple / delimiter within a filesystem name. Properties such as compression, mountpoints, and many other settings can be inherited through this namespace, or set and reset recursively. Other useful actions such as recursive snapshots are possible. Aligning this hierarchy to roughly the same layout as your directory tree will likely keep you sane & reduce your frustration.
- Reset the mountpoints under pool "tub", filesystem "shared" to inherit from the root:
zfs inherit -r mountpoint tub/shared
- Take snapshots of all subsidiary filesystems in pool "tub", giving each the same snapshot name:
zfs snapshot -r tub@20120910
- A recursive, forced rollback to a snapshot will destroy all intermediate snapshots and clones:
sudo zfs rollback -rf <snapshot>
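A quick sketch of inheritance in action, assuming tub/shared already exists as above (the photos child dataset is hypothetical): a property set on the parent flows down to every child that doesn't override it.
zfs create tub/shared/photos
zfs set compression=lz4 tub/shared
zfs get -r compression tub/shared   # children report SOURCE "inherited from tub/shared"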
This zfs cheatsheet is worth printing out.
The baseline:
zfs create -o normalization=formD -o atime=off <name>
I set these on the base zfs dataset (applied in the sketch below):
- compression=lz4 because it's pretty fast even if you don't need it
- checksum=sha256 because it helps if you decide to use dedupe later
- atime=off because it saves writes and is more performant
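A sketch of applying those to an existing base dataset, here the pool root of tub (older builds of zfs set take one property per call):
zfs set compression=lz4 tub
zfs set checksum=sha256 tub
zfs set atime=off tub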
Finder and friends like spotlight want to abuse your ZFS filesystems. In particular:
- use mdutil -i off <mountpoint> to stop Finder and Spotlight trying to index ZFS. It won't work.
- stop metadata being created using cd <mountpoint> ; mkdir .fseventsd && touch .fseventsd/no_log on the root mountpoint.
- add FILESYSTEMS="hfs ufs zfs" to the end of /etc/locate.rc to allow locate to index zfs filesystems (then rebuild the locate database, see below).
mdutil -i off /zfs
cd /zfs
mkdir .fseventsd && touch .fseventsd/no_log
touch .Trashes .metadata_never_index .apdisk
Use locate instead for non-realtime searching of your z filesystems.
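After editing /etc/locate.rc, the locate database needs a rebuild before your zfs paths show up; on OS X the updater lives in /usr/libexec:
sudo /usr/libexec/locate.updatedb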
Dropbox is a necessary evil these days. In fact, I started using zfs after two issues: firstly a nasty case of bitrot on photos, and later on, dropbox shitting on all my work and trashing 1000s of files while I was travelling. zfs rollback FTW.
sudo zfs create \
-o normalization=formD \
-o casesensitivity=insensitive \
-o com.apple.mimic_hfs=on \
zroot/users/dch/Dropbox
And follow the metadata tweaks above. Generally, it kinda works ok but sometimes gets stuck.
Uses zsh functions but should be easy to re-write for any other shell:
zdisk() {
  zpool create -O compression=lz4 -fR /zram zram \
    `hdiutil attach -nomount ram://20971520`
  sudo chown -R $USER /zram
  cd /zram
}
zdisk-destroy() {
  zpool export -f zram
}
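Usage is just the two functions. ram://20971520 is 20971520 512-byte sectors, i.e. a 10 GiB RAM disk, so adjust to taste:
zdisk            # creates and mounts a throwaway pool at /zram
# ... scratch work ...
zdisk-destroy    # exports the pool; hdiutil detach frees the RAM device if you want the memory back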
If you get a message about filesystem busy or dataset busy or similar, this script will find and release any remaining zfs holds:
#!/usr/bin/env perl
use Modern::Perl;
foreach my $fs (@ARGV) {
my ($refcount, @refs, %holds);
# get a hash of all subsidiary datasets with references
say "Looking up userrefs for $fs...";
@refs = qx(zfs get -H -r userrefs $fs)
or die "$~\n$^E\n";
foreach my $line (@refs) {
next unless $line =~ m/
(\S+)
\s+
userrefs
\s+
(\d+)
/igx;
unless ($2) {
say "clean: $1";
next;
}
else {
say "dirty: $1";
$holds{$1} = $2;
}
}
say "DONE\n";
say "Releasing holds...";
foreach my $dataset (keys %holds) {
# prune holds recursively
my @tags = qx(zfs holds -H -r $dataset)
or die "$!\n$^E\n";
foreach my $line (@tags) {
next unless $line =~ m/
(\S+) # dataset - could be different for a subsidiary
\s+
(\.send-\S+) # only prune tags left by `zfs send`
/igx;
my ($snapshot, $tag) = ($1, $2);
print "Releasing $snapshot from $tag...";
qx(zfs release -r $tag $snapshot)
or die "$!\n$^E\n";
say "ok"
}
}
say "DONE\n";
}
exit 0;
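A hypothetical invocation, assuming the script is saved as zfs-release-holds.pl, made executable, and Modern::Perl is installed; it takes one or more dataset names as arguments:
chmod +x zfs-release-holds.pl
sudo ./zfs-release-holds.pl tub/shared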
$ zpool status
pool: tank
state: ONLINE
scan: scrub repaired 0 in 15h54m with 0 errors on Sun May 25 15:47:45 2014
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
disk0s4 ONLINE 0 0 0
errors: No known data errors
pool: tub
state: ONLINE
scan: scrub repaired 0 in 0h36m with 0 errors on Wed May 28 22:05:40 2014
config:
NAME STATE READ WRITE CKSUM
tub ONLINE 0 0 0
disk4s2 ONLINE 0 0 0
errors: No known data errors
mkfile -n 480g mirror
zpool import    # check the import list
zpool import -f -R /mirror -N tub
zpool attach tub <existing-device> `pwd -P`/mirror
$ zpool status
pool: tank
state: ONLINE
scan: scrub repaired 0 in 15h54m with 0 errors on Sun May 25 15:47:45 2014
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
disk0s4 ONLINE 0 0 0
errors: No known data errors
pool: tub
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Jun 2 13:09:33 2014
453M scanned out of 349G at 22.7M/s, 4h22m to go
450M resilvered, 0.13% done
config:
NAME STATE READ WRITE CKSUM
tub ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
disk4s2 ONLINE 0 0 0
/zfs/shared/backups/akai/tub ONLINE 0 0 0 (resilvering)
errors: No known data errors
# zpool import -d /Volumes/sprawl/
pool: zpool
id: 411387556460089843
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
zpool ONLINE
/Volumes/sprawl/zpool ONLINE
# zpool import -R /zpool -N -d /Volumes/sprawl zpool
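Since -N imports the pool without mounting anything, mount the datasets afterwards; they land under the /zpool altroot given with -R:
zfs mount -a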
You're doing this because for some reason your hoster's rescue disk doesn't yet support all ZFS features, you foolishly ran zpool upgrade <pool>, and you forgot something in your rc.conf script that now prevents the system from completing boot to ssh.
- use rescue_wintermute and get back your mfsbsd shell
alias l='/bin/ls -aFGhl'
mkdir -m 0700 /root/.ssh
fetch -o /root/.ssh/authorized_keys http://people.apache.org/~dch/authorized_keys
chmod 0400 /root/.ssh/authorized_keys
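From the mfsbsd shell, the actual repair is roughly: import the root pool under an altroot, fix rc.conf, export, reboot. A sketch, assuming the pool is named zroot as earlier (if the root dataset is canmount=noauto you may need to zfs mount it by hand first):
zpool import -f -R /mnt zroot
vi /mnt/etc/rc.conf    # remove or fix the offending line
zpool export zroot
reboot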