BUG REPORT: Today's updates borked ZFS import of bpool

From /var/log/dpkg.log, the updates I received today:

libnvpair1linux:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
libuutil1linux:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
libzfs2linux:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
libzpool2linux:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
python3-pyzfs:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
zfs-initramfs:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
zfs-test:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
zfsutils-linux:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
zfs-zed:amd64 0.8.3-1ubuntu12.14 0.8.3-1ubuntu12.15
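
For anyone who wants to pull the same list out of their own log, here is a minimal sketch. It assumes the standard dpkg.log line layout (date, time, action, package, old version, new version); the function name list_upgrades_to is just something made up for illustration.

```shell
# list_upgrades_to VERSION [LOGFILE]   (hypothetical helper name)
# Print "package old-version new-version" for every upgrade to VERSION.
list_upgrades_to() {
    local version="$1"
    local log="${2:-/var/log/dpkg.log}"
    # Assumed dpkg.log upgrade line layout:
    #   DATE TIME upgrade PACKAGE:ARCH OLD_VERSION NEW_VERSION
    awk -v ver="$version" \
        '$3 == "upgrade" && $6 == ver { print $4, $5, $6 }' "$log"
}
```

Usage would be something like: list_upgrades_to 0.8.3-1ubuntu12.15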

As I always do, I rebooted after the updates were done. The machine booted just fine.

As I always do after updates, I then ran through a list of commands in a terminal to be sure the machine was operating normally.

The problem:
bpool isn't being imported at boot. The machine boots normally and I can import bpool manually afterwards, with no error messages from ZFS. rpool was imported, though I'm not sure how.

sudo systemctl status zfs-import-scan.service

● zfs-import-scan.service - Import ZFS pools by device scanning
     Loaded: loaded (/lib/systemd/system/zfs-import-scan.service; enabled; vendor preset: disabled)
     Active: inactive (dead)
  Condition: start condition failed at Tue 2023-04-18 23:31:35 CDT; 32min ago
             └─ ConditionPathExists=!/etc/zfs/zpool.cache was not met
       Docs: man:zpool(8)

Apr 18 23:31:35 HP-Laptop systemd[1]: Condition check resulted in Import ZFS pools by device scanning being skipped.

I don't import the ZFS pools via the usual cachefile (/etc/zfs/zpool.cache) method, as that's no longer recommended; I import them by scanning for the pools at boot. Apparently the updates switched it back to the cachefile method, but that cachefile is empty (at least, it should have been... perhaps the update changed that?).
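
One thing worth checking here: as far as I understand systemd, ConditionPathExists= only tests whether the path exists, and a leading ! negates the test. If that reading is right, even an empty zpool.cache would be enough to get the scan unit skipped. Below is a rough sketch that mimics that check by hand; the function name scan_import_condition is made up, and the default path is simply the one from the status output above.

```shell
# scan_import_condition [CACHEFILE]   (hypothetical helper name)
# Mimics ConditionPathExists=!CACHEFILE: the condition is met only if
# the path does NOT exist. An empty file still counts as existing.
scan_import_condition() {
    local cachefile="${1:-/etc/zfs/zpool.cache}"
    if [ ! -e "$cachefile" ]; then
        echo "met: $cachefile does not exist"
        return 0
    fi
    if [ -s "$cachefile" ]; then
        echo "not met: $cachefile exists"
    else
        echo "not met: $cachefile exists but is empty"
    fi
    return 1
}
```

Running scan_import_condition with no arguments checks /etc/zfs/zpool.cache; a "not met" answer would match the skipped unit shown above.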

journalctl -b shows:

Apr 18 23:31:35 HP-Laptop systemd[1]: Condition check resulted in Import ZFS pools by device scanning being skipped.
Apr 18 23:31:35 HP-Laptop systemd[1]: Reached target ZFS pool import target.
Apr 18 23:31:35 HP-Laptop systemd[1]: Mounting /boot...
Apr 18 23:31:35 HP-Laptop systemd[1]: Starting Install ZFS kernel module...
Apr 18 23:31:35 HP-Laptop mount[1935]: filesystem 'bpool/BOOT/ubuntu_9th59n' cannot be mounted, unable to open the dataset
Apr 18 23:31:35 HP-Laptop systemd[1]: Starting Wait for ZFS Volume (zvol) links in /dev...
Apr 18 23:31:35 HP-Laptop systemd[1]: boot.mount: Mount process exited, code=exited, status=1/FAILURE
Apr 18 23:31:35 HP-Laptop systemd[1]: boot.mount: Failed with result 'exit-code'.
Apr 18 23:31:35 HP-Laptop systemd[1]: Failed to mount /boot.
Apr 18 23:31:35 HP-Laptop systemd[1]: Dependency failed for Unattended Upgrades Shutdown.
Apr 18 23:31:35 HP-Laptop systemd[1]: unattended-upgrades.service: Job unattended-upgrades.service/start failed with result 'dependency'.
Apr 18 23:31:35 HP-Laptop systemd[1]: Dependency failed for /boot/efi.
Apr 18 23:31:35 HP-Laptop systemd[1]: Dependency failed for /boot/grub.
Apr 18 23:31:35 HP-Laptop systemd[1]: Dependency failed for Local File Systems.
Apr 18 23:31:35 HP-Laptop systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
Apr 18 23:31:35 HP-Laptop systemd[1]: boot-grub.mount: Job boot-grub.mount/start failed with result 'dependency'.
Apr 18 23:31:35 HP-Laptop systemd[1]: boot-efi.mount: Job boot-efi.mount/start failed with result 'dependency'.
Apr 18 23:31:35 HP-Laptop systemd[1]: Finished Install ZFS kernel module.
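
To boil a log like that down to just the units that failed and why, something along these lines might help. It is only a sketch that pattern-matches the two message shapes seen above ("<unit>: Failed with result '...'" and "<unit>: Job ... failed with result '...'"); the function name failed_units is made up, and other systemd message formats are not handled.

```shell
# failed_units   (hypothetical helper name)
# Read journal text on stdin and print "unit result" once for each
# systemd line of the form
#   <unit>: Failed with result '<result>'.
#   <unit>: Job <job> failed with result '<result>'.
failed_units() {
    sed -n -E \
        "s/.*systemd\[[0-9]+\]: ([^ :]+): (Job .* failed|Failed) with result '([a-z-]+)'.*/\1 \3/p" |
        awk '!seen[$0]++'   # drop repeats, keep first-seen order
}
```

Usage: journalctl -b | failed_units. On the log above it should reduce to boot.mount with exit-code, followed by four units that failed with dependency, which points at the /boot mount as the root failure.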

I've issued:
sudo zpool import bpool
sudo zpool set cachefile=none bpool
sudo zpool set cachefile=none rpool

I'm going to reboot to see if that resolves the issue. I'll report back.

[EDIT 1]
Nope, that didn't fix things. Doing a ZFS scrub shows there are no errors on the drives, but the errors above are worrying. Still investigating...

[EDIT 2]
Ok, I'm going to try adding this to /etc/modprobe.d/zfs.conf:

# Enforce synchronous scsi scan, to prevent zfs driver loading before disks are available
options scsi_mod scan=sync

[EDIT 3]
Nope, that didn't work. Reverting the change. I'm now trying:
/etc/default/zfs
Changed:
#ZFS_POOL_IMPORT="pool1;pool2"
... to:
ZFS_POOL_IMPORT="bpool;rpool"
Changed:
#ZPOOL_IMPORT_ALL_VISIBLE='no'
... to:
ZPOOL_IMPORT_ALL_VISIBLE='yes'

[EDIT 4]
Nope, didn't work. I'm keeping the last edits above and further trying:
Change:
#ZPOOL_IMPORT_OPTS="-c /usr/local/etc/zfs/zpool.cache"
#ZPOOL_CACHE=""
To:
ZPOOL_IMPORT_OPTS="-c /etc/zfs/zpool.cache"
ZPOOL_CACHE=""

I further renamed /etc/zfs/zpool.cache to zpool.cacheOLD, then issued:
sudo zpool set cachefile="/etc/zfs/zpool.cache" rpool
sudo zpool set cachefile="/etc/zfs/zpool.cache" bpool
... just to be sure zpool.cache had the correct content.

[EDIT 5]
Nope, didn't work. Reverting the changes.
Changed:
#ZFS_INITRD_ADDITIONAL_DATASETS="rpool/ROOT/usr rpool/ROOT/var"
... to:
ZFS_INITRD_ADDITIONAL_DATASETS="bpool/BOOT/ubuntu_9th59n"
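
After this many rounds of changing and reverting, it's easy to lose track of which settings are actually live. A small sketch for listing only the uncommented assignments in the file; the function name active_settings is made up, and the default path is the /etc/default/zfs file edited above.

```shell
# active_settings [FILE]   (hypothetical helper name)
# Print the uncommented VAR=value lines, i.e. the settings in effect.
active_settings() {
    local file="${1:-/etc/default/zfs}"
    grep -E '^[[:space:]]*[A-Za-z_][A-Za-z0-9_]*=' "$file"
}
```

Running active_settings before and after each attempt gives a quick record of what was really changed.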

[EDIT 6]
Nope, still didn't work... the latest update really broke ZFS importing. I'm rolling back to before the updates (if I can, given that ZFS itself is broken).

Ok, I'm back up and running. I keep images of each partition, so I rolled back to those images. ZFS was so broken that rolling back to an earlier snapshot didn't work.

Restoring from the partition images takes a while: I've got 3 mirrored drives, each with multiple partitions, and because the images are compressed, I have to uncompress them and then 'burn' them to each partition.

I'll wait a bit before doing any updates; hopefully the problem with ZFS will be fixed by then.

[EDIT 1]
As a comparison to the zfs-import-scan.service error above, here's what it looks like when it's running correctly...

sudo systemctl status zfs-import-scan.service

● zfs-import-scan.service - Import ZFS pools by device scanning
     Loaded: loaded (/lib/systemd/system/zfs-import-scan.service; enabled; vendor preset: disabled)
     Active: active (exited) since Fri 2023-04-21 14:25:52 CDT; 2h 24min ago
       Docs: man:zpool(8)
    Process: 1756 ExecStart=/sbin/zpool import -aN -o cachefile=none (code=exited, status=0/SUCCESS)
   Main PID: 1756 (code=exited, status=0/SUCCESS)

Apr 21 14:25:51 HP-Laptop systemd[1]: Starting Import ZFS pools by device scanning...
Apr 21 14:25:52 HP-Laptop systemd[1]: Finished Import ZFS pools by device scanning.

So I think they changed zfs-import-scan.service so that one of its start conditions is that the /etc/zfs/zpool.cache file doesn't exist? Not sure. If that's the case, then the update should have detected that ZFS import was being done by boot-time scanning (rather than from the zpool.cache file) and deleted or renamed the zpool.cache file.
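
One way to test that guess would be to compare the unit file's conditions before and after the update. A minimal sketch, assuming the unit lives at the path shown in the status output; the function name unit_conditions is made up.

```shell
# unit_conditions [UNIT_FILE]   (hypothetical helper name)
# Print the Condition*/Assert* lines of a systemd unit file.
unit_conditions() {
    local unit_file="${1:-/lib/systemd/system/zfs-import-scan.service}"
    grep -E '^(Condition|Assert)[A-Za-z]+=' "$unit_file"
}
```

Saving the output of unit_conditions on the rolled-back system and again after installing the updates, then diffing the two, would show whether the condition is new or was there all along. systemctl cat zfs-import-scan.service shows the same file plus any drop-in overrides.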

[EDIT]
Having inspected the files in the updates, they only fix a problem where ZFS doesn't change the ARC size if it is set below 1/32nd of total memory, so they should have worked on my system, unless there are other changes not listed in the changelog.

I've tried a couple of times to install those nine ZFS updates; each time ZFS breaks and I have to roll back.
