ZFS, zero drives to get better backup .IMG file compression... while running Linux

This also helps if you run VMs... a VM's disk image grows over time because of data left behind on now-unused sectors. Zeroing those sectors lets the image behave more like a 'sparse file', which reduces its size.

This also works for those worried about potentially sensitive data left behind on now-unused sectors, even if the drive is encrypted (encryption can be broken... best to zero those now-unused sectors).

Ok, so I've mentioned in another post using zpool initialize to zero unused sectors via:
gnome-terminal -- /bin/sh -c 'echo 0 | sudo tee /sys/module/zfs/parameters/zfs_initialize_value > /dev/null; sudo zpool initialize bpool d7335f16-9bd1-1c4d-88b9-e952441dd227; sudo zpool initialize rpool 965d0a40-cce9-664d-8f4a-04c8075238c4 b34bba5d-f7ed-4d3e-95b5-47fd750e05f6 1a7428f8-4950-c248-b947-d8b817a0cd5a b5fd0c2c-0f02-9942-8576-d7b0b851fef1; while sudo zpool status | grep "initializing" > /dev/null; do clear; sudo zpool status -Td; sleep 2; done; clear; sudo zpool status -Td; sleep 15'

... zpool initialize is still buggy, however. You have to include the PARTUUID of each drive in the pool that you want to initialize (otherwise it says something like "Error: Cannot initialize rpool. 'hole' device not available"), and it doesn't really zero every bit of previously-used but now-unused space. It keeps a tally of where it has already zeroed, and on later runs it only works on areas that have changed.
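On Linux, OpenZFS exposes the pattern that zpool initialize writes as the module parameter zfs_initialize_value under /sys/module/zfs/parameters (the default is a non-zero pattern, as far as I know). A minimal sketch for checking and zeroing it follows. The function names are made up for illustration, and the optional path argument is only there so the helpers can be pointed at a different file.

```shell
#!/bin/sh
# Sketch: read and set the fill pattern used by zpool initialize.
# Assumption: OpenZFS on Linux, with the parameter at this sysfs path.
PARAM=/sys/module/zfs/parameters/zfs_initialize_value

# Print the current fill pattern (a decimal number; 0 means all zeros).
show_init_value() {
    cat "${1:-$PARAM}"
}

# Set the fill pattern to 0. Writing the real sysfs file needs root.
set_init_value_zero() {
    echo 0 > "${1:-$PARAM}"
}
```

Run set_init_value_zero from a root shell, then show_init_value to confirm it reads 0. A value written this way does not survive a reboot.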

So... let's brute-force zero every single sector of a hard drive (except for the EFI and bpool partitions, but they're small, so they shouldn't affect the size of the resultant compressed .IMG file much) so we can get really good compression ratios when backing up to a compressed .IMG file... and let's do it while running Zorin OS (i.e., not booted from the Zorin OS boot USB stick).

To do this, you need a mirrored rpool... at least two drives running in parallel.

The conditions on my computer:


ZFS rpool is /dev/sda4 (PARTUUID: 965d0a40-cce9-664d-8f4a-04c8075238c4)

Mirror drive is /dev/sdb1 (PARTUUID: b34bba5d-f7ed-4d3e-95b5-47fd750e05f6)

Swap drive is /dev/sda2 (UUID: 46c1a133-bfdd-4695-a484-08fcf8286896)
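If you need to look up the same identifiers on your own machine, a sketch along these lines should do it. The function names are hypothetical; lsblk and blkid are the standard util-linux tools.

```shell
#!/bin/sh
# Sketch: list the identifiers used in the steps that follow.

# Show name, size, filesystem type, UUID and PARTUUID for every partition.
show_ids() {
    lsblk -o NAME,SIZE,FSTYPE,UUID,PARTUUID
}

# Print only the filesystem UUID of one partition, e.g. the swap partition.
uuid_of() {
    sudo blkid -s UUID -o value "$1"
}
```

For example, uuid_of /dev/sda2 prints the swap UUID, and the PARTUUID column of show_ids gives the values that zpool detach and zpool attach take in the steps here.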


Detach first drive from rpool: sudo zpool detach rpool 965d0a40-cce9-664d-8f4a-04c8075238c4

Ensure rpool is still good: sudo zpool status

Zero the drive: sudo dd if=/dev/zero of=/dev/sda4 bs=512 status=progress

Attach the zero'd drive to rpool: sudo zpool attach rpool b34bba5d-f7ed-4d3e-95b5-47fd750e05f6 965d0a40-cce9-664d-8f4a-04c8075238c4

Allow the automatic resilver to complete, monitor it with: sudo zpool status 5

Scrub rpool: sudo zpool scrub rpool
And monitor it with: sudo zpool status 5

Detach second drive from rpool: sudo zpool detach rpool b34bba5d-f7ed-4d3e-95b5-47fd750e05f6

Ensure rpool is still good: sudo zpool status

Zero the drive: sudo dd if=/dev/zero of=/dev/sdb1 bs=512 status=progress

Attach the zero'd drive to rpool: sudo zpool attach rpool 965d0a40-cce9-664d-8f4a-04c8075238c4 b34bba5d-f7ed-4d3e-95b5-47fd750e05f6

Allow the automatic resilver to complete, monitor it with: sudo zpool status 5

Scrub rpool: sudo zpool scrub rpool
And monitor it with: sudo zpool status 5
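The detach / zero / attach / resilver / scrub cycle above can be wrapped in one function so it's harder to mix up the two drives. This is only a sketch under a few assumptions: the names run and zero_leg are made up, it uses zpool wait (available in OpenZFS 2.0 and later, as far as I know) instead of watching zpool status by hand, and it defaults to a dry run that just prints the commands.

```shell
#!/bin/sh
# Sketch: zero one leg of a two-way mirror, then resilver and scrub.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# zero_leg POOL KEEP_ID ZERO_ID ZERO_DEV
#   KEEP_ID  : PARTUUID of the leg that stays in the pool
#   ZERO_ID  : PARTUUID of the leg to detach and zero
#   ZERO_DEV : device path of that same leg, e.g. /dev/sda4
zero_leg() {
    pool=$1
    keep=$2
    zero=$3
    dev=$4
    # Stop right away if the detach fails, so nothing in the pool gets zeroed.
    run sudo zpool detach "$pool" "$zero" || return 1
    run sudo zpool status "$pool" || return 1
    # dd ends with "No space left on device" once the partition is full;
    # that is the expected way for this step to finish.
    run sudo dd if=/dev/zero of="$dev" bs=512 status=progress || true
    run sudo zpool attach "$pool" "$keep" "$zero" || return 1
    # Block until the automatic resilver is done, then scrub and wait again.
    run sudo zpool wait -t resilver "$pool" || return 1
    run sudo zpool scrub "$pool" || return 1
    run sudo zpool wait -t scrub "$pool"
}
```

With the drives listed above, the first pass would be zero_leg rpool b34bba5d-f7ed-4d3e-95b5-47fd750e05f6 965d0a40-cce9-664d-8f4a-04c8075238c4 /dev/sda4, and the second pass swaps the two PARTUUIDs and uses /dev/sdb1. Check the printed commands first, then set DRY_RUN=0 to run them for real.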

Turn off the swap partition: sudo swapoff -v /dev/sda2

Zero swap partition: sudo dd if=/dev/zero of=/dev/sda2 bs=512 status=progress

Set up swap partition: sudo mkswap /dev/sda2 -U 46c1a133-bfdd-4695-a484-08fcf8286896 <== The original UUID of the swap partition

Turn swap back on: sudo swapon -a


Zeroing the swap partition wipes its swap header, including the UUID, and a plain mkswap would generate a new one. Setting the UUID as done above means you don't have to mess with your /etc/fstab file, if you've set up your swap partition to be mounted like:
/dev/disk/by-uuid/46c1a133-bfdd-4695-a484-08fcf8286896 none swap sw,noatime 0 0
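The swap steps can be sketched as one function too. The names run and rezero_swap are hypothetical, the UUID is passed in explicitly (record it with blkid before zeroing), and it defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: zero a swap partition and recreate it with its original UUID.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rezero_swap DEV UUID
#   DEV  : swap partition, e.g. /dev/sda2
#   UUID : the UUID the partition had before zeroing
rezero_swap() {
    dev=$1
    uuid=$2
    # Stop if swap can't be turned off; never zero a partition in use.
    run sudo swapoff -v "$dev" || return 1
    # dd ends with "No space left on device" once the partition is full.
    run sudo dd if=/dev/zero of="$dev" bs=512 status=progress || true
    run sudo mkswap -U "$uuid" "$dev" || return 1
    run sudo swapon -a
}
```

For the setup above that would be rezero_swap /dev/sda2 46c1a133-bfdd-4695-a484-08fcf8286896, with DRY_RUN=0 once the printed commands look right.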


Then boot into the Zorin OS USB and create a .IMG file backup of the internal drive, compressing it with 7z.

I did an experiment... I backed up my internal drive (1 TB) to a .img.7z file (using dd chained to 7z) before doing the above, then again after. The 'before' .img.7z file is 8.4 GB in size... the 'after' .img.7z file is 2.8 GB.
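The reason for the difference is easy to see on a small scale: stale data on unused sectors looks like arbitrary bytes to the compressor, while zeros compress to almost nothing. A quick sketch, with gzip standing in for 7z only because it's installed nearly everywhere (the function name is made up):

```shell
#!/bin/sh
# Sketch: compare how 1 MiB of zeros and 1 MiB of random bytes compress.

# Read 1 MiB from the given source and print its size in bytes after gzip.
compressed_size() {
    head -c 1048576 "$1" | gzip -c | wc -c | tr -d ' '
}

zeros=$(compressed_size /dev/zero)
random=$(compressed_size /dev/urandom)
echo "zeros:  $zeros bytes after gzip"
echo "random: $random bytes after gzip"
```

The random data barely shrinks, while the zeros come out a tiny fraction of their original size.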

ZFS is neat, and they call the bpool (boot pool) and rpool (root pool) 'pools' for a reason... because you can slosh data back and forth in that pool (to different drives) while up and running. As shown above, you can even zero the sectors of a partition while up and running.

The main advantages of a mirrored rpool are that you increase your read speed (to speed up synchronous writes, you'd add mirrored SLOG drives), you have data redundancy, and you can take one of the drives out of the pool without losing data. That means if a drive fails, you can detach it from the pool, shut down, put a new drive in, boot the machine, attach the new drive to the pool and be up and running again in short order. If you've got a sophisticated enough computer, you can even do that without having to shut down (hot-swap capability).


You can also do:
Detach first drive from rpool: sudo zpool detach rpool {Drive 1 PARTUUID}

Ensure rpool is still good: sudo zpool status

Attach first drive to rpool: sudo zpool attach rpool {Drive 2 PARTUUID} {Drive 1 PARTUUID}

Allow the automatic resilver to complete, monitor it with: sudo zpool status 5

Scrub rpool: sudo zpool scrub rpool
And monitor it with: sudo zpool status 5

Detach second drive from rpool: sudo zpool detach rpool {Drive 2 PARTUUID}

Ensure rpool is still good: sudo zpool status

Attach second drive to rpool: sudo zpool attach rpool {Drive 1 PARTUUID} {Drive 2 PARTUUID}

Allow the automatic resilver to complete, monitor it with: sudo zpool status 5

Scrub rpool: sudo zpool scrub rpool
And monitor it with: sudo zpool status 5

Doing the above apparently deletes the progress data that zpool initialize keeps, so it starts over from the beginning. If you don't delete that progress data, on subsequent runs zpool initialize believes it's already done and exits quickly, so sectors don't get zeroed.

Then run:
gnome-terminal -- /bin/sh -c 'echo 0 | sudo tee /sys/module/zfs/parameters/zfs_initialize_value > /dev/null; sudo zpool initialize bpool {bpool PARTUUID}; sudo zpool initialize rpool {rpool Drive 1 PARTUUID} {rpool Drive 2 PARTUUID} {rpool Drive 3 (SLOG) PARTUUID} {rpool Drive 4 (SLOG) PARTUUID}; while sudo zpool status | grep "initializing" > /dev/null; do clear; sudo zpool status -Td; sleep 2; done; clear; sudo zpool status -Td; sleep 15'
... to zero the rpool free space.
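Newer OpenZFS releases may make the detach/attach dance unnecessary. As far as I know, zpool initialize gained a -u (uninit) flag in OpenZFS 2.2 that clears the stored progress, and -w (wait until finished) has been there since 2.0. Treat both as assumptions and check man zpool-initialize on your system first. A dry-run sketch, with the made-up names run and reinit_pool:

```shell
#!/bin/sh
# Sketch: clear the initialize progress, then initialize again and wait.
# Assumption: your zpool supports "initialize -u" and "initialize -w".
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# reinit_pool POOL [DEVICE...]
#   DEVICE : PARTUUIDs of the pool members to initialize
reinit_pool() {
    pool=$1
    shift
    # Clear the stored initialize state so the next run starts from scratch.
    run sudo zpool initialize -u "$pool" "$@" || return 1
    # Start initializing and block until it has finished.
    run sudo zpool initialize -w "$pool" "$@"
}
```

For example, reinit_pool rpool {rpool Drive 1 PARTUUID} {rpool Drive 2 PARTUUID} prints the two commands. Set DRY_RUN=0 to run them once you've confirmed your zpool accepts both flags.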


Record the UUID (not PARTUUID) of the swap partition for later use:
sudo blkid

Turn off the swap partition: sudo swapoff -v /dev/sda2 <== The swap partition

Zero swap partition: sudo dd if=/dev/zero of=/dev/sda2 bs=512 status=progress

Set up swap partition: sudo mkswap /dev/sda2 -U {Swap Partition UUID} <== The original UUID of the swap partition

Turn swap back on: sudo swapon -a

Zeroing the swap partition wipes its swap header, including the UUID, and a plain mkswap would generate a new one. Setting the UUID as done above means you don't have to mess with your /etc/fstab file, if you've set up your swap partition to be mounted like:
/dev/disk/by-uuid/{Swap Partition UUID} none swap sw,noatime 0 0

Now boot into Zorin OS USB stick and run BackupToZip.sh.


I've created a script that does all of the above... if you want to use it, you'll have to edit it to reflect your drive UUIDs, PARTUUIDs and device paths.

It's here. I've set it up as a keyboard shortcut in Zorin menu > Settings > Keyboard Shortcuts.

[EDIT]
I've since added another mirror drive to the rpool... the script quickly gets complicated and lengthy (and takes a long time to run) with multiple drives. I've got a feature request in with OpenZFS to automatically erase the zpool initialize progress data once it's finished initializing, so it can be run multiple times. That has several advantages: it reduces the size of VMs, it erases potentially sensitive data sitting on now-unused sectors, and it makes your compressed backup .IMG files a lot smaller.

With all the stuff I strip out on a new install, and with zeroing the drives via zpool initialize, the backup .img file now compresses to a mere 2.2 GB for the 1 TB internal drive.