Trying out zswap

Now that I've got 64 GB of RAM to play around with, I'm trying out zswap.

I'm running swappiness at 200 (the maximum):
sudoedit /etc/sysctl.conf

# Swappiness
vm.swappiness=200

... and I'm running the nohang low-memory handler.

sudo add-apt-repository ppa:oibaf/test
sudo apt update
sudo apt install nohang
sudo systemctl enable --now nohang-desktop.service

I've set the priority of each swap drive to 1 (pri=1) in /etc/fstab:
# <file system> <mount point> <type> <options> <dump> <pass>

UUID=21ebe95a-cdd6-40d1-b5a2-ff44a768b47d	none	swap	discard,noatime,pri=1	0	0

UUID=c39fbec2-4aa6-4255-b6c3-e9540b397713	none	swap	discard,noatime,pri=1	0	0

UUID=d4398e10-8183-4a5a-88a6-9d830a6f2a6d	none	swap	discard,noatime,pri=1	0	0

... so the drives are swapped to in round-robin fashion, to speed up the swapping.
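To confirm the round-robin setup took effect, it helps to check that every active swap device reports the same priority. Here's a minimal sketch, assuming the usual /proc/swaps layout with the priority in the fifth column. The function name swap_priorities is made up for illustration.

```shell
#!/bin/sh
# Count active swap devices per priority.
# Assumes the usual /proc/swaps columns: Filename Type Size Used Priority.
swap_priorities() {
    awk 'NR > 1 { n[$5]++ }
         END { for (p in n) printf "priority %s: %d device(s)\n", p, n[p] }' \
        "${1:-/proc/swaps}"
}

# Only query the live system if /proc/swaps is readable.
if [ -r /proc/swaps ]; then
    swap_priorities
fi
```

With the fstab entries above, all three drives should be counted under priority 1. swapon --show gives the same information in table form.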

zswap intercepts pages that are about to be swapped out to the swap drive(s) and compresses them into a pool in RAM instead. Then, when that in-memory compressed pool fills to the user-configurable percentage of total memory (max_pool_percent), zswap starts writing pages from the pool out to the swap drive(s).

Set up zswap:

sudo su
echo z3fold > /sys/module/zswap/parameters/zpool
echo 50 > /sys/module/zswap/parameters/max_pool_percent
echo lz4 > /sys/module/zswap/parameters/compressor
echo Y > /sys/module/zswap/parameters/enabled
echo z3fold >> /etc/initramfs-tools/modules
echo lz4 >> /etc/initramfs-tools/modules
update-initramfs -u
exit
exit
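Before rebooting, it's worth reading the parameters back to make sure the kernel accepted them. A small sketch that prints each one as name=value follows. The zswap_params name and the optional directory argument are my own additions for illustration; the default path is the one used in the commands above.

```shell
#!/bin/sh
# Print each zswap parameter as name=value.
# The optional directory argument just lets the function be pointed elsewhere.
zswap_params() {
    dir="${1:-/sys/module/zswap/parameters}"
    for f in "$dir"/*; do
        if [ -f "$f" ] && [ -r "$f" ]; then
            printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
        fi
    done
}

# Only query the live system if the zswap module directory exists.
if [ -d /sys/module/zswap/parameters ]; then
    zswap_params
fi
```

If a value didn't stick (for example, the compressor module wasn't available), it should show up here as whatever the kernel fell back to.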

sudoedit /etc/default/grub
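Add the zswap parameters (zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=50 zswap.zpool=z3fold) to the end of GRUB_CMDLINE_LINUX_DEFAULT: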

GRUB_CMDLINE_LINUX_DEFAULT="noplymouth threadirqs preempt=full tsc=reliable numa=on nohz=1-11 rcu_nocbs=1-11 zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=50 zswap.zpool=z3fold"

sudo update-grub

Then reboot.
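After the reboot, one quick check is whether the zswap options actually made it onto the kernel command line. A minimal sketch, assuming /proc/cmdline is space-separated as usual; the zswap_cmdline helper name is hypothetical:

```shell
#!/bin/sh
# List the zswap.* options found on a kernel command line, one per line.
# Reads /proc/cmdline by default; a file path can be passed instead.
zswap_cmdline() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep '^zswap\.'
}

if [ -r /proc/cmdline ]; then
    zswap_cmdline || echo "no zswap options on the kernel command line"
fi
```

If nothing is listed, update-grub probably didn't pick up the edited /etc/default/grub.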

If zswap uses 50% of memory (in this case, 32 GB) at 2:1 compression, that gives (32*2)+6 = 70 GB of swap space + 32 GB of uncompressed RAM = 102 GB of total space to work with, 32 GB more than the machine has actually got (64 GB RAM + 6 GB swap = 70 GB).
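The same estimate can be sketched as a small function, so it's easy to redo with other pool sizes or compression ratios. This is a best-case figure: it assumes the pool fills completely and every page compresses at the given ratio. The effective_capacity name and argument order are just for illustration.

```shell
#!/bin/sh
# effective_capacity RAM_GB SWAP_GB POOL_PERCENT RATIO
# Rough estimate: compressed pool contents + uncompressed RAM + disk swap.
effective_capacity() {
    awk -v ram="$1" -v swap="$2" -v pct="$3" -v ratio="$4" 'BEGIN {
        pool = ram * pct / 100          # RAM given over to the zswap pool
        printf "%.2f\n", pool * ratio + (ram - pool) + swap
    }'
}

effective_capacity 64 6 50 2    # the 2:1 estimate above: prints 102.00
```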

One would really have to seriously over-amp their memory usage to reach an out-of-memory condition... in which case nohang intervenes.

After I reboot, I'll run sudo nohang -m (the memory consumption test in nohang) to see how the system responds, and report back.

One can see whether zswap is enabled by:
dmesg | grep zswap

You should see something like:
[ 0.911797] zswap: loaded using pool lz4/z3fold

One can see the compression ratio after everything is set up and running by:
sudo bash -c 'echo "scale=2; " $(</sys/kernel/debug/zswap/stored_pages) " * 4096 /" $(</sys/kernel/debug/zswap/pool_total_size) | bc'
... although you'll get a 'divide by zero' runtime error if the zswap pool is still empty (pool_total_size is 0).
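A guarded variant of that calculation avoids the error by printing "n/a" while the pool is empty. It's a sketch under the same assumptions as the one-liner (4096-byte pages, the debugfs files named above) and it needs to run as root to read debugfs. The zswap_ratio name and its optional arguments are mine.

```shell
#!/bin/sh
# zswap_ratio [DIR] [PAGE_SIZE]
# Same calculation as the one-liner, but prints "n/a" for an empty pool.
# Note: awk rounds to two decimals, whereas bc with scale=2 truncates.
zswap_ratio() {
    dir="${1:-/sys/kernel/debug/zswap}"
    page_size="${2:-4096}"
    stored=$(cat "$dir/stored_pages") || return 1
    pool=$(cat "$dir/pool_total_size") || return 1
    if [ "$pool" -eq 0 ]; then
        echo "n/a"
    else
        awk -v s="$stored" -v p="$pool" -v ps="$page_size" \
            'BEGIN { printf "%.2f\n", s * ps / p }'
    fi
}

# Only query the live system if the debugfs counters are readable (root).
if [ -r /sys/kernel/debug/zswap/stored_pages ]; then
    zswap_ratio
fi
```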

One can see how it's working by:
sudo grep -R . /sys/kernel/debug/zswap/

[EDIT]
Huh... that didn't work at all... zswap didn't even compress any of the pages, nor did it swap to disk.

I even tried tail /dev/zero to consume all the memory... no compression of the pages, no swapping to disk.

Ok, I'm reverting the changes.

[EDIT 2]
Heh... I figured out why it's never swapping... it's because I'm a potato. :potato:

I'd set vm.overcommit_ratio=85 and vm.overcommit_memory=2 in /etc/sysctl.conf, so it could only take 85% of the total of memory and swap space... (64+6)*0.85 = 59.5 GB, which leaves 10.5 GB unused. That's larger than the 6 GB of swap space, so the swap space is never touched.

I'm going to change that to vm.overcommit_ratio=99 and vm.overcommit_memory=2... that'll leave 700 MB of 'memory space' free (i.e., it'll fill all of memory and all but 700 MB of swap space).
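For a closer figure: as I read the kernel's overcommit documentation, with vm.overcommit_memory=2 the limit is swap plus overcommit_ratio percent of RAM, not a percentage of RAM and swap combined. So the real numbers come out a little different from the estimates above. A sketch of that formula follows (huge pages ignored; the commit_limit helper is hypothetical), along with the two /proc/meminfo fields that show what the running kernel enforces:

```shell
#!/bin/sh
# commit_limit RAM_GB SWAP_GB OVERCOMMIT_RATIO
# With overcommit_memory=2: limit = swap + RAM * overcommit_ratio / 100.
# Huge pages are ignored here for simplicity.
commit_limit() {
    awk -v ram="$1" -v swap="$2" -v ratio="$3" \
        'BEGIN { printf "%.2f\n", swap + ram * ratio / 100 }'
}

commit_limit 64 6 85    # prints 60.40
commit_limit 64 6 99    # prints 69.36

# What the running kernel actually enforces, if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo
fi
```

Committed_AS climbing toward CommitLimit is the thing to watch while running a memory hog.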

I did that on the recommendation of a Linux guru, so that in an OOM condition, the machine's still got a bit of memory to allow me the responsiveness to shut down a memory-gobbling application... and it apparently works. :crazy_face:

I'll update after I get everything done again.

[EDIT 3]
Before I re-enable zswap, I'm going to make sure I can actually swap data out to the swap drives.

sudoedit /etc/sysctl.conf

# Swappiness
vm.swappiness=200

# VM Settings
vm.compact_memory=1
vm.compaction_proactiveness=100
vm.overcommit_memory=2
vm.overcommit_ratio=99
vm.page-cluster=4
vm.zone_reclaim_mode=4
vm.watermark_scale_factor=125
vm.watermark_boost_factor = 15000
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.dirty_expire_centisecs = 1000
vm.dirty_writeback_centisecs = 250
vm.dirtytime_expire_seconds = 300

All of these settings are located at

/proc/sys/vm/

... you can either write the values directly as root (for example, from sudo su: echo 125 > /proc/sys/vm/watermark_scale_factor), which only lasts until the next reboot, or put the settings in /etc/sysctl.conf so they're applied at every boot.
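Since the same setting can live in two places, a quick way to spot drift is to compare each key in the config file against the live value under /proc/sys/. A minimal sketch follows, assuming simple single-value key=value lines like the ones above. The check_sysctl name is made up, and write-only entries such as compact_memory may show up as unreadable.

```shell
#!/bin/sh
# check_sysctl CONF [PROC_ROOT]
# Compare key=value lines in CONF against the live files under PROC_ROOT.
# Whitespace is stripped, so this only suits single-value settings.
check_sysctl() {
    conf="$1"
    root="${2:-/proc/sys}"
    while IFS= read -r line || [ -n "$line" ]; do
        line="${line%%#*}"                       # strip comments
        case "$line" in
            *=*) ;;
            *) continue ;;                       # skip blanks and non-settings
        esac
        key=$(printf '%s' "${line%%=*}" | tr -d '[:space:]')
        want=$(printf '%s' "${line#*=}" | tr -d '[:space:]')
        path="$root/$(printf '%s' "$key" | tr . /)"   # vm.swappiness -> vm/swappiness
        if have=$(cat "$path" 2>/dev/null); then
            have=$(printf '%s' "$have" | tr -d '[:space:]')
            if [ "$have" = "$want" ]; then
                echo "ok $key=$have"
            else
                echo "MISMATCH $key: conf=$want live=$have"
            fi
        else
            echo "unreadable $key"
        fi
    done < "$conf"
}

if [ -r /etc/sysctl.conf ]; then
    check_sysctl /etc/sysctl.conf
fi
```

Running sudo sysctl -p afterwards loads /etc/sysctl.conf into the running kernel, which should clear any mismatches.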


Ok, I got it to use the swap drives... I had to uninstall nohang to do it... it would shut down any memory gobbler at ~1.2 GB of space left.

Now I can max out both memory and swap space to 99%, so now I can try out zswap. Then I'll reinstall and configure nohang.

[EDIT]
Ok, I got it working:

sudo grep -R . /sys/kernel/debug/zswap/

/sys/kernel/debug/zswap/same_filled_pages:7889
/sys/kernel/debug/zswap/stored_pages:171393
/sys/kernel/debug/zswap/pool_total_size:243048448
/sys/kernel/debug/zswap/duplicate_entry:0
/sys/kernel/debug/zswap/written_back_pages:0
/sys/kernel/debug/zswap/reject_compress_poor:35
/sys/kernel/debug/zswap/reject_kmemcache_fail:0
/sys/kernel/debug/zswap/reject_alloc_fail:0
/sys/kernel/debug/zswap/reject_reclaim_fail:0
/sys/kernel/debug/zswap/pool_limit_hit:0

sudo bash -c 'echo "scale=2; " $(</sys/kernel/debug/zswap/stored_pages) " * 4096 /" $(</sys/kernel/debug/zswap/pool_total_size) | bc'
2.84
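The raw counters are easier to read in MiB. Here's a small sketch that converts them and shows how much memory the compression is saving, again assuming 4096-byte pages and root access to debugfs. The zswap_summary name is hypothetical.

```shell
#!/bin/sh
# zswap_summary [DIR] [PAGE_SIZE]
# Show uncompressed size of stored pages, pool size, and the difference, in MiB.
zswap_summary() {
    dir="${1:-/sys/kernel/debug/zswap}"
    page_size="${2:-4096}"
    stored=$(cat "$dir/stored_pages") || return 1
    pool=$(cat "$dir/pool_total_size") || return 1
    awk -v s="$stored" -v p="$pool" -v ps="$page_size" 'BEGIN {
        mib = 1024 * 1024
        printf "stored %.2f MiB, pool %.2f MiB, saved %.2f MiB\n",
               s * ps / mib, p / mib, (s * ps - p) / mib
    }'
}

# Only query the live system if the debugfs counters are readable (root).
if [ -r /sys/kernel/debug/zswap/stored_pages ]; then
    zswap_summary
fi
```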

That means the machine now acts as though it's got:
32 GB * 2.84 = 90.88 GB of compressed pool + 32 GB RAM + 6 GB swap = 128.88 GB of working space. That's just over double its 64 GB of physical RAM.

And I figured out that I don't really need nohang. I just have to tweak the settings in /proc/sys/vm/ (mirrored in /etc/sysctl.conf) so that the system itself issues a kill signal to any memory-gobbling program.

I've got most everything set up the way I want it. I just have to change vm.overcommit_ratio from 99 to 95, both in /etc/sysctl.conf and in /proc/sys/vm/overcommit_ratio, and everything should be good. That'll kill a memory-gobbling program when total space reaches 3.5 GB free, so the machine should remain fairly responsive.
