Change your i/o scheduler

The i/o scheduler decides the order in which read and write requests are sent to a drive, which affects both throughput and responsiveness.

There are several i/o schedulers your machine can use:

  • bfq (Budget Fair Queuing) (Multiqueue):
    Designed to provide good interactive response, especially for slower I/O devices. This is a complex I/O scheduler and has a relatively high per-operation overhead so it is not ideal for devices with slow CPUs or high throughput I/O devices. Fair sharing is based on the number of sectors requested and heuristics rather than a time slice. Desktop users may like to experiment with this I/O scheduler as it can be advantageous when loading large applications.

  • kyber (Multiqueue):
    Designed for fast multi-queue devices and is relatively simple. Has two request queues:
    Synchronous requests (e.g. blocked reads)
    Asynchronous requests (e.g. writes)
    There are strict limits on the number of request operations sent to the queues. In theory this limits the time waiting for requests to be dispatched, and hence should provide quick completion time for requests that are high priority.

  • deadline (DEPRECATED):
    This fixes starvation issues seen in other schedulers. It uses 3 queues for I/O requests:
    Sorted
    Read FIFO - read requests stored chronologically
    Write FIFO - write requests stored chronologically
    Requests are issued from the sorted queue unless a request at the head of the read or write FIFO expires. Read requests are preferred over write requests. Read requests have a 500ms expiration time, write requests have a 5s expiration time.

  • mq-deadline (Multiqueue):
    This is an adaptation of the deadline I/O scheduler but designed for Multiqueue devices. A good all-rounder with fairly low CPU overhead.

  • cfq (Completely Fair Queueing) (DEPRECATED):
    Per-process sorted queues for synchronous I/O requests. Fewer queues for asynchronous I/O requests. Priorities from ionice are taken into account. Each queue is allocated a time slice for fair queuing. There may be wasteful idle time if a time slice quantum has not expired.

  • noop (No-operation) (DEPRECATED):
    Performs merging of I/O requests but no sorting. Good for random access devices (flash, ramdisk, etc) and for devices that sort I/O requests such as advanced storage controllers.

  • none (Multiqueue):
    The multi-queue no-op I/O scheduler. Does no reordering of requests, minimal overhead. Ideal for fast random I/O devices such as NVMe.

You can check which i/o scheduler(s) your machine is using by running:

cat /sys/block/sda/queue/scheduler
cat /sys/block/sdb/queue/scheduler
cat /sys/block/sdc/queue/scheduler
cat /sys/block/sdd/queue/scheduler

... etc.
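
Rather than typing one cat per drive, a small loop can walk every block device. This is only a sketch: the function name list_schedulers and its optional directory argument are my own additions (the argument defaults to /sys/block and exists only so the function can be pointed somewhere else).

```shell
# Print "<device>: <scheduler line>" for every block device that exposes a scheduler.
# The optional first argument is a hypothetical parameter, not part of any tool.
list_schedulers() {
    root="${1:-/sys/block}"
    for f in "$root"/*/queue/scheduler; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        dev="${f%/queue/scheduler}"      # strip the trailing path
        dev="${dev##*/}"                 # keep only the device name
        printf '%s: %s\n' "$dev" "$(cat "$f")"
    done
}

list_schedulers
```

Each output line has the same content as the cat commands above, just prefixed with the device name.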

Given that the commands above only show "none" and "mq-deadline" on my machine (the one with the brackets '[]' around it is the one being used), apparently those are the only i/o schedulers currently loaded, but you can load kyber and bfq as kernel modules:

To use kyber, load the module and check that it now shows up in the list:
sudo modprobe kyber-iosched
cat /sys/block/sda/queue/scheduler
and enable it:
echo "kyber" | sudo tee /sys/block/sda/queue/scheduler
You'll have to repeat the line above for all your drives, changing the '/sd?/' bit, then make the changes permanent (see below).

To use bfq, load the module and check that it now shows up in the list:
sudo modprobe bfq
cat /sys/block/sda/queue/scheduler
and enable it:
echo "bfq" | sudo tee /sys/block/sda/queue/scheduler
You'll have to repeat the line above for all your drives, changing the '/sd?/' bit, then make the changes permanent (see below).
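
After the modprobe, the cat output should list the new scheduler next to the existing ones. If you want to script that check before echoing into the file, something like the following might do. The function name scheduler_available is hypothetical; it just splits the line into names and looks for an exact match.

```shell
# Succeeds (exit status 0) if scheduler "$1" is listed in scheduler file "$2".
# Brackets around the active scheduler are stripped before comparing.
scheduler_available() {
    tr -d '[]' < "$2" | tr ' ' '\n' | grep -qxF "$1"
}

# Example use (commented out, since it loads a module and changes a drive):
# sudo modprobe kyber-iosched
# scheduler_available kyber /sys/block/sda/queue/scheduler \
#     && echo "kyber" | sudo tee /sys/block/sda/queue/scheduler
```

The exact-match comparison matters because a plain substring search for "deadline" would also hit "mq-deadline".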

I had a mixture of "none" and "mq-deadline"... for two identical USB memory sticks being used for identical purposes (ZFS L2ARC cache), one was using "none", the other was using "mq-deadline"! Obviously the routine which automatically assigns an i/o scheduler doesn't work very well.

Given that ZFS has its own i/o scheduler, and the other drives are mainly storage (and thus don't need bleeding-edge i/o scheduling), and the simplest i/o scheduler means less CPU usage, I'm going to switch them all to "none".

Just to check that switching the i/o scheduler works:

echo "none" | sudo tee /sys/block/sda/queue/scheduler
echo "none" | sudo tee /sys/block/sdb/queue/scheduler
echo "none" | sudo tee /sys/block/sdc/queue/scheduler
echo "none" | sudo tee /sys/block/sdd/queue/scheduler
echo "none" | sudo tee /sys/block/sde/queue/scheduler
echo "none" | sudo tee /sys/block/sdf/queue/scheduler
echo "none" | sudo tee /sys/block/sdg/queue/scheduler
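
The repeated tee lines above can also be written as a loop over whatever sd? drives exist. A minimal sketch, assuming you want the same scheduler on every drive; set_all_schedulers and its second (directory) argument are names I made up for illustration.

```shell
# Write the chosen scheduler to every sd? drive, same as the tee lines above.
# $1 = scheduler name, $2 = hypothetical base directory (default /sys/block).
set_all_schedulers() {
    sched="$1"
    root="${2:-/sys/block}"
    for f in "$root"/sd[a-z]/queue/scheduler; do
        [ -e "$f" ] || continue                    # no matching drives
        echo "$sched" | sudo tee "$f" > /dev/null  # only root may write here
    done
}

# Uncomment to switch every sd? drive to "none":
# set_all_schedulers none
```

Like the individual commands, this only lasts until the next reboot.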

That works... but it's temporary. As soon as you reboot, it goes back to what it was before. Let's change that. I'm going to enable "none" on all drives.

Create the rules file:
sudo touch /etc/udev/rules.d/90-iosched.rules

Edit the rules file:
sudoedit /etc/udev/rules.d/90-iosched.rules

In that file, add the line below (changing only the bit with the scheduler you want... it's set to "none" (ATTR{queue/scheduler}="none") here). KERNEL=="sd[a-z]" matches sda through sdz, so a single rule covers all possible drives:

ACTION=="add|change", KERNEL=="sd[a-z]", SUBSYSTEM=="block", ATTR{queue/scheduler}="none"

You can just copy-n-paste the line above as-is, even if you don't have 26 drives.
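
If you'd like to see whether the rules file works before rebooting, udev can be asked to re-read its rules and replay a "change" event for block devices. This is a sketch assuming a distro where udevadm is available; the wrapper name apply_iosched_rules is hypothetical.

```shell
# Reload udev rules, then re-trigger "change" events for block devices so the
# ATTR{queue/scheduler} assignment in the rules file is applied right away.
apply_iosched_rules() {
    sudo udevadm control --reload-rules &&
    sudo udevadm trigger --subsystem-match=block --action=change
}

# Uncomment to apply the rules without rebooting:
# apply_iosched_rules
```

Afterwards, cat the scheduler files again to see whether the brackets moved.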

Optionally, edit grub as well (this older method is deprecated in favor of the rules method above and, as far as I can tell, kernels 5.4 and later ignore elevator= entirely, but we'll include it here because it still works with older kernels):
sudoedit /etc/default/grub
Change:
GRUB_CMDLINE_LINUX=""
... to:
GRUB_CMDLINE_LINUX="elevator=none"

Then save, exit your editor and run:
sudo update-grub

And finally, reboot.

After reboot, to verify that you're using the desired i/o scheduler on each drive:

cat /sys/block/sda/queue/scheduler
cat /sys/block/sdb/queue/scheduler
cat /sys/block/sdc/queue/scheduler
cat /sys/block/sdd/queue/scheduler

... etc.

Again, the i/o scheduler being used will be in [brackets].
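
To avoid eyeballing the brackets on every drive, the check can be scripted. A rough sketch: active_scheduler pulls out the bracketed name, and verify_schedulers reports any sd? drive that isn't using the scheduler you wanted. Both function names and the optional directory argument are my own, not part of any tool.

```shell
# Print the active (bracketed) scheduler from one scheduler file.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Report every sd? drive whose active scheduler differs from "$1".
# Returns non-zero if any drive differs. $2 is a hypothetical base directory.
verify_schedulers() {
    want="$1"
    root="${2:-/sys/block}"
    status=0
    for f in "$root"/sd[a-z]/queue/scheduler; do
        [ -r "$f" ] || continue
        got="$(active_scheduler "$f")"
        if [ "$got" != "$want" ]; then
            echo "${f}: using '${got}', wanted '${want}'"
            status=1
        fi
    done
    return "$status"
}

# Example: verify_schedulers none && echo "all drives are using none"
```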
