These parameters apply only to Vitastor clients (QEMU, fio, NBD and so on) and affect their interaction with the cluster.
- client_iothread_count
- client_retry_interval
- client_eio_retry_interval
- client_retry_enospc
- client_max_dirty_bytes
- client_max_dirty_ops
- client_enable_writeback
- client_max_buffered_bytes
- client_max_buffered_ops
- client_max_writeback_iodepth
- nbd_timeout
- nbd_max_devices
- nbd_max_part
- osd_nearfull_ratio
client_iothread_count
- Type: integer
- Default: 0
Number of separate threads for handling TCP network I/O on the client library side. Enabling 4 threads usually raises the peak performance of each client from approximately 2-3 to 7-8 GByte/s of linear read/write and from approximately 100-150 to 400 thousand iops, but it also increases latency. The latency increase depends on the CPU: with CPU power saving disabled, latency only grows by ~10 us (equivalent to a Q=1 iops drop from 10500 to 9500); with CPU power saving enabled, it may be as high as 500 us (equivalent to a Q=1 iops drop from 2000 to 1000). RDMA isn’t affected by this option.
It’s recommended to enable client I/O threads if you don’t use RDMA and want to increase peak client performance.
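As a rough sketch, assuming the usual client configuration file at /etc/vitastor/vitastor.conf (plain JSON, so no comments), enabling 4 I/O threads as discussed above would look like this; the value is illustrative, not a recommendation:

```
{
  "client_iothread_count": 4
}
```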
client_retry_interval
- Type: milliseconds
- Default: 50
- Minimum: 10
- Can be changed online: yes
Retry time for I/O requests failed due to inactive PGs or network connectivity errors.
client_eio_retry_interval
- Type: milliseconds
- Default: 1000
- Can be changed online: yes
Retry time for I/O requests that failed due to data corruption or unfinished EC object deletions (has_incomplete PG state). Setting it to 0 disables such retries: clients are not blocked and just receive the EIO error code instead.
client_retry_enospc
- Type: boolean
- Default: true
- Can be changed online: yes
Retry writes that fail with out-of-space errors, waiting until some space is freed on the OSDs.
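The three retry-related parameters can be combined. For example, a hypothetical client configuration that keeps the default retry interval, disables retries on corrupted data (so clients get EIO immediately) and still waits for space on out-of-space errors could look like the sketch below. Since all three can be changed online, the same values can usually also be applied cluster-wide through the global configuration stored in etcd instead of per client:

```
{
  "client_retry_interval": 50,
  "client_eio_retry_interval": 0,
  "client_retry_enospc": true
}
```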
client_max_dirty_bytes
- Type: integer
- Default: 33554432
- Can be changed online: yes
Without immediate_commit=all this parameter sets the limit of “dirty” (not committed by fsync) data allowed by the client before forcing an additional fsync and committing the data. Also note that the client always holds a copy of uncommitted data in memory so this setting also affects RAM usage of clients.
client_max_dirty_ops
- Type: integer
- Default: 1024
- Can be changed online: yes
Same as client_max_dirty_bytes, but instead of total size, limits the number of uncommitted write operations.
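For example, a client that should hold less uncommitted data in RAM at the cost of more frequent fsyncs could lower both limits; the halved values below are purely illustrative:

```
{
  "client_max_dirty_bytes": 16777216,
  "client_max_dirty_ops": 512
}
```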
client_enable_writeback
- Type: boolean
- Default: false
- Can be changed online: yes
This parameter enables client-side write buffering. Write requests are accumulated in memory for a short time before being sent to the Vitastor cluster, which allows them to be sent in parallel and increases the performance of some applications. Writes are buffered until the client forces a flush with fsync() or until the amount of buffered writes exceeds the limit.
Write buffering significantly increases the performance of some applications, for example, CrystalDiskMark under Windows (LOL :-D), but also any other application that writes in one of two non-optimal ways: either a lot of small (4 KB or so) sequential writes, or a lot of small random writes issued without any parallelism or asynchrony and without calling fsync().
With write buffering enabled, you can expect around 22000 T1Q1 random write iops in QEMU more or less regardless of the quality of your SSDs, and this number is in fact bound by QEMU itself rather than Vitastor (check it yourself by adding a “driver=null-co” disk in QEMU). Without write buffering, the current record is 9900 iops, but the number is usually even lower with non-ideal hardware, for example, it may be 5000 iops.
Even when this parameter is enabled, write buffering isn’t activated until the client explicitly allows it, because enabling it without the client being aware that its writes may be buffered could lead to data loss. Because of this, older versions of clients don’t support write buffering at all, newer versions of the QEMU driver allow write buffering only if it’s enabled in the disk settings with -blockdev cache.direct=false, and newer versions of FIO only allow write buffering if you don’t specify -direct=1. NBD and NFS drivers allow write buffering by default.
You can overcome this restriction with the client_writeback_allowed parameter, but you shouldn’t do that unless you really know what you are doing.
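Putting it together, a minimal sketch of turning write buffering on in the client configuration is shown below. Keep in mind that, as described above, the client itself (for example QEMU started with cache.direct=false, or fio run without -direct=1) must also allow buffering for it to actually take effect:

```
{
  "client_enable_writeback": true
}
```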
client_max_buffered_bytes
- Type: integer
- Default: 33554432
- Can be changed online: yes
Maximum total size of buffered writes which triggers write-back when reached.
client_max_buffered_ops
- Type: integer
- Default: 1024
- Can be changed online: yes
Maximum number of buffered writes which triggers write-back when reached. Multiple consecutive modified data regions are counted as 1 write here.
client_max_writeback_iodepth
- Type: integer
- Default: 256
- Can be changed online: yes
Maximum number of parallel writes when flushing buffered data to the server.
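These three limits only matter when write buffering is enabled. A hypothetical configuration that doubles the buffer size and operation count for a write-heavy client while keeping the default flush parallelism might look like this (illustrative values only):

```
{
  "client_enable_writeback": true,
  "client_max_buffered_bytes": 67108864,
  "client_max_buffered_ops": 2048,
  "client_max_writeback_iodepth": 256
}
```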
nbd_timeout
- Type: seconds
- Default: 300
Timeout for NBD I/O operations. If an operation takes longer than this timeout, even if the cluster is just temporarily down for longer than the timeout, the NBD device will detach by itself (and possibly break the mounted file system).
You can set the timeout to 0 to never detach, but in that case you won’t be able to remove the kernel device at all if the NBD process dies - you’ll have to reboot the host.
nbd_max_devices
- Type: integer
- Default: 64
Maximum number of NBD devices in the system. This value is passed as the nbds_max parameter to the nbd kernel module when vitastor-nbd autoloads it.
nbd_max_part
- Type: integer
- Default: 3
Maximum number of partitions per NBD device. This value is passed as the max_part parameter to the nbd kernel module when vitastor-nbd autoloads it.
Note that (nbds_max)*(1+max_part) usually can’t exceed 256.
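For example, to allow more NBD devices at the cost of fewer partitions per device, the NBD parameters could be set as in the sketch below; here 128*(1+1) = 256, which still fits the usual limit, and the timeout is raised to 10 minutes as an arbitrary illustration:

```
{
  "nbd_timeout": 600,
  "nbd_max_devices": 128,
  "nbd_max_part": 1
}
```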
osd_nearfull_ratio
- Type: number
- Default: 0.95
- Can be changed online: yes
Ratio of used space on OSD to treat it as “almost full” in vitastor-cli status output.
Remember that some client writes may hang or complete with an error if even just one OSD becomes 100% full!
However, unlike in Ceph, 100% full Vitastor OSDs don’t crash (in Ceph they’re unable to start at all), so you’ll be able to recover from “out of space” errors without destroying and recreating OSDs.
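As a final sketch, lowering the ratio makes vitastor-cli status warn earlier; 0.85 below is an arbitrary example, and since the parameter is changeable online it can usually also be set cluster-wide through the global configuration in etcd:

```
{
  "osd_nearfull_ratio": 0.85
}
```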