Vitastor 3.0.12 released

2026-05-17

Important fixes (excluding the new store)

Fixed a possible use-after-free in the OSD during error handling of initial commit/rollback of objects in EC pools.
Fixed a possible free of an invalid pointer in the OSD during read errors from snapshot/clone chains in EC pools.
Fixed possibly incorrect handling of commit/rollback operations in EC pools during pool PG count changes.
Fixed the inverted fsync enable parameter in the ublk driver (fsync was not enabled on pools without immediate_commit).
Added the raw-ls command for debugging purposes to find object versions in the cluster using listing operations.

New store fixes

Improved startup speed by using LSN-based sorting only for objects with a large number of intermediate versions.
Added the skip_double_claim option as a temporary workaround for the rare OSD startup error with the “double claimed block” message, reported by several users. This option does not affect data integrity.
Fixed incorrect rechecking of small writes during startup, which in theory could lead to duplicate small write object entries on the OSD.
Fixed fsync operation for disks with a writeback cache (without capacitors):
- Fixed incorrect semantics of consecutive fsyncs (next fsync was not blocked by the previous one).
- Added fsync when copying small writes from the buffer to the data device (this was missed during initial development).
- Added fsync after the initial garbage collection during OSD startup.
- Fixed an incorrect cast of the LSN from uint64 to uint32, which broke fsync once the LSN reached 2^32.
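To illustrate why a uint64-to-uint32 cast only starts to matter at LSN 2^32, here is a minimal Python sketch that emulates the truncation with a bit mask. The function names and the "is this LSN already synced" comparison are hypothetical and are not taken from the Vitastor source; they only show the general failure mode of a truncated sequence number.

```python
U32_MASK = 0xFFFFFFFF  # low 32 bits, i.e. what survives a cast to uint32


def truncate_to_u32(lsn: int) -> int:
    """Emulate a C-style cast from uint64 to uint32 (keeps the low 32 bits)."""
    return lsn & U32_MASK


def is_synced(synced_lsn: int, target_lsn: int) -> bool:
    """Hypothetical check: has fsync already covered everything up to target_lsn?"""
    return synced_lsn >= target_lsn


# Below 2^32 the truncated value is identical to the original, so nothing breaks.
small_lsn = 2**32 - 1
small_lsn_unchanged = truncate_to_u32(small_lsn) == small_lsn

# At 2^32 the truncated value wraps around to 0.
wrapped = truncate_to_u32(2**32)

# A write with LSN 2^32 + 5 has not been synced yet (only LSN 100 has),
# but the truncated comparison sees target 5 and wrongly reports "synced".
synced_lsn = 100
target_lsn = 2**32 + 5
correct_answer = is_synced(synced_lsn, target_lsn)
truncated_answer = is_synced(synced_lsn, truncate_to_u32(target_lsn))

print(small_lsn_unchanged, wrapped, correct_answer, truncated_answer)
```

Under these assumptions the bug stays invisible for the first 2^32 log records, which is consistent with it surfacing only on long-running OSDs.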
Added missing verification of the metadata header checksum during startup.
Fixed incorrect updating of object checksums in perfect_csum_update=true mode.
Fixed a possible OSD crash with “assertion failed” when processing a malformed EC STABILIZE operation.
Fixed the accounting of active compactor coroutines.
Removed broken and untested new->old store conversion support.

Minor issues fixed

Incorrect accounting of OSD local operation statistics in replicated pools.
Missing non-zero exit codes on vitastor-disk resize command errors.
Missing reset of the list of inconsistent objects during PG restarts.
Theoretically possible hangs of various OSD operations when working with completely corrupted objects (without a single available copy), and possibly in some other very rare situations.
Incorrect fsyncs when deleting objects from pools without immediate_commit (on disks with a writeback cache), which previously could leave garbage when deleting misplaced objects.
Possible crash/memory corruption of the NFS server during a targeted attack on NFS-RDMA.
Possibly incorrect handling of ENOSPC/EIO write errors in replicated pools, which made it impossible to retry the write later.
Possible crash instead of a clean error exit when starting an OSD with the old storage engine on a disk with corrupted journal data.
Shallow copying of PG configuration in the monitor (not known to have caused any actual bugs).
Incorrect checking of allocated blocks in the QEMU driver in an unused code branch (without the BDRV_WANT_ZERO flag).
Possible memory leak on read errors of corrupted objects.
Possible incorrect PG states when corrupted objects are detected.
Possible failure to mark all “bad” copies of an object during scrubs without checksums and with a large number of replicas (> 4).
Incorrect checksum calculation in the old storage engine when bitmap_granularity < 4096 (a practically unused configuration).
Theoretically possible OSD crash in rare cases during a scrub and simultaneous object recovery.
Theoretically possible OSD crash when handling PING operation errors.
Slightly suboptimal logic for reusing the RDMA send buffer.
Possible memory leak when canceling an already running scrub via no_scrub.
Possible memory corruption when a client (e.g., QEMU code) passes invalid buffers and the writeback cache is enabled.
Potentially incorrect search for corrupted parts of EC objects (inability to find a “good” combination) during a scrub with checksums disabled.
Possible additional memory usage on the OSD side when handling failed reads from snapshots (not a leak, however: the memory was freed upon client disconnection).
Potential sudden write slowdown at certain PG epoch values due to incorrect epoch update logic in etcd.
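As an illustration of the scrub item above about marking all “bad” copies when checksums are disabled, the following Python sketch shows one generic way to compare replicas by content and flag every copy that disagrees with the majority. This is an assumption for illustration only, not Vitastor's actual scrub algorithm. The function name, the OSD numbers and the strict-majority rule are all hypothetical.

```python
from collections import defaultdict


def find_bad_copies(copies: dict[int, bytes]) -> set[int]:
    """Return the OSD numbers whose copy differs from the majority content.

    `copies` maps a (hypothetical) OSD number to the object data read from it.
    Raises ValueError when no content is held by more than half of the copies.
    """
    # Group OSDs by the exact content they returned.
    groups: dict[bytes, list[int]] = defaultdict(list)
    for osd_num, data in copies.items():
        groups[data].append(osd_num)

    # The largest group is the candidate "good" version.
    best = max(groups.values(), key=len)
    if len(best) * 2 <= len(copies):
        raise ValueError("no strict majority among copies")

    # Every copy outside the best group is bad, even if the bad copies
    # also differ from each other (possible with many replicas).
    return {osd for members in groups.values() if members is not best
            for osd in members}


# Five replicas: three agree, two are corrupted in different ways.
replicas = {1: b"AAAA", 2: b"AAAA", 3: b"AAAA", 4: b"AAAB", 5: b"ABAA"}
bad = find_bad_copies(replicas)
print(sorted(bad))
```

With more than four replicas, the corrupted copies can fall into several distinct groups, so all non-majority groups have to be marked, not just the first mismatch found.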
