Fast and simple distributed software-defined storage

Vitastor

  • Distributed SDS
  • ...but Fast 🚀 — latency 0.1 ms
  • ...and Simple ✌️ — 60k lines of code, not 1 million
  • From Russia with love 🙂

Software-Defined Storage (SDS)

Software that turns ordinary servers with ordinary drives into a single scalable, fault-tolerant storage cluster with advanced features

Why do IaaS/PaaS providers use SDS?

  • Scalability
  • Client data preservation
  • Reduced costs due to hyperconvergence
  • No need for server "micromanagement"
  • Different storage classes (SSD, HDD)

But most SDS come with heavy overhead

Overhead

  • Q=1 (queue depth 1) — the best possible latency
  • 4 KB write to a local SSD — 0.04 ms
  • 4 KB write to Ceph — from ~1 ms
  • In-house cloud SDS solutions — roughly the same
  • 2400 % overhead! (see the arithmetic below)
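
Where the 2400 % figure comes from: taking the numbers above, the latency added by the SDS layer relative to the raw drive is

    (1 ms − 0.04 ms) / 0.04 ms = 24, i.e. about 2400 % overhead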

Vitastor

  • ~ 0.1 ms replicated latency
  • ~ 0.2 ms erasure-coded latency
  • 3–8 GB/s per client (VM)
  • ~400,000 IOPS per client
  • Just ~1 CPU core per NVMe disk
  • Low (50%) rebalance impact

Vitastor — protocols

  • Block access (VM disks, containers):
    Kubernetes, Proxmox, OpenNebula, OpenStack and others
  • VitastorFS (NFS) — clustered POSIX FS
  • Object storage (S3) — based on Zenko CloudServer

Features

  • Replication and erasure codes (N+K); see the example pool definitions after this list
  • Support for all drive types: SSD/NVMe, HDD, hybrid SSD+HDD
  • Flexible data placement
  • RDMA / RoCEv2 support
  • Fast snapshots and clones
  • Fast checksums, data scrubbing
  • Monitoring
  • Kubernetes operator
  • And more...
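
To illustrate replication, N+K erasure coding and placement settings, below is a sketch of the pool definitions Vitastor keeps in etcd under the /vitastor/config/pools key (with the default etcd_prefix). The names and numbers are made up for the example; see the pool configuration reference for the full set of parameters:

    {
      "1": {
        "name": "ssd-replicated",
        "scheme": "replicated",
        "pg_size": 3, "pg_minsize": 2, "pg_count": 256,
        "failure_domain": "host"
      },
      "2": {
        "name": "hdd-ec",
        "scheme": "ec",
        "pg_size": 5, "parity_chunks": 2,
        "pg_minsize": 4, "pg_count": 512,
        "failure_domain": "host"
      }
    }

In this sketch the second pool is a 3+2 erasure-coded pool: pg_size is the total number of chunks (N+K) and parity_chunks is K.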

Architecture

  • Symmetric distributed design, no single point of failure (SPOF)
  • Block storage as the base layer
  • Uniform load balancing
  • Transactional writes → data loss protection
  • Optimised for modern SSD/HDD

Ease of support

  • Low number of components
  • Human-readable metadata in etcd (see the example after this list)
  • Minimal external dependencies
  • Compact implementation:
    ~60k lines of code (Ceph ~1M)
  • Non-standard architecture support (E2K)
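
To make the “human-readable metadata in etcd” point concrete: configuration and cluster state are stored as plain JSON under a single etcd prefix (/vitastor by default, controlled by the etcd_prefix option), so a live cluster can be inspected with standard tools. A minimal sketch, assuming default settings and a reachable etcd endpoint:

    # Dump the global configuration and pool definitions (plain JSON values)
    etcdctl get --prefix /vitastor/config/

    # Runtime state lives under the same prefix; the exact key layout may vary between versions
    etcdctl get --prefix /vitastor/osd/state/ --keys-only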

Licensing

  • Own copyleft license: VNPL
  • Free to use in an open-source environment
  • Closed-source services require commercial support
  • Technical and architectural support from the author

Block Storage

With support for all major KVM-based virtualization platforms and container orchestrators: OpenNebula, OpenStack, Proxmox VE, Kubernetes

Clustered File System

A ground-up implementation of a scalable POSIX Read-Write-Many file system, mountable over NFS 3.0
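
A minimal sketch of mounting it from a client with the stock Linux NFS v3 client, assuming a vitastor-nfs proxy is already running and exporting the file system root on a host named storage-gw (the host name and mount point are placeholders, and extra options such as a non-default port may be needed depending on the proxy configuration):

    # Mount VitastorFS via plain NFS 3.0
    mount -t nfs -o vers=3,nolock storage-gw:/ /mnt/vitastor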

Object Storage (S3)

Based on Zenko CloudServer

Latest Posts

Vitastor 3.0.3 released

  • Fix csum_block_size > 0 being unusable with atomic writes in the new store (almost all atomic write requests generated invalid checksums with csum_block_size > 0)
  • Fix the monitor sometimes randomly failing to optimize PGs with the “problem is infeasible or unbounded” message because it did not wait to read the full stdout of lp_solve
  • Remove the reshard_abort optimisation to fix chunked resharding (introduced in 3.0.2) possibly corrupting in-memory OSD state when handling multiple rapid PG count change requests
  • Fix removed inodes not disappearing from OSD statistics in the new store, leading to bloating of the statistics with old inodes
  • Increase test coverage for the new store and fix several minor bugs:
    • Fix enabling/disabling used_for_app recalculating inode space statistics incorrectly
    • Fix object metadata validation on OSD startup rejecting some correct sequences of entries, possibly leading to OSD being unable to start
    • Fix PG activation with EC failing in rare cases with EBUSY when requesting to commit an already committed write
    • Fix a theoretically possible metadata writeback issue on ENOSPC during commit
  • Fix the monitor failing to optimize PGs in the presence of a host whose name is convertible to a JS Number (like 04e278988710) :D
  • Fix vitastor-cli dd sometimes (rarely) truncating the image when writing to stdout
  • Fix a theoretically possible client connection object leak when io_uring is full
  • Fix a leak of RDMA-CM connection objects
  • Fix crashes with data_block_size < 32KB (useless setup, but anyway) (#113)

2026-02-08 Continue reading →

Vitastor 3.0.2 released

  • Antietcd is now officially safe to use: the release includes a fixed version of antietcd which passes Jepsen transaction serializability tests.
  • Fix a huge checksum bug in the old store: incorrect checksums for small initial writes. The bug affected writes of exactly csum_block_size (4k by default) into new (unallocated) objects and generated invalid checksums in the store for the written 4k block. Moreover, generating these invalid checksums was very slow because it calculated CRC32 over 4 GB of zeroes. If the block wasn’t then overwritten as part of a larger write request, it became unreadable even though the stored data was correct. The bug affected all versions since 1.0.0 (or rather since 2.3.0, because vitastor-disk didn’t allow enabling checksums prior to 2.3.0 due to another bug). O:-)
  • Prevent OSD disconnections due to long blocking of the event loop caused by PG resharding (moving objects between old and new PGs in memory) when changing pool PG count or simply on restart of an OSD with a large database (for example, a filled 8 TB SSD). The OSD now performs resharding in chunks with pauses between them.
  • Prevent pools getting stuck in a paused state after an aborted PG count change.
  • Fix a possible OSD crash with “division by zero” when trying to handle an operation before the pool PG count is applied to the in-memory store.

2026-01-25 Continue reading →

Small, but with transaction isolation: writing Jepsen tests for Antietcd

Since version 1.7.0, Vitastor has a built-in etcd replacement — Antietcd.

It’s implemented in Node.js and is very simple — it has just a couple thousand lines of code. It doesn’t implement all features of etcd, but it’s absolutely sufficient for a fully functional Vitastor cluster — all the essential features are present, and in some ways it’s even better than etcd — for example, Antietcd makes it possible to avoid storing “temporary” data on disk.

However, until recently, there was no answer to the question: can it really be used in production? Does it work correctly?

Below is the story of the search for an answer. A story with a happy ending :)

2026-01-22 Continue reading →

Vitastor 3.0.1 released

Important fixes

  • Disable RWF_ATOMIC by default because Linux incorrectly requires all atomic writes to be power-of-2-sized and length-aligned. Details: use_atomic_flag
  • Fix cross-pool snapshots not working at all (old data was always read after taking the snapshot)
  • Fix level_placement (broken in 2.2.0)
  • Fix CAS write return values in the client library (broken in 2.4.4, also breaking unaligned writes in VitastorFS)
  • Fix VitastorFS possibly losing some intersecting parallel unaligned writes
  • Prevent possible reads of old data during unfinished intent writes in the new store
  • Tests have been added for all of the above problems to prevent future regressions

2025-12-22 Continue reading →

Vitastor 3.0.0 released

A single new feature: the new log-structured metadata store implementation, described in the presentation from Moscow Highload’2025 (check it out here).

It’s now the default store for new OSDs. Support for the old store is also left in place; you can still choose it for new OSDs with vitastor-disk prepare --meta_format 2.
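
For example, preparing a new OSD with the old store could look roughly like this (a sketch only; /dev/nvme0n1 is a placeholder for an empty drive, with all other options left at their defaults):

    # Create a new OSD on an empty drive, explicitly selecting the old (format 2) metadata store
    vitastor-disk prepare --meta_format 2 /dev/nvme0n1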

OSDs from previous versions with the old store format will also continue to operate just like before.

Some documentation: atomic_write_size, meta_format.

2025-12-06 Continue reading →

All Posts