First release of Vitastor S3

2025-03-16

The moment has come: the Vitastor S3 implementation based on Zenko CloudServer is finally released.

Key differences from the prototype:

  • Volume defragmentation is implemented;
  • Volume metadata may now be stored in the same MongoDB as object metadata, not just in VitastorKV;
  • Tests for the Vitastor S3 backend added;
  • S3 is now packaged in a convenient Docker build.

Highlights

  • Zenko CloudServer is implemented in node.js.
  • Object metadata is stored in MongoDB.
  • Vitastor uses a slightly modified version of Zenko CloudServer, with an optimised build and unneeded dependencies stripped out.
  • Object data is stored in Vitastor block volumes, but the volume metadata is stored in the same MongoDB, not in Vitastor etcd.
  • Objects are written to volumes sequentially one after another. The space is allocated with rounding to the sector size (4 KB), so each object takes at least 4 KB.
  • An important property of this storage scheme is that small objects aren’t split into chunks in Vitastor EC N+K pools, so downloading one doesn’t require reads from all N disks.
  • Deleted objects are only marked as deleted; the space is actually freed later by an asynchronous “defragmentation” process. Defragmentation runs automatically in the background when a volume accumulates the configured share of garbage (20% by default): it copies the live objects to new volume(s) and then removes the old volume. Defragmentation can be tuned in locationConfig.json.
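As a small illustration of the allocation rule above (a sketch, not the actual Vitastor code; the names are illustrative), the space an object consumes is its size rounded up to a whole number of 4 KB sectors:

```javascript
// Sketch of the allocation rule: objects are appended to a volume
// sequentially, and each allocation is rounded up to the 4 KB sector size,
// so even a 1-byte object consumes a full sector.
const SECTOR = 4096;

function allocatedSize(objectBytes) {
  // Round up to a whole number of sectors; empty objects still take one.
  return Math.max(1, Math.ceil(objectBytes / SECTOR)) * SECTOR;
}

console.log(allocatedSize(1));     // 4096
console.log(allocatedSize(4096));  // 4096
console.log(allocatedSize(10000)); // 12288
```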

Installation

Follow the documentation: https://vitastor.io/en/docs/installation/s3.html

Plans for future development

  • User account storage in the DB instead of a static file. Original Zenko uses a separate closed-source “Scality Vault” service for this, which is why a static file is used for now.
  • More detailed documentation.
  • Support for other (faster) key-value DBMSes for object metadata storage.
  • Other performance optimisations, for example around the hash function: MD5, used for Amazon compatibility, is relatively slow.
  • Object Lifecycle support. There is a Lifecycle implementation for Zenko called Backbeat, but it isn’t adapted for Vitastor yet.
  • Quota support. Original Zenko uses a separate “SCUBA” service for quotas, but it’s also proprietary and not available publicly.

Initial benchmarks

The tests below were conducted on a very small test cluster: 4 hosts with 1x Samsung PM9A3 each, one Zenko instance with 8 node.js worker processes, and a MongoDB replica set with 3 replicas installed on system SSDs.

hsbench from localhost was used for the benchmark.

16 threads, 4 KB objects:

./hsbench -a accessKey1 -s verySecretKey1 -u http://localhost:8000 -z 4k -t 16
... Dur(s): 60.0, Mode: PUT, Ops: 40721, MB/s: 2.65, IO/s: 678, Lat(ms): [ min: 8.9, avg: 23.6, 99%: 46.1, max: 627.1 ], Slowdowns: 0
... Dur(s): 60.3, Mode: LIST, Ops: 3939, MB/s: 0.00, IO/s: 65, Lat(ms): [ min: 92.6, avg: 244.2, 99%: 608.9, max: 919.8 ], Slowdowns: 0
... Dur(s): 60.0, Mode: GET, Ops: 163326, MB/s: 10.63, IO/s: 2722, Lat(ms): [ min: 2.4, avg: 5.8, 99%: 16.7, max: 31.4 ], Slowdowns: 0
... Dur(s): 37.6, Mode: DEL, Ops: 40721, MB/s: 4.23, IO/s: 1084, Lat(ms): [ min: 7.3, avg: 14.8, 99%: 26.9, max: 57.5 ], Slowdowns: 0

16 threads, 4 MB objects:

... Dur(s): 60.1, Mode: PUT, Ops: 14879, MB/s: 990.77, IO/s: 248, Lat(ms): [ min: 22.2, avg: 64.5, 99%: 139.2, max: 641.0 ], Slowdowns: 0
... Dur(s): 60.4, Mode: LIST, Ops: 3943, MB/s: 0.00, IO/s: 65, Lat(ms): [ min: 104.2, avg: 244.1, 99%: 564.5, max: 966.1 ], Slowdowns: 0
... Dur(s): 60.2, Mode: GET, Ops: 31415, MB/s: 2087.69, IO/s: 522, Lat(ms): [ min: 5.9, avg: 28.4, 99%: 230.9, max: 682.7 ], Slowdowns: 0
... Dur(s): 14.0, Mode: DEL, Ops: 14879, MB/s: 4264.17, IO/s: 1066, Lat(ms): [ min: 7.5, avg: 15.0, 99%: 27.0, max: 49.1 ], Slowdowns: 0

1 node.js worker process, 4 hsbench threads, 4 MB objects:

... Dur(s): 60.0, Mode: PUT, Ops: 3699, MB/s: 246.45, IO/s: 62, Lat(ms): [ min: 35.3, avg: 64.9, 99%: 85.0, max: 192.0 ], Slowdowns: 0
... Dur(s): 60.3, Mode: LIST, Ops: 856, MB/s: 0.00, IO/s: 14, Lat(ms): [ min: 126.0, avg: 281.1, 99%: 430.6, max: 484.0 ], Slowdowns: 0
... Dur(s): 60.0, Mode: GET, Ops: 6399, MB/s: 426.43, IO/s: 107, Lat(ms): [ min: 5.8, avg: 35.7, 99%: 259.2, max: 289.3 ], Slowdowns: 0
... Dur(s): 10.9, Mode: DEL, Ops: 3699, MB/s: 1355.97, IO/s: 339, Lat(ms): [ min: 8.1, avg: 11.8, 99%: 18.2, max: 25.6 ], Slowdowns: 0

Conclusion:

  • node.js performance is totally OK. Linear write performance is ~250 MB/s per process, which is on par with Minio (written in Go) running with GOMAXPROCS=1 (1 thread) on the same host, and is mostly bounded by MD5 and SHA1 hash calculation.
  • Small-object performance seems to be bounded by MongoDB: it isn’t fast enough, so another key-value DB backend is needed. And that’s what we’ll explore in the next releases! :)

Author & License